Transformer models for channel state estimation and prediction in mmWave/THz bands under high user mobility
DOI: 10.31673/2412-9070.2026.318118
Abstract
In the millimeter-wave (mmWave, 30–100 GHz) and terahertz (THz, 100 GHz–10 THz) frequency bands, which are considered a promising foundation for 5G-Advanced and 6G networks, estimating and predicting channel state information (CSI) becomes significantly harder under high mobility, particularly for users moving faster than 100 km/h. The main contributing factors are severe Doppler spread, rapid signal blockage, near-field effects in THz communications, short channel coherence time (less than 1 ms), and the need for a large number of pilot symbols (pilot overhead reaching 30–50%). Traditional channel estimators such as Least Squares (LS) and Minimum Mean Square Error (MMSE), and even modern deep learning methods based on CNN and LSTM architectures, model long-range time-frequency dependencies poorly and adapt poorly to diverse mobility scenarios.
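The sub-millisecond coherence time quoted above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the classical maximum Doppler shift f_D = v·f_c/c and the common rule of thumb T_c ≈ 0.423/f_D, neither of which is stated in the abstract, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def max_doppler_hz(speed_kmh: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_D = v * f_c / c (assumed model)."""
    speed_ms = speed_kmh / 3.6  # km/h -> m/s
    return speed_ms * carrier_hz / C


def coherence_time_s(speed_kmh: float, carrier_hz: float) -> float:
    """Rule-of-thumb coherence time T_c ~ 0.423 / f_D (an assumption)."""
    return 0.423 / max_doppler_hz(speed_kmh, carrier_hz)


if __name__ == "__main__":
    # 100 km/h is the mobility threshold mentioned in the abstract.
    for carrier_hz in (30e9, 100e9, 300e9):
        f_d = max_doppler_hz(100.0, carrier_hz)
        t_c = coherence_time_s(100.0, carrier_hz)
        print(f"{carrier_hz / 1e9:.0f} GHz: "
              f"f_D = {f_d / 1e3:.1f} kHz, T_c = {t_c * 1e3:.3f} ms")
```

Under these assumptions, a user at 100 km/h already sees a coherence time of roughly 0.15 ms at 30 GHz, and it shrinks in proportion as the carrier frequency or the speed increases.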
This work proposes a hybrid CNN-Transformer model designed for joint channel estimation and short-term channel prediction in mmWave/THz massive MIMO systems. The proposed architecture combines a CNN-based frontend for local feature extraction from the pilot grid with a multi-head self-attention Transformer encoder operating in the time-frequency or delay-Doppler domain. In addition, the model incorporates physics-informed regularization (sparsity loss) to account for the geometric constraints of the wireless propagation environment.
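The abstract does not give the exact form of the sparsity loss, so the following is only a minimal sketch of one plausible reading: a normalized squared-error data term plus an L1 penalty on the estimated delay-Doppler taps, which favors channels with few dominant paths. The function names, the L1 form, and the weight `lam` are assumptions made for illustration, not the authors' formulation.

```python
from typing import List

# Delay-Doppler grid: rows are delay bins, columns are Doppler bins.
Grid = List[List[complex]]


def nmse(h_est: Grid, h_true: Grid) -> float:
    """Normalized mean squared error between estimate and ground truth."""
    err = sum(abs(e - t) ** 2
              for row_e, row_t in zip(h_est, h_true)
              for e, t in zip(row_e, row_t))
    ref = sum(abs(t) ** 2 for row_t in h_true for t in row_t)
    return err / ref


def l1_sparsity(h_est: Grid) -> float:
    """Mean magnitude of the delay-Doppler taps (assumed sparsity term)."""
    n_taps = sum(len(row) for row in h_est)
    return sum(abs(e) for row in h_est for e in row) / n_taps


def physics_informed_loss(h_est: Grid, h_true: Grid, lam: float = 0.01) -> float:
    """Data-fit term plus weighted sparsity penalty; lam is hypothetical."""
    return nmse(h_est, h_true) + lam * l1_sparsity(h_est)


if __name__ == "__main__":
    # Toy single-path channel: one non-zero delay-Doppler tap.
    h_true = [[1 + 0j, 0j], [0j, 0j]]
    h_sparse = [[0.9 + 0j, 0j], [0j, 0j]]                   # energy on the true path
    h_dense = [[0.9 + 0j, 0.1 + 0j], [0.1 + 0j, 0.1 + 0j]]  # energy leaked to other bins
    print(physics_informed_loss(h_sparse, h_true))
    print(physics_informed_loss(h_dense, h_true))
```

In the toy example the dense estimate is penalized twice, once through the larger reconstruction error and once through the L1 term, which is the behavior a sparsity regularizer is meant to encourage.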
The experimental evaluation used the realistic DeepMIMO, QuaDRiGa, and Raymobtime datasets, including simulated vehicular and UAV trajectories at speeds of up to 500 km/h. The results show that the proposed approach outperforms existing state-of-the-art methods: the model improved NMSE by 4–8 dB, reduced pilot overhead by 30–48%, raised spectral efficiency to 11.4 bit/s/Hz, and predicted channel parameters stably for 5–10 future slots, even in near-field THz scenarios. Furthermore, the optimized architecture makes deployment on edge devices practically feasible.
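To put the reported NMSE gain in linear terms, the short sketch below converts a decibel improvement into the factor by which the normalized error power shrinks. It assumes the usual convention NMSE_dB = 10·log10(NMSE); the helper names are hypothetical.

```python
import math


def to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def residual_error_fraction(gain_db: float) -> float:
    """Fraction of the baseline error power left after a gain of gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # 4 dB and 8 dB are the ends of the improvement range in the abstract.
    for gain_db in (4.0, 8.0):
        frac = residual_error_fraction(gain_db)
        print(f"{gain_db:.0f} dB gain -> error power is {frac:.3f} of the baseline")
```

Under this convention, a 4 dB gain leaves about 40% of the baseline error power and an 8 dB gain leaves about 16%.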
Keywords: channel estimation, channel prediction, Transformer, mmWave, THz, massive MIMO, 6G, deep learning, self-attention, pilot overhead.