I. Introduction
By deploying a large number of antennas at the base station (BS), massive multiple-input multiple-output (MIMO) can deliver significant spectral efficiency gains when accurate channel state information (CSI) is available at the BS [1]. To avoid the overwhelming CSI feedback overhead of frequency-division duplexing (FDD) systems, time-division duplexing (TDD) has been widely studied. In a typical TDD system, the downlink (DL) CSI can be inferred from the uplink (UL) CSI estimated at the BS, thanks to channel reciprocity [2]. Because the channel coherence time and bandwidth are limited, the pilot sequences used for UL channel estimation in TDD systems are usually reused and hence non-orthogonal, which results in pilot contamination and leads to significant performance degradation in massive MIMO [1].
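The effect of pilot reuse described above can be illustrated with a minimal numerical sketch. It is not the method of this paper. It assumes two single-antenna users, i.i.d. Rayleigh channels, noise-free reception, and a least-squares estimator at the BS; the function names `ls_channel_estimate` and `simulate_pilot_contamination` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def ls_channel_estimate(received, pilot):
    """Least-squares channel estimate from a received pilot block.

    received: (num_antennas, pilot_len) array; pilot: (pilot_len,) array.
    """
    return received @ pilot.conj() / np.vdot(pilot, pilot).real


def simulate_pilot_contamination(num_antennas=64, pilot_len=8, seed=0):
    """Return (error with orthogonal pilots, error with a reused pilot, interferer norm)."""
    rng = np.random.default_rng(seed)

    def rayleigh(n):
        # i.i.d. CN(0, 1) channel coefficients (an assumed channel model)
        return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

    h_desired = rayleigh(num_antennas)
    h_interf = rayleigh(num_antennas)

    # Unit-modulus pilot of the desired user
    pilot = np.exp(2j * np.pi * rng.random(pilot_len))
    # A second pilot orthogonal to the first (phase ramp over one full period)
    ortho_pilot = pilot * np.exp(2j * np.pi * np.arange(pilot_len) / pilot_len)

    # Noise-free received pilot blocks at the BS for the two cases
    rx_orthogonal = np.outer(h_desired, pilot) + np.outer(h_interf, ortho_pilot)
    rx_reused = np.outer(h_desired, pilot) + np.outer(h_interf, pilot)

    err_orthogonal = np.linalg.norm(ls_channel_estimate(rx_orthogonal, pilot) - h_desired)
    err_reused = np.linalg.norm(ls_channel_estimate(rx_reused, pilot) - h_desired)
    return err_orthogonal, err_reused, np.linalg.norm(h_interf)
```

Under these assumptions, the estimate obtained with a reused pilot is the sum of the desired and interfering channels, so its error equals the norm of the interfering channel, whereas the orthogonal-pilot case recovers the desired channel up to floating-point error.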