
Toward Extra Large-Scale MIMO: New Channel Properties and Low-Cost Designs

Abstract:

Extra large-scale multiple-input–multiple-output (MIMO) has been recognized as one of the potential development directions of massive MIMO. By employing even more antennas than massive MIMO in the fifth-generation era, extra large-scale MIMO can further exploit the spatial domain resources and enable ultra-high data rates, low-latency communications, and emerging applications, such as sensing and localization, in sixth-generation mobile communication systems. However, as the size of the antenna array increases and the distance between a user and the array decreases, new channel properties that did not manifest in conventional massive MIMO start to kick in. Most importantly, existing research strategies pertaining to massive MIMO cannot be directly applied or simply extended to fit the extra large-scale MIMO case. Moreover, increasing the number of antennas will inevitably raise the total cost, which includes not only the high hardware cost but also the burden of vast processing and computations as well as the substantial training overhead. In this article, we survey the state of the art on the new channel properties of, and low-cost designs for, extra large-scale MIMO systems. In particular, we pursue a mathematical analysis to explain why the new features appear and illustrate how they affect the system model. Furthermore, we summarize and compare the low-cost designs from various perspectives and give our suggestions from a practical deployment point of view.

Published in: IEEE Internet of Things Journal ( Volume: 10, Issue: 16, 15 August 2023)

Page(s): 14569 - 14594

Date of Publication: 05 May 2023

DOI: 10.1109/JIOT.2023.3273328

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

Massive multiple-input–multiple-output (MIMO), also known as large-scale MIMO, has been a successful enabler to boost the data transmission rate in mobile communication systems in the fifth-generation (5G) era [1], and will keep serving as an important physical layer technology in future mobile communication systems. By employing tens or hundreds of antennas at the base station (BS), massive MIMO produces high spatial resolution and supports multiuser transmission on the same time–frequency resources. As the number of BS antennas grows unconventionally large, the multiuser interference and the uncorrelated noise diminish [2], providing favorable conditions for multiuser transmission. In order to further harness the gain obtained by using more antennas, the concept of extra large-scale MIMO has been proposed for the sixth-generation mobile communications [3], [4], [5]. Extra large-scale MIMO employs hundreds or even thousands of antennas at the BS to simultaneously provide service to a certain set of users, and is an augmented version of massive MIMO.

In practical implementations, an extra large number of antennas can be deployed in two ways: the centralized type and the distributed type. The centralized type is a direct extension of 5G large-aperture arrays, where all the antennas are uniformly deployed in a co-located fashion with the half-wavelength spacing between adjacent antennas maintained, forming an extra-large aperture array [6]. Alternatively, the antennas can be confined within a predetermined area, resulting in the concept of holographic MIMO [7]. If the practical environment does not allow the deployment of such a large array, then we can distribute the antennas across multiple sites, corresponding to the distributed type [8]. Each site is equipped with a small number of antennas, and the sites jointly serve the same set of users. A typical example is cell-free massive MIMO [9]. In this article, we focus on the centralized type with an extra large-aperture array.

Compared with distributed systems, synchronization among antennas is much easier in centralized extra large-scale MIMO systems. Moreover, extra large-aperture arrays can cover the external walls of buildings in populated city centres or be employed at stadiums/airports to provide wireless communication services to a plethora of users. In a centralized extra large-scale MIMO system, high beamforming gains can be harvested: narrow beams with very low sidelobes can be generated by the extra large-aperture array and flexibly steered toward the desired directions. Several orthogonal beams can be generated simultaneously, yielding an increase of the spatial-division multiplexing gain. Apart from satisfying the traditional requirements of high data rates, employing an extra large-aperture array enables new emerging applications. For instance, in indoor environments, such as in a factory, the autonomous driving of an electric car can be achieved by leveraging the high spatial resolution provided by such an array. There have been studies on extra large-aperture array-enabled new applications, including sensing and localization [10], [11], physical-layer security [12], [13], wireless energy transfer [14], [15], etc., as illustrated in Fig. 1.

Fig. 1. Extra large-aperture arrays provide opportunities for high-rate data transmission, wireless energy transfer, physical-layer security, sensing, and localization.

Historically, the study of an emerging wireless architecture begins with the investigation of the propagation channel. Channel modeling of an extra large-aperture array system does not simply mean expanding the array size in a traditional MIMO channel model. With the increase of the array aperture, new channel properties kick in. First, the lower bound of the far field, known as the Rayleigh distance, grows with the square of the array aperture. Considering that extra large-aperture arrays will generally be deployed in crowded urban or indoor factory environments, users will be close to the array. Different from traditional MIMO systems, where users are in the far field and signals experience plane wave propagation, in an extra large-aperture array system, there is a high probability that users experience spherical wave propagation. Second, for users who are very close to the array, the path loss between a user and the array elements fluctuates significantly across the array. If obstacles exist in the channel, then the channel power will be concentrated in a proportion of the array elements, known as the visibility region (VR). The spherical wave propagation and the existence of the VR reflect the spatial nonstationarity of the channel. On the one hand, the new channel properties require new channel models for extra large-aperture array systems. On the other hand, these properties facilitate the above-mentioned new applications. Therefore, a deep and comprehensive study of the new channel properties is indispensable.

When translating a theoretical architecture into a commercial technology, the implementation and deployment costs are of pivotal importance. The employment of an extra large-aperture array entails the challenges of high hardware cost, high processing and computational complexity, and high training overhead. Regarding the hardware cost, a fully digital structure, where each active antenna is connected with a unique radio frequency (RF) chain, is unacceptably expensive when the number of active antennas grows large. Inspired by the low-cost designs in 5G millimeter wave systems, active antenna arrays with fewer RF chains can be adopted. Moreover, with the development of materials, extra large-aperture arrays can take the form of reconfigurable intelligent surfaces (RISs), which have the advantages of low cost and low power consumption. Therefore, the problem of high hardware cost can be tackled via different approaches.

In traditional MIMO systems with a limited number of antennas, signal processing and computations are centralized at a common module, and the complexity is moderate. However, in an extra large-aperture array system, completely centralized processing and computations result in high complexity and are time consuming. In order to reduce the complexity and the processing latency, two approaches can be followed. One is to directly reduce the complexity of an algorithm in the centralized module. The other is to distribute the processing and computations to multiple local modules, thereby easing the burden in the centralized module. The distributed approach is more attractive, but the information exchange between the centralized module and the local modules affects the overall complexity and needs to be carefully assessed.

In a mobile communication system, an efficient transceiver design heavily depends on the precise knowledge of the wireless channel. The training overhead required to acquire the channel state information (CSI) usually increases with the number of antennas. Hence, when an extra large-aperture array is deployed, the training overhead becomes substantial and can be prohibitive for practical systems. Fortunately, the extra-large–dimensional channel shows directionality and sparsity in multiple domains. Traditional sparse channel estimation methods, such as compressed sensing, can be applied to reduce the training overhead. The directionality of a spherical wave channel further supports localization and sensing. Further, the existence of the VR enables overhead reduction among multiple users. The feasibility of low-overhead communication and sensing, together with low-cost architectures and low-complexity processing and computations, guarantees the practical implementation of an extra large-aperture array.

This article makes a comprehensive survey on the new channel properties and the low-cost designs of extra large-scale MIMO systems. Section II investigates the spherical wave propagation by analyzing the channel responses on a point, an antenna, and an array step by step, and provides guidance to the selection of channel models in different fields/regions. With the analytical results on spherical waves, Section III explains why the VR appears and investigates the existing categories of VR and their definitions and models. The spatial nonstationarity is verified theoretically and further taken into account in the subsequent low-cost designs in Sections IV–VI. The low-cost architectures with active antenna arrays and RISs are illustrated in Section IV. A comparison of the hardware cost, implementation and synchronization difficulties, and scalability of different architectures is provided. Then, the low-complexity processing and computation designs are introduced in Section V. Existing methods to reduce the complexity in centralized and distributed processing structures are summarized. Finally, the low-overhead communication and sensing based on the directionality and channel sparsity in the transformation domains are studied in Section VI.

Notations: Scalars are denoted by letters in normal font, while vectors and matrices are denoted by boldface lowercase and boldface uppercase letters, respectively. The transpose, conjugate-transpose, and pseudo-inverse are indicated by the superscripts $(\cdot)^{T}$, $(\cdot)^{H}$, and $(\cdot)^{\dagger }$, respectively; $| \cdot |$ represents the absolute value of a scalar or the size of a set; $\| \cdot \|$ represents the norm of a vector or a matrix; and $\mathbb {E}\{\cdot \}$ denotes expectation. For a matrix, $[\cdot]_{i,:}$, $[\cdot]_{:,j}$, and $[\cdot]_{i,j}$ return its $i$th row, the $j$th column, and the $(i,j)$th entry, respectively. The Hadamard and Kronecker products are denoted by $\odot$ and $\otimes$, respectively.

SECTION II.

Spherical Wave

In a traditional MIMO system, the aperture of the BS antenna array is usually negligible when compared with the distance between it and a user served by the BS. The entire array can be regarded as one point. Thus, a signal sent from the user experiences an equal path loss and has a common angle-of-arrival (AoA) when arriving at different antennas of the BS array. Experiencing equal path loss and having a common AoA are two key features of a plane wave, which is typically modeled in the far-field region. However, when the aperture of the BS array grows large, the array cannot be regarded as one point any more. Then, spherical waves kick in and the plane wave model becomes irrelevant. In this section, we will make a comprehensive study on spherical waves.

A. Channel Response on Point

We start from the modeling of the channel response. In a three-dimensional (3-D) free space, an isotropic point source $\bf s$ is deployed at the origin of the coordinate system, i.e., ${\textbf {s}}=[{0, 0, 0}]^{T}$, and radiates electromagnetic (EM) waves in all directions, as illustrated by the blue sphere in Fig. 2. For simplicity, the transmit power of the point source is normalized to 1. An antenna that covers a surface $\mathcal {A}$ with area $A$ is located in the radiative field of $\bf s$, i.e., $\|{\textbf {p}}-{\textbf {s}}\| \gg \lambda$ holds for any point ${\textbf {p}}=[x_{p}, y_{p}, z_{p}]^{T} \in \mathcal {A}$, where $\lambda = ({c}/{f})$ is the wavelength of the EM wave with frequency $f$, while $c$ is the speed of light. In order of increasing complexity, we illustrate three channel response models that have been reported in the literature as follows.

Fig. 2. (a) Isotropic point source radiates EM waves in all directions. (b) Receiver covers a continuous surface $\mathcal {A}$ on the sphere.

1) Channel Response Model 1:

The distance between the receiving point ${\textbf {p}}$ and the source point $\bf s$ is $\|{\textbf {p}}-{\textbf {s}}\|$. At this distance, the power of the EM wave spreads uniformly on the sphere with radius $\|{\textbf {p}}-{\textbf {s}}\|$. Since the area of this sphere is $4\pi \|{\textbf {p}}-{\textbf {s}}\|^{2}$, the power on each point of this sphere equals [16] $\begin{equation*} \gamma \left ({{\textbf {p}},{\textbf {s}}}\right) = \frac {1}{4\pi \|{\textbf {p}}-{\textbf {s}}\|^{2}}.\tag{1}\end{equation*}$ Then, the channel response on point ${\textbf {p}}$ can be expressed as follows: $\begin{align*} h_{\textrm {CR1}}\left ({{\textbf {p}},{\textbf {s}}}\right)=&\sqrt {\gamma \left ({{\textbf {p}},{\textbf {s}}}\right)} e^{-j \frac {2\pi }{\lambda }\|{\textbf {p}}-{\textbf {s}}\|} \\=&\frac {1}{\sqrt {4\pi } \|{\textbf {p}}-{\textbf {s}}\|} e^{-j \frac {2\pi }{\lambda }\|{\textbf {p}}-{\textbf {s}}\|}\tag{2}\end{align*}$ which is referred to as channel response model 1.

Model 1 describes an ideal case where the power on point $\bf p$ is perfectly and completely harvested. It requires that the normal direction of ${\textbf {p}}$ with respect to surface $\mathcal A$ , denoted as ${\textbf {v}}_{\mathcal A}({\textbf {p}})\in \mathbb {R}^{3\times 1}$ and satisfying $\|{\textbf {v}}_{\mathcal A}({\textbf {p}})\|=1$ , exactly matches the radiation direction of the EM wave from source $\bf s$ [17]. As an example shown in Fig. 2(b), the surface $\mathcal A$ perfectly covers the sphere with radius $\|{\textbf {p}}-{\textbf {s}}\|$ . Then, for any point ${\textbf {p}}\in {\mathcal A}$ , the normal line of ${\textbf {p}}$ goes across the source $\bf s$ , and the channel response model 1 is applicable.
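As a quick numerical illustration, (1) and (2) can be evaluated directly. The following Python sketch uses only the standard library; the function name `h_cr1`, the 3 GHz carrier, and the 5 m distance are illustrative assumptions rather than values taken from the literature.

```python
import cmath
import math


def h_cr1(p, s, wavelength):
    """Channel response model 1, eq. (2), for a unit-power isotropic source."""
    r = math.dist(p, s)                               # ||p - s||
    amplitude = 1.0 / (math.sqrt(4.0 * math.pi) * r)  # square root of gamma(p, s) in (1)
    phase = -2.0 * math.pi * r / wavelength           # propagation phase
    return amplitude * cmath.exp(1j * phase)


# Assumed example values: 3 GHz carrier, receiving point 5 m away on the z-axis.
wavelength = 3e8 / 3e9
source = (0.0, 0.0, 0.0)
point = (0.0, 0.0, 5.0)

h = h_cr1(point, source, wavelength)
gamma = 1.0 / (4.0 * math.pi * math.dist(point, source) ** 2)
print(abs(h) ** 2, gamma)  # the two values should agree
```

The squared magnitude of the response reproduces the power in (1), while the phase depends only on the distance measured in wavelengths.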

2) Channel Response Model 2:

In practice, patch antennas are widely utilized in mobile communication systems. In this case, the surface $\mathcal A$ of a patch antenna is a square. For a given point ${\textbf {p}}\in {\mathcal A}$, the normal direction does not always match the EM wave radiation direction. Then, the effective received power is a proportion of $\gamma ({\textbf {p}},{\textbf {s}})$, and the proportionality factor is [18], [19], [20] $\begin{equation*} F\left ({{\textbf {p}},{\textbf {s}}}\right) = \left |{\cos \langle {\textbf {p}}-{\textbf {s}},{\textbf {v}}_{\mathcal A}\left ({{\textbf {p}}}\right)\rangle }\right | = \frac {|\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}{\textbf {v}}_{\mathcal A}\left ({{\textbf {p}}}\right)|}{\|{\textbf {p}}-{\textbf {s}}\|}\tag{3}\end{equation*}$ satisfying $0\le F({\textbf {p}},{\textbf {s}}) \le 1$. The expression of $F({\textbf {p}},{\textbf {s}})$ in (3) is a typical form of the antenna pattern.

The channel response on point ${\textbf {p}}$ can be derived as follows: $\begin{align*} h_{\textrm {CR2}}\left ({{\textbf {p}},{\textbf {s}}}\right)=&\sqrt {F\left ({{\textbf {p}},{\textbf {s}}}\right)} h_{\textrm {CR1}}\left ({{\textbf {p}},{\textbf {s}}}\right) \\=&\sqrt {\frac {|\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}{\textbf {v}}_{\mathcal A}\left ({{\textbf {p}}}\right)|}{4\pi \|{\textbf {p}}-{\textbf {s}}\|^{3}}} e^{-j \frac {2\pi }{\lambda }\|{\textbf {p}}-{\textbf {s}}\|}\tag{4}\end{align*}$ which is referred to as channel response model 2. We see that when $F({\textbf {p}},{\textbf {s}})=1$ holds, model 2 is equivalent to model 1.
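To make the role of the projection factor concrete, the sketch below evaluates (3) and (4) for a patch whose normal is ${\textbf {u}}_{z}$. The helper names `projection_factor` and `h_cr2` and the two test points are assumptions chosen for illustration; the coordinates are taken to be real, so the conjugate transpose in (3) reduces to an ordinary inner product.

```python
import cmath
import math


def projection_factor(p, s, normal):
    """F(p, s) in (3): |(p - s)^T v| / ||p - s|| for a real unit normal v."""
    diff = [a - b for a, b in zip(p, s)]
    r = math.sqrt(sum(d * d for d in diff))
    return abs(sum(d * v for d, v in zip(diff, normal))) / r


def h_cr2(p, s, normal, wavelength):
    """Channel response model 2, eq. (4)."""
    r = math.dist(p, s)
    f = projection_factor(p, s, normal)
    amplitude = math.sqrt(f / (4.0 * math.pi * r ** 2))  # sqrt(F) times the model 1 amplitude
    return amplitude * cmath.exp(-2j * math.pi * r / wavelength)


# Assumed example: source at the origin, patch normal along z, wavelength 0.1 m.
source = (0.0, 0.0, 0.0)
normal = (0.0, 0.0, 1.0)
wavelength = 0.1

f_boresight = projection_factor((0.0, 0.0, 5.0), source, normal)  # wave hits the patch head-on
f_oblique = projection_factor((3.0, 0.0, 4.0), source, normal)    # same distance, oblique arrival
h_oblique = h_cr2((3.0, 0.0, 4.0), source, normal, wavelength)
print(f_boresight, f_oblique, abs(h_oblique) ** 2)
```

Both points are 5 m from the source, so any difference in received power between them comes solely from the factor $F({\textbf {p}},{\textbf {s}})$.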

3) Channel Response Model 3:

Papers [17], [21] considered the current density of the radiative EM waves from the source $\bf s$, which is written as follows: $\begin{equation*} {\textbf {J}}\left ({{\textbf {s}}}\right) = J_{x}\left ({{\textbf {s}}}\right) {\textbf {u}}_{x} + J_{y}\left ({{\textbf {s}}}\right) {\textbf {u}}_{y} + J_{z}\left ({{\textbf {s}}}\right) {\textbf {u}}_{z}\tag{5}\end{equation*}$ where ${\textbf {u}}_{x}=[{1, 0, 0}]^{T}$, ${\textbf {u}}_{y}=[{0, 1, 0}]^{T}$, and ${\textbf {u}}_{z}=[{0, 0, 1}]^{T}$ are the unit vectors along the $x$, $y$, and $z$ directions, respectively, while $J_{x}({\textbf {s}})$, $J_{y}({\textbf {s}})$, and $J_{z}({\textbf {s}})$ represent the current density in the $x$, $y$, and $z$ polarizations, respectively, satisfying the following normalization: $\begin{equation*} \|{\textbf {J}}\left ({{\textbf {s}}}\right)\|^{2} = |J_{x}\left ({{\textbf {s}}}\right)|^{2} + |J_{y}\left ({{\textbf {s}}}\right)|^{2} + |J_{z}\left ({{\textbf {s}}}\right)|^{2} = 1.\tag{6}\end{equation*}$ Then, the effective received power at point $\bf p$ suffers further from the following proportionality factor: $\begin{equation*} \eta \left ({{\textbf {p}},{\textbf {s}}}\right) = \left \|{\left ({{\textbf {I}}-\frac {\left ({{\textbf {p}}-{\textbf {s}}}\right)\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}}{\|{\textbf {p}}-{\textbf {s}}\|^{2}} }\right){\textbf {J}}\left ({{\textbf {s}}}\right) }\right \|^{2}.\tag{7}\end{equation*}$ As an example, [17] assumed that only the $y$ direction is excited in ${\textbf {J}}({\textbf {s}})$, which means $J_{y}({\textbf {s}})=1$ and $J_{x}({\textbf {s}})=J_{z}({\textbf {s}})=0$. Under this condition $\begin{equation*} \eta \left ({{\textbf {p}},{\textbf {s}}}\right) = 1 - \frac {\left [{{\textbf {p}}-{\textbf {s}}}\right]_{2}^{2}}{\|{\textbf {p}}-{\textbf {s}}\|^{2}}.\tag{8}\end{equation*}$ We see that $\eta ({\textbf {p}},{\textbf {s}}) = 1$ happens when $[{\textbf {p}}-{\textbf {s}}]_{2}^{2}=0$, i.e., $y_{p}=0$.

With $\eta ({\textbf {p}},{\textbf {s}})$, the channel response on point ${\textbf {p}}$ is written as follows: $\begin{align*} h_{\textrm {CR3}}\left ({{\textbf {p}},{\textbf {s}}}\right)=&\sqrt {\eta \left ({{\textbf {p}},{\textbf {s}}}\right)} h_{\textrm {CR2}}\left ({{\textbf {p}},{\textbf {s}}}\right) \\=&\left \|{\left ({{\textbf {I}}-\frac {\left ({{\textbf {p}}-{\textbf {s}}}\right)\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}}{\|{\textbf {p}}-{\textbf {s}}\|^{2}} }\right){\textbf {J}}\left ({{\textbf {s}}}\right) }\right \| \\&\times \sqrt {\frac {|\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}{\textbf {v}}_{\mathcal A}\left ({{\textbf {p}}}\right)|}{4\pi \|{\textbf {p}}-{\textbf {s}}\|^{3}}} e^{-j \frac {2\pi }{\lambda }\|{\textbf {p}}-{\textbf {s}}\|}\tag{9}\end{align*}$ which is referred to as channel response model 3.
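The polarization factor in (7) is a projection of ${\textbf {J}}({\textbf {s}})$ onto the plane orthogonal to the propagation direction, which the following sketch evaluates and compares against the closed form (8) for a $y$-polarized source. The function name `polarization_factor` and the test point are illustrative assumptions, and real-valued coordinates are assumed.

```python
import math


def polarization_factor(p, s, current):
    """eta(p, s) in (7) for real coordinates and a unit-norm current density J(s)."""
    diff = [a - b for a, b in zip(p, s)]
    r2 = sum(d * d for d in diff)
    # Component of J along the propagation direction, removed by the projector in (7).
    along = sum(d * j for d, j in zip(diff, current)) / r2
    residual = [j - along * d for j, d in zip(current, diff)]
    return sum(abs(x) ** 2 for x in residual)


# Assumed example: y-polarized source at the origin, receiving point (3, 4, 12).
source = (0.0, 0.0, 0.0)
point = (3.0, 4.0, 12.0)
current_y = (0.0, 1.0, 0.0)

eta = polarization_factor(point, source, current_y)
eta_closed_form = 1.0 - point[1] ** 2 / sum(c * c for c in point)  # eq. (8)
print(eta, eta_closed_form)
```

Multiplying the model 2 response by $\sqrt {\eta ({\textbf {p}},{\textbf {s}})}$ then yields the model 3 response in (9).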

B. Channel of Antenna

By integrating the response across the entire surface $\mathcal {A}$, the channel between the source and the receiver antenna that covers the surface $\mathcal {A}$ is calculated by [17] $\begin{equation*} h_{\mathcal {A}} = \frac {1}{\sqrt {A}}\int _{{\textbf {p}}\in {\mathcal {A}}} h\left ({{\textbf {p}},{\textbf {s}}}\right) d{\textbf {p}}.\tag{10}\end{equation*}$ Here, we provide the following three examples from the literature to illustrate the channel response in different cases.

1) Case 1:

In this case, the receiver antenna is isotropic and located at ${\textbf {p}}=[0, 0, z_{p}]^{T}$. The effective area of an isotropic antenna is [16] $\begin{equation*} A_{\textrm {iso}} = \frac {\lambda ^{2}}{4\pi }.\tag{11}\end{equation*}$ Under channel response model 1, the channel between the source and the isotropic receiver antenna is derived as follows: $\begin{equation*} h_{\mathcal {A},{\textrm {case 1}}} = \sqrt {A_{\textrm {iso}}} h_{\textrm {CR1}}\left ({{\textbf {p}},{\textbf {s}}}\right) = \frac {\lambda }{4\pi |z_{p}|} e^{-j \frac {2\pi }{\lambda }|z_{p}|}.\tag{12}\end{equation*}$ Then, the free space path loss seen by an isotropic receiver antenna at distance $|z_{p}|$ can be expressed as follows: $\begin{equation*} {\textrm {PL}}_{fs} = |h_{\mathcal {A},{\textrm {case 1}}}|^{2} = \frac {\lambda ^{2}}{16\pi ^{2} z_{p}^{2}}\tag{13}\end{equation*}$ which is in accordance with the model in [22].
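For a sense of scale, (13) can be evaluated for a concrete link. The sketch below assumes a 3 GHz carrier and a 100 m distance purely for illustration; the function name `free_space_path_gain` is likewise hypothetical.

```python
import math


def free_space_path_gain(distance, wavelength):
    """|h_{A, case 1}|^2 in (13): (lambda / (4 * pi * d))^2."""
    return (wavelength / (4.0 * math.pi * distance)) ** 2


wavelength = 3e8 / 3e9   # assumed 3 GHz carrier
distance = 100.0         # assumed link distance in metres

gain = free_space_path_gain(distance, wavelength)
loss_db = -10.0 * math.log10(gain)
print(f"path loss: {loss_db:.1f} dB")  # roughly 82 dB for these values
```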

2) Case 2:

Case 2 illustrates a patch antenna whose surface $\mathcal {A}$ is a square plane. For any point ${\textbf {p}}\in {\mathcal A}$, the normal direction ${\textbf {v}}_{\mathcal A}({\textbf {p}})$ is orthogonal to the surface $\mathcal A$. The area $A_{\textrm {pat}}$ satisfies $\begin{equation*} A_{\textrm {pat}} \le \frac {\lambda ^{2}}{4}\tag{14}\end{equation*}$ because the length and the width of the patch antenna are less than or equal to the antenna spacing $({\lambda }/{2})$. Let $\mathcal {A}$ be parallel with the $xy$ plane. Then, we have ${\textbf {v}}_{\mathcal A}({\textbf {p}})={\textbf {u}}_{z}=[{0,0,1}]^{T}$.

In [20], the channel on the patch antenna under channel response model 2 was studied. By applying (10), the channel can be approximated by $\begin{equation*} h_{\mathcal {A},{\textrm {case 2}}} \approx \sqrt {eA_{\textrm {pat}}} h_{\textrm {CR2}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right)\tag{15}\end{equation*}$ where $eA_{\textrm {pat}}$ is the effective area of the antenna [20], $0< e\le 1$ is the proportionality factor, while ${\textbf {p}}_{c}=[x_{c}, y_{c}, z_{p}]^{T}$ is the center point of $\mathcal A$. Notably, for an isotropic antenna in case 1, its area $A_{\textrm {iso}}$ in (11) is its effective area. In [19], the proportionality factor $e$ was not considered; that is to say, $A_{\textrm {pat}}$ was regarded as the effective area of the antenna. By applying (4), we obtain $\begin{equation*} h_{\textrm {CR2}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right) = \frac {|z_{p}|^{\frac {1}{2}}}{\sqrt {4\pi } \left ({x_{c}^{2} + y_{c}^{2} + z_{p}^{2}}\right)^{\frac {3}{4}}} e^{-j \frac {2\pi }{\lambda }\sqrt {x_{c}^{2} + y_{c}^{2} + z_{p}^{2}}}.\tag{16}\end{equation*}$ Then, the channel between the source and a patch antenna is $\begin{equation*} h_{\mathcal {A},{\textrm {case 2}}} \approx \frac {\sqrt {eA_{\textrm {pat}}}|z_{p}|^{\frac {1}{2}}}{\sqrt {4\pi } \left ({x_{c}^{2} + y_{c}^{2} + z_{p}^{2}}\right)^{\frac {3}{4}}} e^{-j \frac {2\pi }{\lambda }\sqrt {x_{c}^{2} + y_{c}^{2} + z_{p}^{2}}}.\tag{17}\end{equation*}$ If $x_{c}=y_{c}=0$, then $h_{\textrm {CR2}}({\textbf {p}}_{c},{\textbf {s}})$ reduces to $\begin{equation*} h_{\textrm {CR2}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right) = \frac {1}{\sqrt {4\pi } |z_{p}|} e^{-j \frac {2\pi }{\lambda } |z_{p}|}\tag{18}\end{equation*}$ which is equivalent to $h_{\textrm {CR1}}({\textbf {p}},{\textbf {s}})$ in (2). Notably, $F({\textbf {p}},{\textbf {s}}) \le 1$ on the patch, and the equality holds only for ${\textbf {p}}={\textbf {p}}_{c}$. Furthermore, when $eA_{\textrm {pat}}=A_{\textrm {iso}}=({\lambda ^{2}}/{4 \pi })$, we have $\begin{equation*} |h_{\mathcal {A},{\textrm {case 2}}}|^{2} < e A_{\textrm {pat}} |h_{\textrm {CR2}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right)|^{2} = \frac {\lambda ^{2}}{16\pi ^{2} z_{p}^{2}}\tag{19}\end{equation*}$ where the right-hand side is exactly $|h_{\mathcal {A},{\textrm {case 1}}}|^{2}$ in (13).
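The quality of the center-point approximation in (15) can be checked numerically by evaluating the integral in (10) with a simple midpoint rule. The sketch below is only an illustration under stated assumptions: the patch is centered on the $z$-axis, $e=1$ is used when comparing against (10), and the function names, the 0.1 m wavelength, the 5 m distance, and the grid size are all hypothetical choices.

```python
import cmath
import math


def h_cr2_on_patch(x, y, z, wavelength):
    """Model 2 response (4) at (x, y, z) for a patch parallel to the xy plane."""
    r2 = x * x + y * y + z * z
    amplitude = math.sqrt(abs(z)) / (math.sqrt(4.0 * math.pi) * r2 ** 0.75)
    return amplitude * cmath.exp(-2j * math.pi * math.sqrt(r2) / wavelength)


def patch_channel_numeric(area, center, wavelength, n=20):
    """Midpoint-rule evaluation of (10) over a square patch of the given area."""
    xc, yc, z = center
    side = math.sqrt(area)
    step = side / n
    total = 0j
    for i in range(n):
        x = xc - side / 2.0 + (i + 0.5) * step
        for k in range(n):
            y = yc - side / 2.0 + (k + 0.5) * step
            total += h_cr2_on_patch(x, y, z, wavelength)
    return total * step * step / math.sqrt(area)


def patch_channel_center(area, center, wavelength, e=1.0):
    """Center-point approximation (15) with effective-area factor e."""
    return math.sqrt(e * area) * h_cr2_on_patch(*center, wavelength)


wavelength = 0.1                 # assumed wavelength in metres
area = wavelength ** 2 / 4.0     # largest patch allowed by (14)
center = (0.0, 0.0, 5.0)         # assumed patch center on the z-axis

h_numeric = patch_channel_numeric(area, center, wavelength)
h_center = patch_channel_center(area, center, wavelength)
print(abs(h_numeric) / abs(h_center))  # close to 1 at this distance

# With e * A_pat = lambda^2 / (4 * pi), the gain matches the right-hand side of (19).
gain_19 = abs(patch_channel_center(area, center, wavelength, e=1.0 / math.pi)) ** 2
print(gain_19, wavelength ** 2 / (16.0 * math.pi ** 2 * center[2] ** 2))
```

At 50 wavelengths from the source, the response is almost constant across the patch, so the integral and the center-point value nearly coincide in magnitude.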

3) Case 3:

Case 3 studies a more complicated modeling of the channel on a patch antenna under channel response model 3, which was considered in [17] and [21]. The patch antenna with area $A_{\textrm {pat}}$ in case 2 is also considered; however, the proportionality factor $e$ related to the effective area is not introduced in case 3. The center point is ${\textbf {p}}_{c} = [0, 0, z_{p}]^{T}$. Then, for any point ${\textbf {p}}\in \mathcal {A}$, its $x$ and $y$ coordinates satisfy $\begin{equation*} -\frac {\sqrt {A_{\textrm {pat}}}}{2}\le x_{p}, y_{p} \le \frac {\sqrt {A_{\textrm {pat}}}}{2}.\tag{20}\end{equation*}$ The current density ${\textbf {J}}({\textbf {s}})=[{0,1,0}]^{T}$. Thus, (8) holds, i.e., $\begin{equation*} \eta \left ({{\textbf {p}},{\textbf {s}}}\right) = \frac {x_{p}^{2}+z_{p}^{2}}{x_{p}^{2}+y_{p}^{2}+z_{p}^{2}}.\tag{21}\end{equation*}$ By applying (16) and (21), we have $\begin{align*} h_{\textrm {CR3}}\left ({{\textbf {p}},{\textbf {s}}}\right)=&\sqrt {\eta \left ({{\textbf {p}},{\textbf {s}}}\right)}h_{\textrm {CR2}}\left ({{\textbf {p}},{\textbf {s}}}\right) \\=&\frac {|z_{p}|^{\frac {1}{2}}\left ({x_{p}^{2}+z_{p}^{2}}\right)^{\frac {1}{2}}} {\sqrt {4\pi } \left ({x_{p}^{2}+y_{p}^{2}+z_{p}^{2}}\right)^{\frac {5}{4}}} e^{-j \frac {2\pi }{\lambda } \left ({x_{p}^{2}+y_{p}^{2}+z_{p}^{2}}\right)^{\frac {1}{2}}}.\tag{22}\end{align*}$

Given (20) and (22), according to (10), the channel between source $\bf s$ and the patch antenna is more difficult to derive in case 3 than in case 2. Therefore, [17] provided an upper bound of the channel gain as follows: $\begin{equation*} |h_{\mathcal {A},{\textrm {case 3}}}|^{2} = \frac {1}{A_{\textrm {pat}}}\left |{ \int _{\mathcal A} h_{\textrm {CR3}}\left ({{\textbf {p}},{\textbf {s}}}\right) \, d{\textbf {p}} }\right |^{2} \le \frac {1}{\pi } \left ({\frac {1}{3}\alpha + \frac {2}{3}\beta }\right)\tag{23}\end{equation*}$ where $\begin{equation*} \alpha =\frac {\frac {A_{\textrm {pat}}}{4}|z_{p}|} {\left ({\frac {A_{\textrm {pat}}}{4}+z_{p}^{2}}\right) \left ({\frac {A_{\textrm {pat}}}{2}+z_{p}^{2}}\right)^{\frac {1}{2}}}\tag{24}\end{equation*}$ and $\begin{equation*} \beta =\arctan \left ({\frac {\frac {A_{\textrm {pat}}}{4}} {|z_{p}| \left ({\frac {A_{\textrm {pat}}}{2}+z_{p}^{2}}\right)^{\frac {1}{2}}} }\right).\tag{25}\end{equation*}$ If $A_{\textrm {pat}}=({\lambda ^{2}}/{4})$, then $\begin{align*} \alpha=&\frac {\frac {\lambda ^{2}}{16}|z_{p}|} {\left ({\frac {\lambda ^{2}}{16}+z_{p}^{2}}\right) \left ({\frac {\lambda ^{2}}{8}+z_{p}^{2}}\right)^{\frac {1}{2}}} < \frac {\lambda ^{2}}{16 z_{p}^{2}} \\ \beta=&\arctan \left ({\frac {\frac {\lambda ^{2}}{16}} {|z_{p}| \left ({\frac {\lambda ^{2}}{8}+z_{p}^{2}}\right)^{\frac {1}{2}}} }\right) < \frac {\lambda ^{2}}{16 z_{p}^{2}}.\tag{26}\end{align*}$ Under this condition $\begin{equation*} |h_{\mathcal {A},{\textrm {case 3}}}|^{2} \le \frac {\lambda ^{2}}{16\pi z_{p}^{2}}\tag{27}\end{equation*}$ and this upper bound is $\pi$ times $|h_{\mathcal {A},{\textrm {case 2}}}|^{2}$ in (19) because the proportionality factor $e$ is not considered in case 3. Notably, recalling (21), we have $\eta ({\textbf {p}},{\textbf {s}})\le 1$, and $|h_{\textrm {CR3}}({\textbf {p}},{\textbf {s}})|\le |h_{\textrm {CR2}}({\textbf {p}},{\textbf {s}})|$ holds for arbitrary ${\textbf {p}}\in {\mathcal A}$. Therefore, the upper bound in (27) is not tight.
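The bound built from (24) and (25) can be compared against a direct numerical evaluation of the channel gain. The sketch below is a rough check under stated assumptions: the gain is computed as $(1/A_{\textrm {pat}})|\int h_{\textrm {CR3}}\, d{\textbf {p}}|^{2}$, following the normalization in (10), with a midpoint rule; the function names, the 0.1 m wavelength, the 0.5 m distance, and the grid size are illustrative choices.

```python
import cmath
import math


def gain_upper_bound(area, z):
    """Upper bound (1/pi) * (alpha/3 + 2*beta/3) with alpha from (24) and beta from (25)."""
    root = math.sqrt(area / 2.0 + z * z)
    alpha = (area / 4.0) * abs(z) / ((area / 4.0 + z * z) * root)
    beta = math.atan((area / 4.0) / (abs(z) * root))
    return (alpha / 3.0 + 2.0 * beta / 3.0) / math.pi


def gain_numeric(area, z, wavelength, n=40):
    """(1/A) |integral of h_CR3|^2 via a midpoint rule, using (22) with J = u_y."""
    side = math.sqrt(area)
    step = side / n
    total = 0j
    for i in range(n):
        x = -side / 2.0 + (i + 0.5) * step
        for k in range(n):
            y = -side / 2.0 + (k + 0.5) * step
            r2 = x * x + y * y + z * z
            amplitude = math.sqrt(abs(z) * (x * x + z * z)) / (math.sqrt(4.0 * math.pi) * r2 ** 1.25)
            total += amplitude * cmath.exp(-2j * math.pi * math.sqrt(r2) / wavelength)
    return abs(total * step * step) ** 2 / area


wavelength = 0.1                # assumed wavelength in metres
area = wavelength ** 2 / 4.0    # A_pat = lambda^2 / 4, as in (26)
z = 0.5                         # assumed distance: five wavelengths

bound = gain_upper_bound(area, z)
numeric = gain_numeric(area, z, wavelength)
simplified = wavelength ** 2 / (16.0 * math.pi * z ** 2)  # right-hand side of (27)
print(numeric, bound, simplified)
```

For these values the numerically integrated gain stays below the bound from (24) and (25), which in turn stays below the simplified expression in (27).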

C. Field Partition of Antenna

According to (2), (4), and (9), the channel response varies across the points on the surface spanned by an antenna. The extent of this variation depends on the location of the antenna with respect to the source point $\bf s$. If the antenna is close to $\bf s$, then the channel response varies significantly across the surface. If the antenna is far from $\bf s$, then $\mathcal A$ can be viewed as a point from the perspective of $\bf s$ and the variation is negligible. Based on the magnitude of the amplitude and phase variations, the entire field of the source $\bf s$ can be divided into three fields/regions [21], [23].

Near field, in which both the amplitude and the phase variations of the channel response are nonnegligible across the surface.
Fresnel region, in which the amplitude variation of the channel response is negligible but the phase variation of the channel response is nonnegligible across the surface.
Fraunhofer region, also known as far field, in which both the amplitude and the phase variations of the channel response are negligible across the surface.

Some research works have considered a two-region partition by focusing only on the phase variation. In [24] and [25], the two regions are the Fresnel and the Fraunhofer regions, where the phase of the channel response is dependent on and independent of the distance between transmitter and receiver, respectively. Another two-region partition can be found in [26], [27], [28], and [29], where the two regions are named the near and far fields, respectively. In the near field, the wavefront is spherical, whilst in the far field, the wavefront is approximately planar.

1) Rayleigh/Fraunhofer Distance:

The Rayleigh or Fraunhofer distance is the boundary between the Fresnel and the Fraunhofer regions or that between the near and the far field [21], [26], [27]. It is defined by the maximum phase variance of the channel response. The maximum phase variance cannot exceed $({\pi }/{8})$ [23] in the Fraunhofer region or far field; otherwise, the receiver is in the Fresnel region or near field of the source. From (2), (4), and (9), we see that at point ${\textbf {p}}$, and regardless of the channel response model, the phase of the channel response equals $\begin{equation*} \angle h_{\textrm {CR}}\left ({{\textbf {p}},{\textbf {s}}}\right)= -\frac {2\pi }{\lambda }\|{\textbf {p}}-{\textbf {s}}\|.\tag{28}\end{equation*}$ Consider the widely used patch antenna in cases 2 and 3 as the receiver, whose surface is parallel with the $xy$ plane and whose center ${\textbf {p}}_{c}$ is on the $z$-axis. Then, the maximum phase variance can be computed by comparing the channel responses at the center and one vertex of the surface, respectively. Given the area $A_{\textrm {pat}}$, the coordinate of one vertex of $\mathcal A$ can be ${\textbf {p}}_{v}=[({\sqrt {A_{\textrm {pat}}}}/{2}),({\sqrt {A_{\textrm {pat}}}}/{2}),z_{p}]^{T}$. At the Rayleigh distance, the phase difference between $h_{\textrm {CR}}({\textbf {p}}_{c},{\textbf {s}})$ and $h_{\textrm {CR}}({\textbf {p}}_{v},{\textbf {s}})$ satisfies $\begin{equation*} |\angle h_{\textrm {CR}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right) - \angle h_{\textrm {CR}}\left ({{\textbf {p}}_{v},{\textbf {s}}}\right)| = \frac {\pi }{8}\tag{29}\end{equation*}$ which can be further rewritten as follows: $\begin{align*} \|{\textbf {p}}_{v}-{\textbf {s}}\| -\|{\textbf {p}}_{c}-{\textbf {s}}\|=\sqrt {\frac {A_{\textrm {pat}}}{2}+z_{p}^{2}} - |z_{p}| = \frac {\lambda }{2\pi }\cdot \frac {\pi }{8} = \frac {\lambda }{16}. \tag{30}\end{align*}$ Given that, for $|x|\ll 1$, $\begin{equation*} \sqrt {1+x} \approx 1+\frac {x}{2}\tag{31}\end{equation*}$ we have $\begin{equation*} \sqrt {\frac {A_{\textrm {pat}}}{2}+z_{p}^{2}} - |z_{p}| \approx \frac {A_{\textrm {pat}}}{4 |z_{p}|}.\tag{32}\end{equation*}$ By applying (30) and (32), we obtain $\begin{equation*} |z_{p}| = \frac {4 A_{\textrm {pat}}}{\lambda }.\tag{33}\end{equation*}$ The patch antenna can also be described by its aperture $D_{\textrm {pat}}$, which satisfies $D_{\textrm {pat}}^{2}=2A_{\textrm {pat}}$. Under this condition $\begin{equation*} |z_{p}| = \frac {2 D_{\textrm {pat}}^{2}}{\lambda }.\tag{34}\end{equation*}$ Therefore, the Rayleigh distance is calculated by $\begin{equation*} d_{\textrm {Rayleigh}} = \frac {4 A_{\textrm {pat}}}{\lambda } = \frac {2 D_{\textrm {pat}}^{2}}{\lambda }.\tag{35}\end{equation*}$
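
As a quick numerical check of the approximation in (31) and (32), the following sketch solves (30) exactly and compares the result with the closed form (35). The function names and the sample wavelength and patch area are illustrative assumptions and do not come from the article.

```python
import math


def rayleigh_distance_approx(area: float, wavelength: float) -> float:
    """Closed form (35): d = 4 * A_pat / lambda = 2 * D_pat^2 / lambda."""
    return 4.0 * area / wavelength


def rayleigh_distance_exact(area: float, wavelength: float) -> float:
    """Solve (30) exactly: sqrt(A/2 + z^2) - z = lambda / 16.

    Squaring sqrt(A/2 + z^2) = z + lambda/16 gives z = 4*A/lambda - lambda/32.
    """
    return 4.0 * area / wavelength - wavelength / 32.0


def path_difference(area: float, z: float) -> float:
    """Left-hand side of (30): extra path length to the vertex."""
    return math.sqrt(area / 2.0 + z * z) - z


# Illustrative values (assumed): 3.5 GHz carrier and a 1 m^2 patch.
wavelength = 3e8 / 3.5e9
area = 1.0
d_approx = rayleigh_distance_approx(area, wavelength)
d_exact = rayleigh_distance_exact(area, wavelength)
print(f"approx: {d_approx:.4f} m, exact: {d_exact:.4f} m")
```

Under these assumptions the two values differ only by $\lambda/32$, which suggests that the first-order approximation is accurate whenever the aperture is much larger than the wavelength.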

2) Lower Bound of Fresnel Region:

Papers [20], [21], and [23] introduced a lower bound of the Fresnel region, which is defined by the maximum amplitude variance of the channel response across the surface. Unlike the variance of the phase, which is captured by the difference, the variance of the amplitude is described by the ratio $\begin{equation*} \Gamma =\frac {\min _{{\textbf {p}}\in \mathcal {A}}|h_{\textrm {CR}}\left ({{\textbf {p}},{\textbf {s}}}\right)|} {\max _{{\textbf {p}}\in \mathcal {A}}|h_{\textrm {CR}}\left ({{\textbf {p}},{\textbf {s}}}\right)|}.\tag{36}\end{equation*}$ Denote the lower bound of the Fresnel region as $d_{\textrm {Fresnel}}$. At distance $d_{\textrm {Fresnel}}$, the amplitude ratio is equal to a threshold, i.e., $\Gamma =\Gamma _{\textrm {th}}\in (0,1)$. The value of $\Gamma _{\textrm {th}}$ can be $\cos ({\pi }/{8})$ [21], [23], or $0.9^{2}$ [20]. Below this threshold, the variance of amplitude is nonnegligible across the surface. In [23], $d_{\textrm {Fresnel}}$ was regarded as the boundary between the Fresnel region and near-field region. In [21], when $d_{\textrm {Fresnel}}< d_{\textrm {Rayleigh}}$ holds, the region between these two boundaries was named as the Fresnel region.

We still consider the patch antenna above. The amplitude of the channel response has different expressions when different models are applied. According to (2), (4), and (9) $\begin{align*}&\min _{{\textbf {p}}\in \mathcal {A}}|h_{\textrm {CR}}\left ({{\textbf {p}},{\textbf {s}}}\right)|=\left |{h_{\textrm {CR}}\left ({{\textbf {p}}_{v},{\textbf {s}}}\right)}\right | \\&\max _{{\textbf {p}}\in \mathcal {A}}|h_{\textrm {CR}}\left ({{\textbf {p}},{\textbf {s}}}\right)|=\left |{h_{\textrm {CR}}\left ({{\textbf {p}}_{c},{\textbf {s}}}\right)}\right |.\tag{37}\end{align*}$ Under channel response model 1 $\begin{equation*} |h_{\textrm {CR1}}\left ({{\textbf {p}},{\textbf {s}}}\right)| = \frac {1}{\sqrt {4\pi } \|{\textbf {p}}-{\textbf {s}}\|}\tag{38}\end{equation*}$ and we have $\begin{equation*} \frac {\|{\textbf {p}}_{c}-{\textbf {s}}\|}{\|{\textbf {p}}_{v}-{\textbf {s}}\|} =\frac {|z_{p}|}{\sqrt {\frac {A_{\textrm {pat}}}{2}+z_{p}^{2}}}=\Gamma _{\textrm {th}}.\tag{39}\end{equation*}$ By further applying (36), we obtain $\begin{align*} d_{\textrm {Fresnel, CR1}}=|z_{p}|=\sqrt {\frac {A_{\textrm {pat}}\Gamma _{\textrm {th}}^{2}}{2\left ({1- \Gamma _{\textrm {th}}^{2}}\right)}}=\frac {D_{\textrm {pat}}}{2}\sqrt {\frac {\Gamma _{\textrm {th}}^{2}}{1-\Gamma _{\textrm {th}}^{2}}}. \tag{40}\end{align*}$ If $\Gamma _{\textrm {th}}=\cos ({\pi }/{8})$, then $d_{\textrm {Fresnel, CR1}}\approx 1.2D_{\textrm {pat}}$ as given in [21], [23].
Under channel response model 2 $\begin{equation*} |h_{\textrm {CR2}}\left ({{\textbf {p}},{\textbf {s}}}\right)| = \sqrt {\frac {|\left ({{\textbf {p}}-{\textbf {s}}}\right)^{H}{\textbf {v}}_{\mathcal A}\left ({{\textbf {p}}}\right)|}{4\pi \|{\textbf {p}}-{\textbf {s}}\|^{3}}}.\tag{41}\end{equation*}$ Recalling ${\textbf {v}}_{\mathcal A}({\textbf {p}})={\textbf {u}}_{z}$, we derive that $\begin{equation*} \frac {\|{\textbf {p}}_{c}-{\textbf {s}}\|^{\frac {3}{2}}}{\|{\textbf {p}}_{v} -{\textbf {s}}\|^{\frac {3}{2}}}=\Gamma _{\textrm {th}}.\tag{42}\end{equation*}$ Compared with (39), $d_{\textrm {Fresnel}}$ under channel response model 2 satisfies $\begin{equation*} d_{\textrm {Fresnel, CR2}}=\sqrt {\frac {A_{\textrm {pat}}\Gamma _{\textrm {th}}^{\frac {4}{3}}}{2\left ({1-\Gamma _{\textrm {th}}^{\frac {4}{3}}}\right)}}=\frac {D_{\textrm {pat}}}{2}\sqrt {\frac {\Gamma _{\textrm {th}}^{\frac {4}{3}}}{1-\Gamma _{\textrm {th}}^{\frac {4}{3}}}}\tag{43}\end{equation*}$ as derived in [20]. Similarly, under channel response model 3, by directly applying (22), it can be obtained that $\begin{equation*} \frac {d_{\textrm {Fresnel, CR3}}^{\frac {3}{2}}\left ({\frac {A_{\textrm {pat}}}{4}+d_{\textrm {Fresnel, CR3}}^{2}}\right)^{\frac {1}{2}}} {\left ({\frac {A_{\textrm {pat}}}{2}+d_{\textrm {Fresnel, CR3}}^{2}}\right)^{\frac {5}{4}}}=\Gamma _{\textrm {th}}.\tag{44}\end{equation*}$ Generally, we have $\begin{equation*} d_{\textrm {Fresnel, CR3}} > d_{\textrm {Fresnel, CR2}} > d_{\textrm {Fresnel, CR1}}.\tag{45}\end{equation*}$
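
The three lower bounds can be compared numerically. The sketch below evaluates the closed forms (40) and (43) and solves the implicit equation (44) by bisection, which is a solver chosen here for illustration rather than a method taken from the cited works. The function names and the normalized aperture are assumptions.

```python
import math


def d_fresnel_cr1(aperture: float, gamma_th: float) -> float:
    """Closed form (40) for channel response model 1."""
    g = gamma_th ** 2
    return 0.5 * aperture * math.sqrt(g / (1.0 - g))


def d_fresnel_cr2(aperture: float, gamma_th: float) -> float:
    """Closed form (43) for channel response model 2."""
    g = gamma_th ** (4.0 / 3.0)
    return 0.5 * aperture * math.sqrt(g / (1.0 - g))


def amplitude_ratio_cr3(d: float, area: float) -> float:
    """Left-hand side of (44) as a function of the distance d."""
    num = d ** 1.5 * math.sqrt(area / 4.0 + d * d)
    den = (area / 2.0 + d * d) ** 1.25
    return num / den


def d_fresnel_cr3(aperture: float, gamma_th: float, iters: int = 200) -> float:
    """Solve (44) by bisection; the ratio increases monotonically from 0 to 1."""
    area = aperture ** 2 / 2.0  # D_pat^2 = 2 * A_pat
    lo, hi = 0.0, aperture
    while amplitude_ratio_cr3(hi, area) < gamma_th:
        hi *= 2.0  # expand the interval until the root is bracketed
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if amplitude_ratio_cr3(mid, area) < gamma_th:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma_th = math.cos(math.pi / 8.0)
aperture = 1.0  # normalized aperture (assumed), so results are in units of D_pat
d1 = d_fresnel_cr1(aperture, gamma_th)
d2 = d_fresnel_cr2(aperture, gamma_th)
d3 = d_fresnel_cr3(aperture, gamma_th)
print(f"CR1: {d1:.3f}, CR2: {d2:.3f}, CR3: {d3:.3f} (in units of D_pat)")
```

For $\Gamma _{\textrm {th}}=\cos ({\pi }/{8})$, the computed values follow the ordering in (45), and the CR1 value is close to $1.2D_{\textrm {pat}}$.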

The lower bound of the Fresnel region can be alternatively calculated since the concept of near field is not unique. A Fresnel distance which equals $0.62\sqrt {({D_{\textrm {pat}}^{3}}/{\lambda })}$ is defined as the lower bound of the Fresnel region [24], [25], and this distance was also regarded as the upper bound of the reactive near field in [21].

D. Field Partition of Array

The field partition of a single antenna can be extended to that of a multiantenna array [20], [21]. Consider a widely applied uniform planar array (UPA) at the receiver. The UPA is composed of $N_{h}\times N_{v}$ antennas, where $N_{h}$ and $N_{v}$ are the numbers of columns and rows, which are assumed to be even numbers. The distance between two horizontally or vertically adjacent antennas is $({\lambda }/{2})$. The UPA is parallel with the $xy$ plane. The center of the UPA is ${\textbf {p}}_{c}=[0, 0, d]^{T}$, where $d>0$ is the distance between the source and the UPA. In an extreme case that the antennas are seamlessly deployed as shown in Fig. 3, the entire antenna array can be regarded as a large patch antenna with area $A_{\textrm {UPA}}=N_{h} N_{v} A_{\textrm {pat}}$, where $A_{\textrm {pat}}=({\lambda ^{2}}/{4})$. Then, one vertex of the UPA is at $\begin{equation*} {\textbf {p}}_{v}=\left [{\frac {N_{h} \lambda }{4},\frac {N_{v} \lambda }{4},d }\right]^{T}.\tag{46}\end{equation*}$ The aperture of the UPA is $\begin{equation*} D_{\textrm {UPA}} = \frac {\sqrt {N_{h}^{2}+N_{v}^{2} }\lambda }{2}.\tag{47}\end{equation*}$

Fig. 3.

Example of a UPA and its geometrical relation with the source.

1) Rayleigh/Fraunhofer Distance:

The Rayleigh or Fraunhofer distance of the UPA is still defined by the maximum phase variance across the array, which equals $({\pi }/{8})$. Recalling (28) and (29), we can write that $\begin{align*} \|{\textbf {p}}_{v}-{\textbf {s}}\| -\|{\textbf {p}}_{c}-{\textbf {s}}\|=\sqrt {\left ({N_{h}^{2}+N_{v}^{2} }\right) \frac {\lambda ^{2}}{16} + d^{2}} - d = \frac {\lambda }{2\pi }\cdot \frac {\pi }{8}. \tag{48}\end{align*}$ Given (31) and after some derivations, we obtain the Rayleigh distance of the UPA as follows: $\begin{equation*} d_{\textrm {Rayleigh}} = \frac {\left ({N_{h}^{2}+N_{v}^{2} }\right) \lambda }{2} = \frac {2 D_{\textrm {UPA}}^{2}}{\lambda }\tag{49}\end{equation*}$ which is still determined by the aperture.

2) Lower Bound of Fresnel Region:

Following a similar approach as in the single-antenna case, we further study the lower bound of the Fresnel region of the UPA. Under channel response model 1, by applying (39), we get that $\begin{equation*} \frac {\|{\textbf {p}}_{c}-{\textbf {s}}\|}{\|{\textbf {p}}_{v}-{\textbf {s}}\|}=\frac {d}{\sqrt {\left ({N_{h}^{2}+N_{v}^{2} }\right)\frac {\lambda ^{2}}{16}+d^{2}}}=\Gamma _{\textrm {th}}.\tag{50}\end{equation*}$ Then, the lower bound of the Fresnel region is $\begin{align*} d_{\textrm {Fresnel, CR1}}=&\frac {\lambda }{4}\sqrt {\frac {\Gamma _{\textrm {th}}^{2}\left ({N_{h}^{2}+N_{v}^{2} }\right)}{\left ({1-\Gamma _{\textrm {th}}^{2}}\right)}} \\=&\frac {D_{\textrm {UPA}}}{2}\sqrt {\frac {\Gamma _{\textrm {th}}^{2}}{1-\Gamma _{\textrm {th}}^{2}}}.\tag{51}\end{align*}$ Comparing (51) with (40), we see that $d_{\textrm {Fresnel}}$ of the UPA can be easily calculated by applying the aperture $D_{\textrm {UPA}}$ to the expression of $d_{\textrm {Fresnel}}$ of an antenna. Thus, under channel response model 2, we have $\begin{equation*} d_{\textrm {Fresnel, CR2}}=\frac {D_{\textrm {UPA}}}{2}\sqrt {\frac {\Gamma _{\textrm {th}}^{\frac {4}{3}}}{1- \Gamma _{\textrm {th}}^{\frac {4}{3}}}}.\tag{52}\end{equation*}$

Both $d_{\textrm {Rayleigh}}$ and $d_{\textrm {Fresnel}}$ increase with the aperture $D_{\textrm {UPA}}$ . We take two typical examples in mobile communication systems to illustrate the partition of the radiative field under typical array apertures. The first example considers an $8\times 8$ UPA working at 3.5 GHz. The second example involves a $512\times 64$ UPA working at 28 GHz. Table I provides the values of $D_{\textrm {UPA}}$ , $d_{\textrm {Rayleigh}}$ , $d_{\textrm {Fresnel, CR1}}$ , and $d_{\textrm {Fresnel, CR2}}$ in the two examples under the condition of $\Gamma _{\textrm {th}}=\cos ({\pi }/{8})$ .

TABLE I

Examples of $d_{\textrm{Rayleigh}}$ and $d_{\textrm{Fresnel}}$

For Example 1, the aperture of a small-scale array is limited. Then, the values of $d_{\textrm {Rayleigh}}$ and $d_{\textrm {Fresnel}}$ are small. When the array is employed at a 5G new radio (NR) BS, whose serving cell has a width of 200 m, it is very likely that users in the cell are beyond the Rayleigh distance of the array, being in the far-field region. However, for Example 2, although the wavelength of a millimeter wave is small, the aperture of a $512\times 64$ UPA is much larger than the small-scale UPA in example 1. The Rayleigh distance of this extra-large array becomes $1.426\times 10^{3}$ m, which is much larger than the size of the serving cell. Then, users are no longer in the far-field region of the array. For some users, their distances from the UPA may even be smaller than $d_{\textrm {Fresnel}}$ .
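
The quantities discussed for the two examples can be recomputed from (47), (49), (51), and (52). The sketch below is a minimal illustration: the function name `upa_partition` and the rounded speed of light of $3\times 10^{8}$ m/s are assumptions made here, and the printed values are computed rather than copied from Table I.

```python
import math

C_LIGHT = 3e8  # speed of light in m/s (rounded value, assumed)


def upa_partition(n_h: int, n_v: int, freq_hz: float, gamma_th: float) -> dict:
    """Aperture (47), Rayleigh distance (49), and Fresnel bounds (51), (52)."""
    lam = C_LIGHT / freq_hz
    aperture = math.sqrt(n_h ** 2 + n_v ** 2) * lam / 2.0
    g1 = gamma_th ** 2
    g2 = gamma_th ** (4.0 / 3.0)
    return {
        "D_UPA": aperture,
        "d_Rayleigh": 2.0 * aperture ** 2 / lam,
        "d_Fresnel_CR1": 0.5 * aperture * math.sqrt(g1 / (1.0 - g1)),
        "d_Fresnel_CR2": 0.5 * aperture * math.sqrt(g2 / (1.0 - g2)),
    }


gamma_th = math.cos(math.pi / 8.0)
example_1 = upa_partition(8, 8, 3.5e9, gamma_th)      # 8 x 8 UPA at 3.5 GHz
example_2 = upa_partition(512, 64, 28e9, gamma_th)    # 512 x 64 UPA at 28 GHz
for name, ex in (("Example 1", example_1), ("Example 2", example_2)):
    print(name, {k: round(v, 3) for k, v in ex.items()})
```

With these assumptions, the Rayleigh distance of the second example evaluates to roughly $1.426\times 10^{3}$ m, in line with the value quoted above, while that of the first example stays far below the 200 m cell width.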

E. Modeling of Channel Between Source and Array

Now, we study the channel model between the point source and the UPA. Denote by ${\textbf {H}}\in \mathcal {C}^{N_{h} \times N_{v}}$ the channel matrix and by $h(n_{h}, n_{v})$ the channel on the $(n_{h}, n_{v})$th antenna, i.e., $\begin{align*} {\textbf {H}} = \left [{ \begin{matrix} h\left ({-\frac {N_{h}}{2}, -\frac {N_{v}}{2}}\right) & \cdots & h\left ({\frac {N_{h}}{2}-1,-\frac {N_{v}}{2}}\right) \\ \vdots & \ddots & \vdots \\ h\left ({-\frac {N_{h}}{2}, \frac {N_{v}}{2}-1}\right) & \cdots & h\left ({\frac {N_{h}}{2}-1, \frac {N_{v}}{2}-1}\right) \\ \end{matrix} }\right].\tag{53}\end{align*}$ Based on the distance between the source and the UPA, which is denoted by $d$, three models of $h(n_{h}, n_{v})$ can be derived [20], [30].

1) Channel Model 1:

This model is for the region $d < d_{\textrm {Fresnel}}$. Considering that in this region, the channel’s amplitude and phase variations are nonnegligible across the array, $h(n_{h}, n_{v})$ in model 1 will have different amplitude and phase expressions for different $(n_{h}, n_{v})$. That is to say $\begin{equation*} h_{\textrm {CM1}}\left ({n_{h}, n_{v}}\right) = |h\left ({n_{h}, n_{v}}\right)| e^{j \angle h\left ({n_{h}, n_{v}}\right)}\tag{54}\end{equation*}$ which can be obtained by applying the geometrical information of antenna $(n_{h}, n_{v})$ in (10). Channel model 1 is referred to as the spherical wave channel model.

2) Channel Model 2:

This model is for the region of $d_{\textrm {Fresnel}} \le d \le d_{\textrm {Rayleigh}}$, where the variance of amplitude is negligible across the array. The same amplitude $|h(n_{h}, n_{v})|$ is shared by all $(n_{h}, n_{v})$ and is denoted by $|h|$, whose value can be taken from any antenna of the array. Here, we select $|h|=|h(0,0)|$. Model 2 of $h(n_{h}, n_{v})$ is then expressed as follows: $\begin{equation*} h_{\textrm {CM2}}\left ({n_{h}, n_{v}}\right) = |h| e^{j \angle h\left ({n_{h}, n_{v}}\right)}.\tag{55}\end{equation*}$ Channel model 2 is referred to as the reduced spherical wave channel model.

3) Channel Model 3:

This model is for the region of $d > d_{\textrm {Rayleigh}}$, where the variations of amplitude and phase are both negligible across the array. A common $|h(n_{h}, n_{v})|$ is still applied here. Moreover, a uniform value $\angle h$ is shared by all $(n_{h}, n_{v})$. Similarly, we set $\angle h = \angle h(0,0)$. Model 3 of $h(n_{h}, n_{v})$ is written as follows: $\begin{equation*} h_{\textrm {CM3}}\left ({n_{h}, n_{v}}\right) = |h| e^{j \angle h}.\tag{56}\end{equation*}$ We should note that according to (56), all the antennas of the UPA experience the same channel with no difference among them. This stems from the UPA orientation, which is parallel with the $xy$ plane, while its center is ${\textbf {p}}_{c}=[0, 0, d]^{T}$. That is to say, the source is exactly on the normal line of the UPA which goes across the UPA center. Then, no path difference exists when the wave arrives at different antennas, and thus no phase difference is introduced among the channels on these antennas, as illustrated in Fig. 4(a).

Fig. 4.

Plane wave arrives at an array. (a) No phase difference exists across the array. (b) Phase difference is introduced across the array.

The model in (56) also means that the incident wave seen by each antenna comes from the same direction. That is, a plane wave instead of a spherical wave is experienced at the UPA. Hence, channel model 3 is referred to as the plane wave channel model. Consider a more general case that the UPA is parallel with the $xy$ plane and ${\textbf {p}}_{c}=[x_{c}, y_{c}, z_{c}]^{T}$. As shown in Fig. 3, the included angle between the incident wave and a column of the UPA is $\begin{equation*} \theta = \arccos \frac {y_{c}}{\sqrt {x_{c}^{2}+ y_{c}^{2}+ z_{c}^{2}}}.\tag{57}\end{equation*}$ The included angle between the projection of the incident wave on the UPA and a row of the UPA is $\begin{equation*} \phi = \arccos \frac {x_{c}}{\sqrt {x_{c}^{2}+ y_{c}^{2}}}.\tag{58}\end{equation*}$ The position of the $(n_{h},n_{v})$th antenna is $\begin{align*} {\textbf {p}}_{n_{h},n_{v}}=&\left [{ x_{c}+\frac {2n_{h}+1}{4}\lambda, y_{c}+\frac {2n_{v}+1}{4}\lambda, z_{c}}\right]^{T} \\ n_{h}=&-\frac {N_{h}}{2},\ldots,\frac {N_{h}}{2}-1, n_{v} = -\frac {N_{v}}{2},\ldots,\frac {N_{v}}{2}-1.\tag{59}\end{align*}$ Model 3 of $h(n_{h}, n_{v})$ becomes $\begin{equation*} h_{\textrm {CM3}}\left ({n_{h}, n_{v}}\right) = |h| e^{j \left ({\angle h+\Delta \phi \left ({n_{h}, n_{v}}\right) }\right)}\tag{60}\end{equation*}$ under the plane incident wave from direction $(\theta,\phi)$, where $\begin{equation*} \Delta \phi \left ({n_{h}, n_{v}}\right)=\pi \left ({n_{h} \cos \theta + n_{v} \sin \theta \cos \phi }\right)\tag{61}\end{equation*}$ is the difference between $\angle h(n_{h}, n_{v})$ and $\angle h$ caused by the path difference shown in Fig. 4(b). If $x_{c}= y_{c}=0$, then $\theta =\phi =({\pi }/{2})$ and $\Delta \phi (n_{h}, n_{v})=0$, which corresponds to the case in (56).

Channel model 3 has been widely applied in the fourth-generation and 5G systems, since the aperture of antenna arrays is not large and users are in the far-field region of the array [31], [32], [33]. However, in 6G systems, extra large-aperture arrays, such as example 2 in Table I, will be employed. Then, users probably fall in the near-field region or the Fresnel region, and channel models 1 and 2 should be utilized. The presence of spherical waves instead of plane waves is one of the major unique characteristics of extra large-scale MIMO systems. Thus, far-field channel models will become inaccurate in the practical near or Fresnel field [11].
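
To make the three channel models concrete, the following sketch builds $h(n_{h}, n_{v})$ for a source on the normal line of the UPA, using the amplitude of channel response model 1 in (38), the phase in (28), and the antenna positions in (59). The function names, the array size, and the wavelength are illustrative assumptions; only the broadside form (56) of channel model 3 is covered.

```python
import numpy as np


def upa_positions(n_h: int, n_v: int, lam: float, center: np.ndarray) -> np.ndarray:
    """Antenna positions following (59); output shape is (n_v, n_h, 3)."""
    ih = np.arange(-n_h // 2, n_h // 2)
    iv = np.arange(-n_v // 2, n_v // 2)
    x = center[0] + (2 * ih + 1) / 4.0 * lam
    y = center[1] + (2 * iv + 1) / 4.0 * lam
    xx, yy = np.meshgrid(x, y)  # rows follow n_v, columns follow n_h, as in (53)
    zz = np.full_like(xx, center[2])
    return np.stack([xx, yy, zz], axis=-1)


def channel_models(n_h: int, n_v: int, lam: float, center: np.ndarray):
    """Return (CM1, CM2, CM3) for a point source at the origin."""
    dist = np.linalg.norm(upa_positions(n_h, n_v, lam, center), axis=-1)
    amp = 1.0 / (np.sqrt(4.0 * np.pi) * dist)  # amplitude as in (38)
    phase = -2.0 * np.pi * dist / lam          # phase as in (28)
    ref = (n_v // 2, n_h // 2)                 # array index of antenna (0, 0)
    cm1 = amp * np.exp(1j * phase)             # spherical wave model (54)
    cm2 = amp[ref] * np.exp(1j * phase)        # reduced spherical wave model (55)
    cm3 = np.full(dist.shape, amp[ref] * np.exp(1j * phase[ref]))  # plane wave (56)
    return cm1, cm2, cm3


lam = 0.01                                     # 1 cm wavelength (assumed)
n_h = n_v = 8
d_rayleigh = (n_h ** 2 + n_v ** 2) * lam / 2.0  # Rayleigh distance (49)
cm1, cm2, cm3 = channel_models(n_h, n_v, lam, np.array([0.0, 0.0, d_rayleigh]))
phase_err = np.max(np.abs(np.angle(cm1 * np.conj(cm3))))
print(f"max phase mismatch of CM3 at d_Rayleigh: {phase_err:.3f} rad")
```

At the Rayleigh distance, the largest phase mismatch between the plane wave and the spherical wave model stays below $({\pi }/{8})$ in this setup, consistent with the way the boundary is defined.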

SECTION III.

Visibility Region

When a user is very close to an extra large-aperture array, most of the channel power can be captured by only a part of the array. This part of the array is referred to as the visibility region (VR) of the user w.r.t. the array. The VR is another key characteristic in extra large-scale MIMO systems [3], [6], [34], [35]. In this section, we study the origins, definition, and modeling of the VR.

A. Origins of the VR

The VR reflects the uneven distribution of the channel power over the array. There are two major causes behind the creation of the VR [6]. One is the unequal path loss across different antennas of the array. The other is the blockage stemming from the obstacles between the user and the array.

1) Unequal Path Loss:

When the distance between a user and the array is below $d_{\textrm {Fresnel}}$, channel model 1 should be applied. Under this condition, $|h(n_{h}, n_{v})|$, which reflects the path loss on the antenna array, varies significantly with $(n_{h}, n_{v})$. Let us revisit the UPA in Fig. 3, which is parallel with the $xy$ plane while its center is at ${\textbf {p}}_{c}=[0, 0, d]^{T}$, satisfying $d< d_{\textrm {Fresnel}}$. The source $\bf s$ is still located at the origin of the coordinate system. We analyze the value of $|h(n_{h}, n_{v})|$ across the UPA under the three cases in Section II-B. For antenna $(n_{h}, n_{v})$ in case 1, by applying ${\textbf {p}}_{n_{h},n_{v}}$ in (12), we obtain $\begin{equation*} |h_{\textrm {case 1}}\left ({n_{h}, n_{v}}\right)| \propto \frac {1}{\|{\textbf {p}}_{n_{h},n_{v}}-{\textbf {s}}\|}.\tag{62}\end{equation*}$ With $d$ fixed and given (59), $\|{\textbf {p}}_{n_{h},n_{v}}-{\textbf {s}}\|$ has the minimum value at $n_{h}=n_{v}=0$ and the maximum value at $n_{h}=-({N_{h}}/{2})$, $n_{v}=-({N_{v}}/{2})$, which are the center and the vertex of the UPA, respectively. The minimum and the maximum values of $|h_{\textrm {case 1}}(n_{h}, n_{v})|$ satisfy $\begin{equation*} \frac {\min |h_{\textrm {case 1}}\left ({n_{h}, n_{v}}\right)|}{\max |h_{\textrm {case 1}}\left ({n_{h}, n_{v}}\right)|} = \frac {\|{\textbf {p}}_{c}-{\textbf {s}}\|}{\|{\textbf {p}}_{v}-{\textbf {s}}\|} = \frac {d}{\sqrt {d^{2}+\frac {D_{\textrm {UPA}}^{2}}{4}}}.\tag{63}\end{equation*}$ Recalling (51) and Table I, we know that the value of $d_{\textrm {Fresnel}}$ is close to $D_{\textrm {UPA}}$. If $d=D_{\textrm {UPA}}$, then the ratio equals 0.89. With the decrease of $d$, the ratio drops. For an extra large-scale array which widely spreads over a long wall, it is possible that $d=D_{\textrm {UPA}}/10$. Then, the ratio approximates 0.2.
If the source moves on the $xy$ plane, then the ratio of the minimum and the maximum of $|h_{\textrm {case 1}}(n_{h}, n_{v})|$ is further reduced.
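
The two ratios quoted above follow directly from (63). A minimal sketch, with a hypothetical function name and a normalized aperture, is given below.

```python
import math


def min_max_amplitude_ratio(d: float, aperture: float) -> float:
    """Ratio (63) between the weakest (vertex) and strongest (center) amplitude."""
    return d / math.sqrt(d * d + aperture * aperture / 4.0)


aperture = 1.0  # normalized D_UPA (assumed); only the ratio d / D_UPA matters
for scale in (1.0, 0.5, 0.1):
    ratio = min_max_amplitude_ratio(scale * aperture, aperture)
    print(f"d = {scale:.1f} * D_UPA -> ratio = {ratio:.2f}")
```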

A similar phenomenon can be observed when the UPA is customized as described in cases 2 and 3. Considering that $\begin{align*}&0< F\left ({{\textbf {p}}_{v},{\textbf {s}}}\right)< F\left ({{\textbf {p}}_{c},{\textbf {s}}}\right)\le 1\tag{64}\\&0< \eta \left ({{\textbf {p}}_{v},{\textbf {s}}}\right)< \eta \left ({{\textbf {p}}_{c},{\textbf {s}}}\right)\le 1\tag{65}\end{align*}$ we have $\begin{align*} \frac {\min |h_{\textrm {case 3}}\left ({n_{h}, n_{v}}\right)|}{\max |h_{\textrm {case 3}}\left ({n_{h}, n_{v}}\right)|}\le&\frac {\min |h_{\textrm {case 2}}\left ({n_{h}, n_{v}}\right)|}{\max |h_{\textrm {case 2}}\left ({n_{h}, n_{v}}\right)|} \\\le&\frac {\min |h_{\textrm {case 1}}\left ({n_{h}, n_{v}}\right)|}{\max |h_{\textrm {case 1}}\left ({n_{h}, n_{v}}\right)|}.\tag{66}\end{align*}$ This phenomenon can be observed in Fig. 5, where the source is very close to the center of the array. We see that in case 3, the value of $|h(n_{h}, n_{v})|$ is significantly larger for smaller $|n_{h}|$. In an extreme case that the ratio of the minimum and the maximum of $|h(n_{h}, n_{v})|$ approaches zero, the channel power on a proportion of antennas in the array is negligible; for example, the first and the last columns of the array in case 3 of Fig. 5. The channel power can be captured by a proportion of antennas in the array. Particularly, the channel power is concentrated on the antennas that are close to the source. The channels on the antennas that are much farther from the source are significantly weaker.

Fig. 5.

Comparison of $({|h(n_{h}, n_{v})|}/[{\max |h(n_{h}, n_{v})|}])$ in antenna cases 1, 2, and 3 when $N_{h}=512$ , $N_{v}=64$ , and $d=100\lambda$ .

2) Blockage Due to Obstacles:

An extra large-aperture array can be widely spread on the wall of a building in an urban city. Then, $D_{\textrm {UPA}}$ will be large. Users are usually crowded in an urban environment and may be very close to the array. Trees, cars, and infrastructures can be seen everywhere and can all be possible obstacles in the channel between the array and a certain user.

Unlike in the far field, where the entire channel is blocked, in the near field or the Fresnel region, only a part of the array may be blocked. The blocked part of the array is determined by the geometry of the array, user, and obstacle, as shown in Fig. 6. Assume that only a line-of-sight (LoS) path exists. For antenna $(n_{h}, n_{v})$ , if the line between ${\textbf {p}}_{n_{h},n_{v}}$ and ${\textbf {s}}$ goes across the obstacle, then the channel on antenna $(n_{h}, n_{v})$ is blocked, i.e., $|h(n_{h}, n_{v})|=0$ . The blocked part of the array also reflects the shape of the obstacle. As illustrated in Fig. 6, the blocked subarrays in gray color take identical patterns with the tree and the car, respectively. The uneven channel power distribution caused by blockage is independent of that resulting from the unequal path loss.
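
The geometric blockage rule can be sketched as a segment-obstacle intersection test. The example below assumes a single spherical obstacle, which is a simplification introduced here for illustration, together with hypothetical values for the array size, wavelength, and obstacle placement. An antenna is marked as blocked when the line segment from the source to that antenna passes through the sphere.

```python
import numpy as np


def blocked_mask(positions: np.ndarray, source: np.ndarray,
                 obstacle_center: np.ndarray, obstacle_radius: float) -> np.ndarray:
    """True where the LoS segment from the source to an antenna crosses the sphere."""
    seg = positions - source                      # one LoS segment per antenna
    to_obs = obstacle_center - source
    seg_len2 = np.sum(seg * seg, axis=-1)
    # Parameter of the point on each segment closest to the obstacle center.
    t = np.clip(np.sum(seg * to_obs, axis=-1) / seg_len2, 0.0, 1.0)
    closest = source + t[..., None] * seg
    dist = np.linalg.norm(closest - obstacle_center, axis=-1)
    return dist <= obstacle_radius


lam, d = 0.1, 5.0                       # wavelength and array distance in m (assumed)
idx = np.arange(-8, 8)                  # 16 x 16 UPA indices as in (59)
coords = (2 * idx + 1) / 4.0 * lam
xx, yy = np.meshgrid(coords, coords)
positions = np.stack([xx, yy, np.full_like(xx, d)], axis=-1)
source = np.zeros(3)

dist = np.linalg.norm(positions - source, axis=-1)
h = np.exp(-2j * np.pi * dist / lam) / (np.sqrt(4.0 * np.pi) * dist)
mask = blocked_mask(positions, source, np.array([0.1, 0.0, 2.5]), 0.05)
h[mask] = 0.0                           # blocked LoS channels carry no power
print(f"{mask.sum()} of {mask.size} antennas are blocked")
```

Because the obstacle sits halfway between the source and the array in this setup, its shadow on the array is roughly twice its own size, which illustrates how the blocked subarray reflects the shape of the obstacle.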

Fig. 6.

Blockage of the channels on part of the array caused by obstacles, such as trees and cars. Red and gray squares represent the antennas whose channels are connected and blocked, respectively.

B. Definition of the VR

Uneven power distribution across the array is a new channel feature that appears when extra large-aperture arrays are employed in wireless systems. Then, the VR of a user w.r.t. the array is introduced to model the uneven channel power distribution [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45]. Actually, the VR is not a novel concept. In this section, we introduce different VR categories.

1) VR of User w.r.t. the Array:

In the literature, the VR of a user w.r.t. the array is defined as the part of the array that captures the biggest proportion of the channel power over the entire array [6], [34], [35], [36]. It reflects the sparsity of a user channel in the antenna domain. Denote the VR of a user w.r.t. the array as $\Phi _{\textrm {UA}}$. Then, $\Phi _{\textrm {UA}}$ is a set that contains the indices of antennas that the channel power of this user is concentrated on. The following property holds: $\begin{equation*} \frac {\sum _{\left ({n_{h}, n_{v}}\right)\in \Phi _{\textrm {UA}}} |h\left ({n_{h}, n_{v}}\right)|^{2}}{\|{\textbf {H}}\|^{2}_{F}} \ge \zeta\tag{67}\end{equation*}$ where $0< \zeta \le 1$ is a threshold with value close to 1. Note that $\Phi _{\textrm {UA}}$ contains the minimum number of antennas that satisfy the requirement of (67). The size of $\Phi _{\textrm {UA}}$ is denoted by $|\Phi _{\textrm {UA}}|$.

We first consider the channel under unequal path loss but without blockage. The VR caused simply by a spherical wavefront covers a continuous part of the array. Recall the example in Fig. 5, where $\Phi _{\textrm {UA}}$ covers the middle part of the UPA. Note that $\zeta =1$ only holds when $\Phi _{\textrm {UA}}$ covers the entire array. However, when we set $\zeta < 1$, we can still find a proper $\Phi _{\textrm {UA}}$ to achieve (67). The $\Phi _{\textrm {UA}}$ obtained here corresponds to a sliding window that covers the antennas capturing a fraction $\zeta$ of the channel power. Antennas outside the window still receive nonzero power but are excluded from $\Phi _{\textrm {UA}}$.

The VR is more evident when a blockage occurs. At the first antenna in the blocked subarray, a sharp decrease of the channel power can be observed. Consider now the LoS channel case without any non-LoS (NLoS) paths. Then, the channel power on each antenna in the blocked subarray is zero. In (67), $\zeta =1$ can be achieved even though $|\Phi _{\textrm {UA}}|< N_{h} N_{v}$. Under this condition, $\Phi _{\textrm {UA}}$ contains the antennas that are not blocked by obstacles. If we set $\zeta < 1$, then the size of $\Phi _{\textrm {UA}}$ can be even smaller by discarding the antennas with the smallest power. Notably, a VR caused by blockage may not be continuous. One obstacle may block the channels on a continuous subarray. If the blocked subarray is at the array center, then the VR will not be continuous. If multiple obstacles exist, the VR may be composed of several discontinuous subarrays.
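
One way to obtain a set that meets (67) with the minimum number of antennas is to rank the antennas by power and keep the strongest ones until the captured share reaches $\zeta$. The sketch below follows this greedy rule; the function name and the toy channel are assumptions, and the resulting set is not forced to be a contiguous window.

```python
import numpy as np


def visibility_region(h: np.ndarray, zeta: float) -> np.ndarray:
    """Boolean mask of a smallest antenna set whose power share is at least zeta.

    Assumes that the channel is not identically zero.
    """
    power = np.abs(h).ravel() ** 2
    order = np.argsort(power)[::-1]          # strongest antennas first
    cum = np.cumsum(power[order])
    share = cum / cum[-1]                    # captured fraction of the total power
    count = int(np.searchsorted(share, zeta)) + 1   # first index reaching zeta
    count = min(count, power.size)
    mask = np.zeros(power.size, dtype=bool)
    mask[order[:count]] = True
    return mask.reshape(h.shape)


h = np.array([[0.0, 3.0, 4.0, 0.0]])  # toy LoS channel; zeros mark blocked antennas
print(visibility_region(h, 1.0))      # unblocked antennas only
print(visibility_region(h, 0.6))      # strongest antenna alone holds 64% of the power
```

In the toy example, $\zeta =1$ is reached by the two unblocked antennas alone, matching the blockage discussion above, while $\zeta =0.6$ keeps only the strongest antenna.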

2) Two-Tier VRs:

So far, we have focused on the case that only the LoS path exists in the channel. In practice, the wireless propagation environment is composed of various scatterers. Signals can be scattered and then arrive at the array along NLoS paths as well. Unlike the obstacles that block the signal propagation, scatterers provide new propagation paths and act as intermediate nodes. Then, the one-tier user–array channel becomes a two-tier user–scatterer and scatterer–array channel. Accordingly, the VR of a user w.r.t. the array is further partitioned by the VR of a scatterer w.r.t. the array and the VR of a user w.r.t. the scatterers [6], [39], [46], [47], [48], [49].

The scatterers are usually grouped into multiple clusters. Each cluster includes one or multiple neighboring scatterers. Scatterers in a cluster see the same antennas in the array and can be simultaneously observed by a user. The VR of a cluster w.r.t. the array, denoted by $\Phi _{\textrm {CA}}$ , contains the antennas that can be seen by the cluster. This definition is similar to that of the VR of a user w.r.t. the array. A cluster here corresponds to the user above; $\Phi _{\textrm {CA}}$ is also named as the cluster VR and is assumed to cover a continuous subarray [47]. The central antenna in $\Phi _{\textrm {CA}}$ has the highest channel power [47]. Furthermore, if $\Phi _{\textrm {CA}}$ includes all the array antennas, then the scatterers in this cluster are referred to as entirely visible scatterers; otherwise, they are referred to as partially visible scatterers [46].

The VR of a user w.r.t. the clusters, denoted by $\Phi _{\textrm {UC}}$, contains the clusters that can be seen by the user. This is similar to the original concept of VR in COST channel models, which refers to a geometric region where the same set of scatterer clusters can be seen if the user is in this region [50]. If the user moves to another position, then the clusters that can be seen by this user vary. Note that $\Phi _{\textrm {UC}}$ is also named as user VR in [47].

By cascading the two-tier VRs, the VR of a user w.r.t. the array can be obtained. For user $k$, its one-tier VR and two-tier VRs have the following relation: $\begin{equation*} \Phi _{{\textrm {UA}},k} = \bigcup _{c\in \Phi _{{\textrm {UC}},k}} \Phi _{{\textrm {CA}},c}\tag{68}\end{equation*}$ where the one-tier VR $\Phi _{{\textrm {UA}},k}$ denotes the VR of user $k$ w.r.t. the array, while the two-tier VRs $\Phi _{{\textrm {UC}},k}$ and $\Phi _{{\textrm {CA}},c}$ represent the VR of user $k$ w.r.t. the clusters and the VR of cluster $c$ w.r.t. the array, respectively. As an example in Fig. 7, $\Phi _{{\textrm {UC}},1}=\{1\}$ and $\Phi _{{\textrm {UC}},2}=\{2,3\}$. Thus, we have $\Phi _{{\textrm {UA}},1}=\Phi _{{\textrm {CA}},1}$ and $\Phi _{{\textrm {UA}},2}=\Phi _{{\textrm {CA}},2}\cup \Phi _{{\textrm {CA}},3}$.
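
The cascade in (68) is a plain set union, as the short sketch below illustrates. The cluster-to-user assignment mirrors the Fig. 7 example, whereas the antenna index sets of the three clusters are hypothetical values chosen only for demonstration.

```python
def user_array_vr(user_clusters, cluster_vr):
    """Cascade the two tiers as in (68): union of Phi_CA over the clusters in Phi_UC."""
    antennas = set()
    for c in user_clusters:
        antennas |= cluster_vr[c]
    return antennas


# Hypothetical antenna index sets (linear indexing of the array) for three clusters.
phi_ca = {1: set(range(0, 8)), 2: set(range(20, 30)), 3: set(range(26, 40))}
phi_uc = {1: {1}, 2: {2, 3}}   # Phi_UC of users 1 and 2, as in the Fig. 7 example
phi_ua = {k: user_array_vr(clusters, phi_ca) for k, clusters in phi_uc.items()}
print({k: sorted(v) for k, v in phi_ua.items()})
```

Overlapping cluster VRs, such as those of clusters 2 and 3 here, are counted once in the user VR.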

Fig. 7.

Example of the two-tier VRs. A circular subarray in a particular color represents the VR of the cluster in the same color.

C. Channel Modeling With VR

Now, we investigate the modeling of a channel with VR. According to (67), the VR captures almost all of the channel power. To simplify the expression, only the channel inside the VR is modeled as nonzero, and the channel outside the VR is assumed to be zero. Several channel models capture this new feature of the VR [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [49], where the VR is described in different ways.

1) Channel Covariance Matrix With VR:

A channel covariance matrix reflects the statistical covariance of channels across different antennas. It has been widely applied in the modeling of multiantenna channels. When the channel experiences correlated Rayleigh fading, the channel between the single-antenna user $k$ and the $N$-dimensional array can be modeled as [51], [52], [53] $\begin{equation*} {\textbf {h}}_{k} \sim \mathcal {CN}\left ({{\textbf {0}}, {\textbf {R}}_{k}}\right)\tag{69}\end{equation*}$ where ${\textbf {h}}_{k}\in \mathbb {C}^{N\times 1}$ is the multiantenna complex channel with zero mean, and ${\textbf {R}}_{k} \in \mathbb {C}^{N\times N}$ is the channel covariance matrix satisfying $\begin{equation*} {\textbf {R}}_{k} = \mathbb {E}\left \{{{\textbf {h}}_{k} {\textbf {h}}_{k}^{H}}\right \}.\tag{70}\end{equation*}$ This model is equivalent to $\begin{equation*} {\textbf {h}}_{k} = {\textbf {R}}_{k}^{\frac {1}{2}}{\textbf {h}}_{w,k}\tag{71}\end{equation*}$ where ${\textbf {h}}_{w,k}\in \mathbb {C}^{N\times 1}$ is the small-scale fading coefficient vector, whose entries are independent, identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance. In traditional multiantenna systems, all the diagonal entries of ${\textbf {R}}_{k}$ are nonzero. However, if the VR is introduced, then only the diagonal entries in the VR are nonzero [35], [36], [37], [38], [40], [41], [42], [44]. Moreover, ${\textbf {R}}_{k}$ shows block sparsity. For $n_{1}, n_{2} \in [1,N]$, if $n_{1} \notin \Phi _{{\textrm {UA}},k}$ or $n_{2} \notin \Phi _{{\textrm {UA}},k}$, then ${\textbf {R}}_{k}(n_{1},n_{2})=0$.
In a typical case that $\Phi _{{\textrm {UA}},k}$ covers a continuous region, ${\textbf {R}}_{k}$ has the following structure: $\begin{align*} {\textbf {R}}_{k} = \begin{bmatrix} {\textbf {0}} & & \\ & {\textbf {R}}_{{\textrm {UA}},k} & \\ & & {\textbf {0}} \\ \end{bmatrix}\tag{72}\end{align*}$ where ${\textbf {R}}_{{\textrm {UA}},k}\in \mathbb {C}^{|\Phi _{{\textrm {UA}},k}| \times |\Phi _{{\textrm {UA}},k}|}$ is the covariance submatrix with nonzero entries. Given the block sparsity of ${\textbf {R}}_{k}$, the channel model (71) can be further rewritten as follows: $\begin{equation*} {\textbf {h}}_{k} = {\textbf {D}}_{{\textrm {UA}},k}{\textbf {R}}_{{\textrm {UA}},k}^{\frac {1}{2}} {\textbf {h}}_{{\textrm {UA}},w,k}\tag{73}\end{equation*}$ where ${\textbf {D}}_{{\textrm {UA}},k}\in \{0,1\}^{N \times |\Phi _{{\textrm {UA}},k}|}$ is a selection matrix that selects the antennas in $\Phi _{{\textrm {UA}},k}$, while the dimension of ${\textbf {h}}_{{\textrm {UA}},w,k}$ is $|\Phi _{{\textrm {UA}},k}|\times 1$.
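
The structure of (72) and (73) can be illustrated with a small numerical sketch. The exponential correlation model for ${\textbf {R}}_{{\textrm {UA}},k}$, the array size, the VR indices, and all function names below are assumptions made for demonstration and are not taken from the cited works.

```python
import numpy as np


def psd_sqrt(mat: np.ndarray) -> np.ndarray:
    """Hermitian square root of a positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative eigenvalues from rounding
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def selection_matrix(n: int, vr: np.ndarray) -> np.ndarray:
    """Selection matrix D_UA,k in {0,1}^(N x |VR|): column j picks antenna vr[j]."""
    d_sel = np.zeros((n, vr.size))
    d_sel[vr, np.arange(vr.size)] = 1.0
    return d_sel


def sample_vr_channel(d_sel: np.ndarray, r_ua: np.ndarray,
                      rng: np.random.Generator) -> np.ndarray:
    """Draw h_k = D_UA,k R_UA,k^(1/2) h_UA,w,k as in (73)."""
    m = r_ua.shape[0]
    h_w = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2.0)
    return d_sel @ psd_sqrt(r_ua) @ h_w


n = 16
vr = np.arange(5, 11)                          # contiguous VR indices (assumed)
lags = np.abs(vr[:, None] - vr[None, :])
r_ua = 0.7 ** lags                             # exponential correlation (assumed)
d_sel = selection_matrix(n, vr)
r_full = d_sel @ r_ua @ d_sel.T                # block-sparse R_k as in (72)
h = sample_vr_channel(d_sel, r_ua, np.random.default_rng(0))
print(np.round(np.abs(h), 3))
```

Only the six entries of the sampled channel that lie inside the VR are nonzero, and the full covariance matrix has a single nonzero diagonal block, as described by (72).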

When scatterer clusters are further considered, the scatterers can be regarded as a virtual antenna array [46]. In traditional multiantenna systems, the covariance matrix-based scattering channel model is [52] $\begin{equation*} {\textbf {h}}_{k} = {\textbf {R}}_{A}^{\frac {1}{2}} {\textbf {H}}_{w} {\textbf {R}}_{\textrm {S}}^{\frac {1}{2}} {\textbf {h}}_{w,k}\tag{74}\end{equation*}$ where ${\textbf {R}}_{A}\in \mathbb {C}^{N\times N}$ and ${\textbf {R}}_{\textrm {S}}\in \mathbb {C}^{S\times S}$ are the covariance matrices at the array and the scatterer side, respectively, $S$ is the number of scatterers, and ${\textbf {H}}_{w}\in \mathbb {C}^{N\times S}$ and ${\textbf {h}}_{w,k}\in \mathbb {C}^{S\times 1}$ are small-scale fading matrices (vectors). In extra large-aperture array systems, since different clusters of scatterers have different VRs, (74) needs to be rewritten. Assume that the number of clusters is $C$. In cluster $c$, there are $S_{c}$ scatterers, satisfying $\sum _{c=1}^{C} S_{c} = S$. The total number of scatterers that can be seen by user $k$ is ${\tilde S}_{k} = \sum _{c\in \Phi _{{\textrm {UC}},k}} S_{c}$.
By cascading the array–scatterer channel and the scatterer-user channel together, the channel between the array and user $k$ is expressed as follows: $\begin{equation*} {\textbf {h}} = \left [{ {\textbf {G}}_{1},\ldots,{\textbf {G}}_{C} }\right] {\textbf {R}}_{\textrm {S}}^{\frac {1}{2}} {\textbf {D}}_{{\textrm {UC}},k} {\textbf {h}}_{w,k}\tag{75}\end{equation*}$ where ${\textbf {D}}_{{\textrm {UC}},k}\in \{0,1\}^{S \times {\tilde S}_{k}}$ is the selection matrix that selects the scatterers that can be seen by user $k$ , ${\textbf {h}}_{w,k}\in \mathbb {C}^{\tilde S_{k} \times 1}$ is the small-scale fading vector, while $\begin{equation*} {\textbf {G}}_{c} = {\textbf {D}}_{{\textrm {CA}},c}{\textbf {R}}_{{\textrm {CA}},c}^{\frac {1}{2}} {\textbf {H}}_{w,c} \in \mathbb {C}^{N\times S_{c}}\tag{76}\end{equation*}$ is the separate channel between the array and cluster $c$ , ${\textbf {D}}_{{\textrm {CA}},c}\in \{0,1\}^{N \times |\Phi _{{\textrm {CA}},c}|}$ selects the antennas that can be seen by cluster $c$ , ${\textbf {R}}_{{\textrm {CA}},c} \in \mathbb {C}^{|\Phi _{{\textrm {CA}},c}|\times |\Phi _{{\textrm {CA}},c}|}$ is the covariance matrix across antennas within $\Phi _{{\textrm {CA}},c}$ , while ${\textbf {H}}_{w,c} \in \mathbb {C}^{|\Phi _{{\textrm {CA}},c}|\times S_{c}}$ models the small-scale fading. This model has been applied in [46], [47], and [48].
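As a quick consistency check, added here only as an illustration, the dimensions of the factors in (75) and (76) chain together as follows: $\begin{equation*} \underbrace {\left [{ {\textbf {G}}_{1},\ldots,{\textbf {G}}_{C} }\right]}_{N\times S}\, \underbrace {{\textbf {R}}_{\textrm {S}}^{\frac {1}{2}}}_{S\times S}\, \underbrace {{\textbf {D}}_{{\textrm {UC}},k}}_{S\times {\tilde S}_{k}}\, \underbrace {{\textbf {h}}_{w,k}}_{{\tilde S}_{k}\times 1} \in \mathbb {C}^{N\times 1}, \qquad \underbrace {{\textbf {D}}_{{\textrm {CA}},c}}_{N\times |\Phi _{{\textrm {CA}},c}|}\, \underbrace {{\textbf {R}}_{{\textrm {CA}},c}^{\frac {1}{2}}}_{|\Phi _{{\textrm {CA}},c}|\times |\Phi _{{\textrm {CA}},c}|}\, \underbrace {{\textbf {H}}_{w,c}}_{|\Phi _{{\textrm {CA}},c}|\times S_{c}} \in \mathbb {C}^{N\times S_{c}}. \end{equation*}$ For instance, under the hypothetical values $C=3$ , $(S_{1},S_{2},S_{3})=(2,3,1)$ , and $\Phi _{{\textrm {UC}},k}=\{1,3\}$ , we have $S=6$ and ${\tilde S}_{k}=S_{1}+S_{3}=3$ , so that ${\textbf {D}}_{{\textrm {UC}},k}$ is a $6\times 3$ zero-one matrix that keeps the columns of ${\textbf {R}}_{\textrm {S}}^{\frac {1}{2}}$ associated with scatterers 1, 2, and 6.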

Channel covariance matrix-based channel models pave the way for the analysis of key performance indicators, such as the signal-to-interference-plus-noise ratio (SINR) [35] and the ergodic capacity [46], which in turn helps the design of transceivers.

2) Steering Vectors With VR:

The discrete physical model is another widely used multiantenna channel model [54], [55]. It focuses on the distinguished paths in the environment. The discrete physical channel model is expressed as follows: $\begin{equation*} {\textbf {h}}_{k} = \sum _{c\in \Phi _{{\textrm {UC}},k}} \sum _{s=1}^{S_{c}} \beta _{c,s} {\textbf {a}}_{c,s}\tag{77}\end{equation*}$ where $\beta _{c,s}$ is the complex coefficient of the path resulting from scatterer $s$ in cluster $c$ , which also represents the response of this path on the reference antenna, and ${\textbf {a}}_{c,s} \in \mathbb {C}^{N\times 1}$ is the steering vector of the path that involves the difference of response on each antenna w.r.t. the reference antenna. In traditional multiantenna systems, the plane wave channel model 3 in (60) is utilized to construct the steering vector ${\textbf {a}}_{c,s}$ , satisfying $[{\textbf {a}}_{c,s}]_{n} = e^{j\Delta \phi _{n}}$ , where $\Delta \phi _{n}$ is expressed in (61). Each element of ${\textbf {a}}_{c,s}$ has amplitude equal to 1.

In extra large-aperture array systems, when introducing the concept of VR, the limited dimensional channel model becomes [39], [49] $\begin{equation*} {\textbf {h}}_{k} = \sum _{c\in \Phi _{{\textrm {UC}},k}} \sum _{s=1}^{S_{c}} \beta _{c,s} {\textbf {a}}_{c,s} \odot {\textbf {p}}_{c}\tag{78}\end{equation*}$ where ${\textbf {p}}_{c} \in \{0,1\}^{N \times 1}$ is the VR mask vector of cluster $c$ with the following entries: $\begin{align*} \left [{{\textbf {p}}_{c}}\right]_{n} = \begin{cases} 1, & \text {if } n \in \Phi _{{\textrm {CA}},c} \\ 0, & \text {else.} \end{cases}\tag{79}\end{align*}$ The steering vector with VR mask, i.e., ${\textbf {a}}_{c,s} \odot {\textbf {p}}_{c}$ , can be regarded as the effective steering vector. Notably, when an extra large-aperture array is deployed, ${\textbf {a}}_{c,s}$ takes the form of the spherical wave channel models 1 and 2 in (54) and (55). In fact, when applying channel model 1, the entries in the steering vector ${\textbf {a}}_{c,s}$ have different amplitudes, directly reflecting the VR caused by unequal path loss.

Depending on whether blockage happens or not, the VR mask ${\textbf {p}}_{c}$ should be set in two different ways. Take Fig. 8 as an example. The amplitude of ${\textbf {a}}_{c,s}$ varies significantly across the array as shown by the blue circles. If an obstacle exists, part of the array is blocked; then, ${\textbf {p}}_{c}$ covers the red windows where the blockage does not take effect. However, if there is no obstacle, then ${\textbf {p}}_{c}$ selects the green window that captures the majority of power with the minimum window size. Notably, in the case of no blockage, ${\textbf {p}}_{c}$ can be an all-one vector, which yields a precise extra large-aperture array channel model. In contrast, introducing a zero-one mask vector results in an approximated channel model with a reduced dimension, which further helps to reduce the complexity of transceiver design.
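The no-blockage rule above (the smallest window that captures the majority of power) can be sketched with a two-pointer search. This is only an illustrative sketch: the function name `min_power_window`, the 90% power threshold, and the amplitude profile are assumptions made here, and the text does not prescribe a particular threshold or search method.

```python
def min_power_window(amplitudes, fraction=0.9):
    """Smallest contiguous window [start, stop) holding >= fraction of the total power."""
    power = [abs(a) ** 2 for a in amplitudes]
    target = fraction * sum(power)
    best = (0, len(power))
    start, acc = 0, 0.0
    for stop, p in enumerate(power, start=1):
        acc += p
        # Shrink from the left while the window still meets the target.
        while start < stop - 1 and acc - power[start] >= target:
            acc -= power[start]
            start += 1
        if acc >= target and stop - start < best[1] - best[0]:
            best = (start, stop)
    return best


# Hypothetical amplitude profile of a steering vector across 8 antennas.
amplitudes = [0.1, 0.2, 1.0, 1.2, 0.9, 0.2, 0.1, 0.05]
start, stop = min_power_window(amplitudes, fraction=0.9)

# Zero-one VR mask as in (79) and the effective (masked) steering amplitudes.
p_c = [1 if start <= n < stop else 0 for n in range(len(amplitudes))]
a_eff = [a * p for a, p in zip(amplitudes, p_c)]
```

With this made-up profile, the window covers antennas 2 to 4 (0-based), so the mask reduces the effective dimension from 8 to 3 while keeping more than 90% of the power.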

Fig. 8. Example of different VR masks ${\textbf {p}}_{c}$ with and without blockage.

D. Spatial Nonstationarity

The spherical wave propagation as well as the VR caused by blockages contribute to spatial nonstationarity, which is the new channel property that appears in extra large-aperture array systems. The concept of spatial stationarity of a multiantenna channel is derived from the wide sense stationarity of a stochastic process [56], where the stochastic process becomes the multiantenna channel $\bf h$ here. Note that the spatial stationarity of a multiantenna channel is different from the stationarity of a time-varying channel [57]. The multiantenna channel $\bf h$ is spatially stationary if the correlation of any two distinct elements of $\bf h$ depends only on the difference between the indices of the two elements. That is to say $\begin{equation*} \mathbb {E}\left \{{ \left [{{\textbf {h}}}\right]_{l+m}^{\ast} \left [{{\textbf {h}}}\right]_{l+n}}\right \} = \mathbb {E}\left \{{ \left [{{\textbf {h}}}\right]_{m}^{\ast} \left [{{\textbf {h}}}\right]_{n}}\right \}\tag{80}\end{equation*}$ holds for arbitrary $l\in [{0,N-1}]$ . Otherwise, the multiantenna channel $\bf h$ is spatially nonstationary.

If the VR of the user w.r.t. the array does not cover the entire array, then the channel is definitely spatially nonstationary. This is because $\begin{align*} \mathbb {E}\left \{{ \left [{{\textbf {h}}}\right]_{n} }\right \} \begin{cases} >0, & \text {if } n \in \Phi _{{\textrm {UA}}} \\ =0, & \text {else} \end{cases}\tag{81}\end{align*}$ holds regardless of which channel model from (73), (76), and (78) is applied. Therefore, we have $\begin{align*} \mathbb {E}\left \{{ \left [{{\textbf {h}}}\right]_{m}^{\ast} \left [{{\textbf {h}}}\right]_{n}}\right \} \begin{cases} >0, & \text {if } m,n \in \Phi _{{\textrm {UA}}} \\ =0, & \text {else.} \end{cases}\tag{82}\end{align*}$ The requirement for spatial stationarity cannot be satisfied when $\Phi _{{\textrm {UA}}}\subsetneq [1,N]$ .
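Condition (80) says that the correlation matrix with entries $\mathbb {E}\{ [{\textbf {h}}]_{m}^{\ast} [{\textbf {h}}]_{n}\}$ is constant along each diagonal (Toeplitz), and the VR argument above can be checked numerically on a toy example. The sketch below is illustrative only: the function names are hypothetical, and the exponential correlation model $\rho ^{|m-n|}$ is an assumption used to obtain a stationary reference matrix.

```python
def is_spatially_stationary(R, tol=1e-9):
    """True if R[m][k] depends only on k - m, i.e., condition (80) holds."""
    n = len(R)
    return all(
        abs(R[m + 1][k + 1] - R[m][k]) <= tol
        for m in range(n - 1)
        for k in range(n - 1)
    )


def masked_covariance(R, visible):
    """Zero every correlation that involves an antenna outside the VR."""
    vis = set(visible)
    n = len(R)
    return [
        [R[m][k] if m in vis and k in vis else 0.0 for k in range(n)]
        for m in range(n)
    ]


N, rho = 6, 0.7
# Assumed stationary reference: exponential correlation rho^{|m-k|}.
R_full = [[rho ** abs(m - k) for k in range(N)] for m in range(N)]
# Hypothetical VR covering antennas 1..3 only (0-based).
R_vr = masked_covariance(R_full, visible=range(1, 4))

print(is_spatially_stationary(R_full))  # stationary reference
print(is_spatially_stationary(R_vr))    # VR smaller than the array
```

In this toy case the full-array matrix passes the check, whereas the masked one fails because its main diagonal is no longer constant, in line with the argument behind (82).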

If the VR of the user w.r.t. the array covers the entire array, but the user is in the near field or Fresnel region of the array, then the multiantenna channel $\bf h$ is still spatially nonstationary [58]. Under this condition, the channel models (54) and (55) should be utilized. More specifically, when applying (54), $\mathbb {E}\{ [{\textbf {h}}]_{n} \}$ has unequal amplitudes for different $n$ due to the unequal path loss. Furthermore, the phase of $[{\textbf {h}}]_{n}$ is dependent on the index $n$ whenever (54) or (55) is applied, and the phase difference between two adjacent channel entries is no longer constant across the array. Thus, $\mathbb {E}\{ [{\textbf {h}}]_{l+m}^{\ast} [{\textbf {h}}]_{l+n}\}$ is dependent on the particular $l$ , $m$ , and $n$ , instead of only on $m-n$ . Spatial nonstationarity of a near-field channel has been mathematically verified in [59] and experimentally observed through measurements in [60] and [61].

SECTION IV.

Low-Cost Extra Large-Aperture Array Architectures

The new channel properties brought by an extra large-aperture array will inform the hardware and transceiver design. The multiantenna arrays used in traditional systems do not have a large size, and a fully digital architecture is widely employed to connect each active antenna with a unique RF chain. However, with the increase of the antenna array size, the fully digital architecture with high resolution will be expensive and not suitable for practical applications. Low-cost architecture designs are of great importance for the commercial deployment of extra large-aperture arrays. Moreover, for an active antenna array, each antenna is driven by a power amplifier (PA) or a low noise amplifier (LNA) and has the ability to transmit and receive wireless signals. Consequently, the power consumption of an active antenna array is usually large as well. Fortunately, the new channel features provide room for cost reduction. By jointly considering the hardware and power cost as well as the new channel properties, in this section, we will introduce the potential low-cost extra large-aperture array architectures.

A. Active Arrays With Less RF Chains

Research on this type of architecture originated at the beginning of the 5G era [62], [63], [64], [65], [66], [67], [68], [69], [70], [71]. A large array with massive active antennas is controlled by a small number of RF chains. The numbers of active antennas and RF chains are denoted as $N$ and $N_{\textrm {RF}}$ , respectively, satisfying $N \gg N_{\textrm {RF}}$ . One RF chain is connected to one or multiple antennas and controls them through RF devices, such as phase shifters (PSs) and/or switches. After analog processing, such as analog beamforming, combining, and selection in the RF module, baseband (BB) processing is further applied to the signals on these RF chains. Therefore, a hybrid RF and BB structure is modeled as follows: $\begin{equation*} {\textbf {r}} = {\textbf {F}}_{\textrm {BB}}{\textbf {F}}_{\textrm {RF}} {\textbf {y}}\tag{83}\end{equation*}$ where ${\textbf {y}}\in \mathbb {C}^{N \times 1}$ is the signal received at the antennas, ${\textbf {F}}_{\textrm {RF}}\in \mathbb {C}^{N_{\textrm {RF}} \times N}$ is the RF processing matrix, ${\textbf {F}}_{\textrm {BB}}\in \mathbb {C}^{K \times N_{\textrm {RF}}}$ is the BB processing matrix, $K$ is the number of data streams, and ${\textbf {r}}\in \mathbb {C}^{K \times 1}$ is used for signal detection. The format of ${\textbf {F}}_{\textrm {RF}}$ is determined by the type of connections among the RF chains and antennas as well as the type of RF devices on each connection.

1) Connection Type:

The connection type directly determines the hardware cost, transceiver design, transmission performance, as well as the scalability of the architecture. Generally, there are two main types. One is the single-RF chain single-antenna type, and the other is the single-RF chain multiple-antenna type [69], [70].

Single-RF Chain Single Antenna: When this connection type is adopted, a single RF chain can only be connected to a single antenna. A switch is required at each RF chain to enable antenna selection, that is, to determine whether this RF chain is activated and which antenna it is connected with. If the RF chain is activated, then only one antenna will be connected with it. A total of $N_{\textrm {RF}}$ switches are deployed. No PSs are needed because beamforming is solely implemented at the BB module.
Antenna selection can be further categorized into two types, namely, full array selection and partial array selection. Full array selection enables an RF chain to connect with any antenna in the array. Partial array selection means that each RF chain can select from the subarray that is physically closest to it. For a certain RF chain, the partial array for antenna selection is usually fixed, and the size of the partial array is determined by the sweeping space of the switch at the RF chain side. Partial arrays corresponding to different RF chains can be disjoint or overlapped. If two RF chains select antennas from the same partial array, then they must select different antennas.
Single-RF Chain Multiple Antennas: When applying this connection type, a single RF chain can be connected with multiple antennas. Signal combination or beamforming is achieved at the RF module, and then the array gain can be harvested. Most studies focus on this connection type.

Similar to antenna selection, one RF chain can be connected with the full array or a partial array close to it, corresponding to the full array connection structure and the partial array connection structure, respectively. In the full array connection structure, each antenna can be connected with all the RF chains and vice versa. A unique physical link is established between each RF chain and each antenna. In each link, a PS can be deployed at the antenna side to enable analog beamforming, or an ON/OFF switch can be deployed at the antenna side to reduce the cost and achieve a simple signal combination. A total of $N_{\textrm {RF}}N$ PSs or ON/OFF switches are required in the full array connection structure.

In the partial array connection structure, one RF chain can be connected with a proportion of antennas, but one antenna can be connected with only one RF chain. For a certain RF chain, the partial array that it can be connected with is either fixed or dynamic. In the former case, a physical link exists between the RF chain and each antenna in the partial array. In each link, a PS or an ON/OFF switch can be deployed at the antenna side as well. The size of each partial array or subarray is fixed, and a total of $N$ PSs or ON/OFF switches are required in the fixed subarray structure. In the latter case, apart from these PSs or ON/OFF switches, an extra switch is employed at each antenna to determine which RF chain it will be connected with. Notably, unlike the ON/OFF switch, this switch is used for RF chain selection and its sweeping space covers all the RF chains. No further switch is needed at the RF chain side for antenna selection. The size of each subarray can be adjusted in real time. This structure is more suitable for extra large-aperture array systems under spatial nonstationarity.

2) Component Type:

Now, we turn our attention to the three components mentioned above, namely, the PS, the ON/OFF switch, and the switch for selection.

PS: A PS can adjust the phase of an RF signal. It is a key enabler of analog beamforming in multiantenna systems. When PSs are deployed, the RF matrix ${\textbf {F}}_{\textrm {RF}}$ is called the analog beamforming matrix, contributing to the hybrid beamforming structure together with the BB precoding. However, the cost of a PS grows with its operating frequency, as well as with its resolution.
ON/OFF Switch: An ON/OFF switch can be turned ON or OFF to determine whether the signal can pass through the connection. When a switch is in the physical link between one RF chain and one antenna, this connection can be activated or inactivated by choosing the ON and OFF status, respectively. The cost of an ON/OFF switch is significantly lower than that of a PS, but the insertion loss is a major problem.
Switch for selection: A switch for selection has a sweeping space and can be connected to one of the physical links in this sweeping space. It can be deployed at the RF chain side to achieve antenna selection, or be deployed at the antenna side to make RF chain selection. A switch for selection is more expensive than an ON/OFF switch.

3) State-of-the-Art Architectures:

The various connection types and device types can jointly form many different combinations, each corresponding to a particular architecture. Here, we introduce the architectures that have appeared in existing studies, which are listed in Table II, and analyze their signal models, advantages, and drawbacks.

TABLE II Architectures of Active Arrays With Less RF Chains That Have Appeared in Existing Studies

a) Single-RF chain single antenna in full array selection:

This is the traditional antenna selection architecture as shown in Fig. 9 (i). In this architecture, we have $[{\textbf {F}}_{\textrm {RF}}]_{i,j}\in \{0,1\}$ for $i=1,\ldots,N_{\textrm {RF}}, j=1,\ldots,N$ and $\begin{align*}&0\le \sum _{j=1}^{N}\left [{{\textbf {F}}_{\textrm {RF}}}\right]_{i,j}\le 1,\quad \forall i \\&0\le \sum _{i=1}^{N_{\textrm {RF}}}\left [{{\textbf {F}}_{\textrm {RF}}}\right]_{i,j}\le 1,\quad \forall j.\tag{84}\end{align*}$ Because the sweeping space of a switch is confined, this architecture is widely adopted in traditional multiantenna systems, where the array size is limited. However, when employing an extra large-aperture array, it may be impractical to find a switch that could be connected to all antennas in a massive array.
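The constraints in (84) can be read as follows: each RF chain picks at most one antenna, and each antenna is picked by at most one RF chain. A minimal sketch of building and validating such a selection matrix is given below; the helper names and the example sizes are hypothetical.

```python
def selection_from_indices(n_rf, n_antennas, chosen):
    """chosen[i] is the antenna index of RF chain i, or None if chain i is idle."""
    F = [[0] * n_antennas for _ in range(n_rf)]
    for i, j in enumerate(chosen):
        if j is not None:
            F[i][j] = 1
    return F


def is_valid_selection(F_rf):
    """Check (84): zero-one entries, every row sum and column sum at most 1."""
    if any(x not in (0, 1) for row in F_rf for x in row):
        return False
    rows_ok = all(sum(row) <= 1 for row in F_rf)
    cols_ok = all(sum(col) <= 1 for col in zip(*F_rf))
    return rows_ok and cols_ok


# Hypothetical setup: 3 RF chains, 8 antennas.
F_ok = selection_from_indices(3, 8, [5, None, 2])   # chain 1 is not activated
F_bad = selection_from_indices(3, 8, [5, 5, 2])     # two chains pick antenna 5

print(is_valid_selection(F_ok))
print(is_valid_selection(F_bad))
```

The second matrix violates the column constraint in (84), since antenna 5 would be connected to two RF chains at once.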

Fig. 9.

Architectures of active arrays with less RF chains that have appeared in existing studies (i)–(vii) versus the proposed double layer architecture (viii).


b) Single-RF chain single antenna in partial array selection:

This architecture is more easily implemented in an extra large-aperture array system. Considering the scalability issue as well, a subarray-based antenna selection architecture is naturally considered. As shown in Fig. 9 (ii), the entire array is composed of multiple subarrays. All subarrays share exactly the same topology, including the number of antennas and the number of RF chains. Denote the number of subarrays as $B$ . Then, each subarray has $({N}/{B})$ antennas and $({N_{\textrm {RF}}}/{B})$ RF chains. One RF chain can select no more than one antenna within the same subarray, and one antenna can be selected by no more than one RF chain within the same subarray. Thus, ${\textbf {F}}_{\textrm {RF}}$ has a block diagonal structure $\begin{align*} {\textbf {F}}_{\textrm {RF}} = \begin{bmatrix} {\textbf {F}}_{{\textrm {RF}},1} & & \\ & \ddots & \\ & & {\textbf {F}}_{{\textrm {RF}},B} \\ \end{bmatrix}\tag{85}\end{align*}$ where ${\textbf {F}}_{{\textrm {RF}},b}\in \{0,1\}^{({N_{\textrm {RF}}}/{B}) \times ({N}/{B})}$ is the RF submatrix of the $b$ th subarray, satisfying $\begin{align*} 0\le&\sum _{j=1}^{\frac {N}{B}}\left [{{\textbf {F}}_{{\textrm {RF}},b}}\right]_{i,j}\le 1,\quad \forall i \\ 0\le&\sum _{i=1}^{\frac {N_{\textrm {RF}}}{B}}\left [{{\textbf {F}}_{{\textrm {RF}},b}}\right]_{i,j}\le 1,\quad \forall j.\tag{86}\end{align*}$ When applying this architecture, multiple identical subarrays can be directly combined together to construct an extra large-aperture array. This scalability facilitates the design, fabrication, and production of the array. Moreover, local antenna selection within a single subarray is supported, thereby giving room for complexity reduction. However, the selected antennas are usually discontinuous and cannot cover a continuous VR. Thus, the array gain will be compromised.
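The block diagonal structure in (85) can be sketched by stacking per-subarray selection blocks. The example below is illustrative only: the helper `block_diag`, the sizes ( $B=2$ , $N=8$ , $N_{\textrm {RF}}=4$ ), and the particular selections are assumptions.

```python
def block_diag(blocks):
    """Place the given matrices along the diagonal of a larger zero matrix, as in (85)."""
    rows = sum(len(b) for b in blocks)
    cols = sum(len(b[0]) for b in blocks)
    out = [[0] * cols for _ in range(rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                out[r0 + i][c0 + j] = x
        r0 += len(b)
        c0 += len(b[0])
    return out


# Hypothetical example: B = 2 subarrays, each with N/B = 4 antennas
# and N_RF/B = 2 RF chains; every block satisfies (86).
F_rf_1 = [[0, 1, 0, 0],
          [0, 0, 0, 1]]
F_rf_2 = [[1, 0, 0, 0],
          [0, 0, 0, 0]]          # second RF chain of subarray 2 is idle
F_rf = block_diag([F_rf_1, F_rf_2])

# Global antenna index selected by each RF chain (None if idle).
selected = [row.index(1) if 1 in row else None for row in F_rf]
```

Because every block only touches its own columns, an RF chain can never select an antenna outside its own subarray, which is what makes the selection local.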

c) Single-RF chain multiple antennas in full array connection with PSs:

This is the widely studied full-connection hybrid beamforming architecture in 5G millimeter wave systems [65], [66], [67], [68], [69], [70] and has been considered in the extra large-aperture array system as in [29]. As shown in Fig. 9 (iii), each RF chain is connected with all antennas through PSs. The RF matrix ${\textbf {F}}_{\textrm {RF}}$ has the following format: $\begin{align*} {\textbf {F}}_{\textrm {RF}} = \begin{bmatrix} {\textbf {f}}_{{\textrm {RF}},1}^{T} \\ \vdots \\ {\textbf {f}}_{{\textrm {RF}},N_{\textrm {RF}}}^{T} \\ \end{bmatrix}\tag{87}\end{align*}$ where ${\textbf {f}}_{{\textrm {RF}},i}\in \mathbb {C}^{N \times 1}$ is the analog beamforming vector in the $i$ th RF chain with $[{\textbf {f}}_{{\textrm {RF}},i}]_{j} =e^{j\phi _{i,j}}$ , and $\phi _{i,j}\in [0,2\pi]$ is the phase shift introduced by the PS in the physical link between RF chain $i$ and antenna $j$ . In sparse channel conditions, the performance of this architecture can be very close to that of the fully digital architecture. However, this architecture has the following drawbacks. First of all, both the cost and the energy consumption of $N_{\textrm {RF}}N$ PSs are high. Second, with the increase of the array size, the length of the transmission line that connects the antenna array edges grows, and the transmission latency differs significantly across the array. Then, the synchronization across antennas in an RF chain becomes problematic. Third, this architecture lacks scalability. If the array is expanded and more antennas are added to the array, then an equal amount of components need to be added to each RF chain as well, and hence, the structure of each RF chain will change. Finally, integrating such a large number of PSs in an RF module is difficult. Therefore, this architecture is not recommended for extra large-aperture array systems.

d) Single-RF chain multiple antennas in full array connection with ON/OFF switches:

This architecture is a reduced version of architecture iii by replacing the expensive PSs with low-cost ON/OFF switches as illustrated in Fig. 9 (iv). The RF matrix ${\textbf {F}}_{\textrm {RF}}$ sustains the format in (87). The difference is that $[{\textbf {f}}_{{\textrm {RF}},i}]_{j}\in \{0,1\}$ for $i=1,\ldots,N_{\textrm {RF}}$ and $j=1,\ldots,N$ . Note that this architecture is not subject to the antenna selection constraints in (84). It can simultaneously achieve antenna selection and dynamic partial array connection. However, it also entails integration, synchronization, and scalability challenges.

e) Single-RF chain multiple antennas in fixed partial array connection with PSs:

This is the well-known subarray hybrid beamforming architecture [63], [64], [68], [69], [70]. In Fig. 9 (v), each subarray has equal size with only one RF chain and $({N}/{N_{\textrm {RF}}})$ antennas. The RF matrix ${\textbf {F}}_{\textrm {RF}}$ also has a block diagonal structure $\begin{align*} {\textbf {F}}_{\textrm {RF}} = \begin{bmatrix} {\textbf {f}}_{{\textrm {RF}},1}^{T} & & \\ & \ddots & \\ & & {\textbf {f}}_{{\textrm {RF}},N_{\textrm {RF}}}^{T} \\ \end{bmatrix}\tag{88}\end{align*}$ where ${\textbf {f}}_{{\textrm {RF}},i}\in \mathbb {C}^{({N}/{N_{\textrm {RF}}})\times 1}$ is the RF vector in the $i$ th subarray with $[{\textbf {f}}_{{\textrm {RF}},i}]_{j} =e^{j\phi _{i,j}}$ for $i=1,\ldots,N_{\textrm {RF}}$ and $j=1,\ldots,({N}/{N_{\textrm {RF}}})$ . Given an equal number of RF chains, the performance of this architecture is inferior to that of architecture iii. However, the number of PSs in this architecture is much smaller, and the cost is thereby greatly reduced. Moreover, the synchronization problem ceases to exist, since antennas connected with the same RF chain are within a single subarray, whose size is usually limited. Besides, the array size can be easily scaled up by using more subarrays. For all these reasons, this architecture has been commercially deployed in 5G.
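To make the effect of the block diagonal matrix in (88) concrete, the sketch below applies it to a received vector ${\textbf {y}}$ , so that each RF chain combines only the antennas of its own subarray. The function name, the received samples, and the co-phasing rule $\phi _{n} = -\arg ([{\textbf {y}}]_{n})$ are assumptions made for illustration; the text does not prescribe how the phases are chosen.

```python
import cmath


def subarray_combine(y, phases, n_rf):
    """Compute F_RF y for the fixed-subarray structure in (88).

    RF chain i sums exp(j*phase) * y over its own N/N_RF consecutive antennas.
    """
    n = len(y)
    if n % n_rf:
        raise ValueError("N must be a multiple of N_RF")
    size = n // n_rf
    return [
        sum(cmath.exp(1j * phases[j]) * y[j] for j in range(i * size, (i + 1) * size))
        for i in range(n_rf)
    ]


# Hypothetical received samples on N = 8 antennas, N_RF = 2 RF chains.
y = [1 + 1j, -2j, 0.5, -1 + 0.5j, 2, 1j, -1, 0.3 - 0.4j]

# Illustrative phase choice: rotate every sample onto the positive real axis.
phases = [-cmath.phase(v) for v in y]
r_rf = subarray_combine(y, phases, n_rf=2)
```

With this co-phasing choice, each output equals the sum of the sample magnitudes within its subarray, which is the coherent combining gain that analog beamforming provides per RF chain.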

f) Single-RF chain multiple antennas in fixed partial array connection with ON/OFF switches:

This architecture is derived from architecture v by replacing PSs with ON/OFF switches as shown in Fig. 9 (vi). The RF matrix ${\textbf {F}}_{\textrm {RF}}$ still follows the format (88). The difference is that $[{\textbf {f}}_{{\textrm {RF}},i}]_{j}\in \{0,1\}$ for $i=1,\ldots,N_{\textrm {RF}}$ and $j=1,\ldots,({N}/{N_{\textrm {RF}}})$ . The constraints in (86) do not need to be considered. Notably, when the array size is large, the channel power will be concentrated on the VR of the user w.r.t. the extra large-aperture array. Note that the VR caused by an unequal path loss usually covers a continuous part of the array. To increase the energy efficiency, only the continuous part of the array in the VR needs to be turned ON. That is, the effective size of each subarray is dynamic as well. This architecture also has the advantages of easy synchronization and scalability as well as the lowest hardware cost (only $N$ ON/OFF switches), which makes it suitable for extra large-aperture array systems [44].

g) Single-RF chain multiple antennas in dynamic partial array connection with PSs:

The concept of a dynamic partial array or dynamic subarray appeared in [71]. It is an improved version of architecture v. As the name suggests, the segmentation of the subarrays can be flexibly adjusted instead of being fixed. That is, even though ${\textbf {F}}_{\textrm {RF}}$ follows the format in (88), the size of ${\textbf {f}}_{{\textrm {RF}},i}$ varies with $i$ . Each antenna can be flexibly connected to an arbitrary RF chain or be deactivated. This architecture not only harvests the array gain but also adjusts the effective subarray sizes based on the real-time channel condition. Equally importantly, when a VR exists, dynamic subarrays are verified to achieve better performance than fixed subarrays [6].

However, this architecture has the following drawbacks. First, it is hard to implement. There are two solutions, denoted as architectures vii-1 and vii-2, respectively. The first solution is to deploy a switch for selection at each antenna to select one of the $N_{\textrm {RF}}$ RF chains. An extra combiner is actually needed at each RF chain to enable the connection with multiple antennas, as shown in Fig. 9 (vii-1). However, it is difficult to connect such a combiner with $N$ switches for selection at the antenna side. The second solution is to modify architecture iv by adding a PS before each antenna, as shown in Fig. 9 (vii-2). However, the integration of ${N}{N_{\textrm {RF}}}$ ON/OFF switches and $N$ PSs and the synchronization among them are challenging. Moreover, the lack of scalability further makes this architecture hard to deploy in practical extra large-aperture array systems.

4) Proposed Double-Layer Architecture:

Considering the advantage of dynamic subarrays as well as the practical implementation and scalability, in this article, we integrate the full-connection and subarray structures and propose a double-layer architecture, which is referred to as architecture viii. As shown in Fig. 9 (viii), the outer layer follows the fixed subarray structure, and the inner layer follows the dynamic subarray structure. The extra large-aperture array is composed of $B$ physical subarrays. Each physical subarray has the same hardware topology, including $({N_{\textrm {RF}}}/{B})$ RF chains and $({N}/{B})$ antennas. The values of $({N_{\textrm {RF}}}/{B})$ and $({N}/{B})$ are not large. For example, we can let $({N_{\textrm {RF}}}/{B})=4$ and $({N}/{B})=64$ .

Architecture vii is adopted in each physical subarray. For convenient implementation, a physical link is established between each RF chain and each antenna in the common physical subarray. An ON/OFF switch is deployed in each physical link. For a certain antenna, only one RF chain can be selected, and thus no more than one physical link connected with this antenna is finally turned ON. To enable analog beamforming, each antenna is further equipped with a PS. A total of $({N_{\textrm {RF}}N}/{B^{2}})$ ON/OFF switches and $({N}/{B})$ PSs are integrated in a physical subarray.

In the proposed double-layer architecture, the RF matrix ${\textbf {F}}_{\textrm {RF}}\in \mathbb {C}^{N_{\textrm {RF}} \times N}$ follows the block diagonal structure in (85). The submatrix ${\textbf {F}}_{{\textrm {RF}},b}\in \mathbb {C}^{({N_{\textrm {RF}}}/{B}) \times ({N}/{B})}$ has the following format: $\begin{equation*} {\textbf {F}}_{{\textrm {RF}},b} = {\textbf {S}}_{b} \odot \left ({{\textbf {1}}_{\frac {N_{\textrm {RF}}}{B}}\otimes {\textbf {f}}_{b}^{T} }\right)\tag{89}\end{equation*}$ where ${\textbf {S}}_{b}\in \{0,1\}^{({N_{\textrm {RF}}}/{B}) \times ({N}/{B})}$ is the ON/OFF matrix satisfying $\begin{equation*} 0\le \sum _{i=1}^{\frac {N_{\textrm {RF}}}{B}}\left [{{\textbf {S}}_{b}}\right]_{i,j}\le 1,\quad \forall j.\tag{90}\end{equation*}$ The column vector ${\textbf {1}}_{({N_{\textrm {RF}}}/{B})}$ has $({N_{\textrm {RF}}}/{B})$ ones, while ${\textbf {f}}_{b}\in \mathbb {C}^{({N}/{B})\times 1}$ is the phase shifting vector. If the $i$ th RF chain is activated, then it controls an effective subsubarray. The effective subsubarray has a dynamic size, which depends on the number of antennas whose physical links with the $i$ th RF chain are turned ON. Analog beamforming is also supported within the effective subsubarray, and thus the array gain can be harvested.
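The construction in (89) can be sketched for a single physical subarray as follows. This is a minimal illustration under stated assumptions: the function name `double_layer_block` is hypothetical, the entries of ${\textbf {f}}_{b}$ are taken to be unit-modulus terms $e^{j\phi }$ by analogy with (87), and the sizes and phases are made up.

```python
import cmath
import math


def double_layer_block(S_b, phases):
    """Build F_RF,b = S_b ⊙ (1 ⊗ f_b^T) as in (89).

    S_b    : zero-one ON/OFF matrix, one row per RF chain, one column per antenna
    phases : one PS phase per antenna (assumed unit-modulus entries of f_b)
    """
    # Constraint (90): every antenna is linked to at most one RF chain.
    for col in zip(*S_b):
        if sum(col) > 1:
            raise ValueError("an antenna is connected to more than one RF chain")
    f_b = [cmath.exp(1j * phi) for phi in phases]
    # Row i of 1 ⊗ f_b^T is f_b^T itself, so the Hadamard product masks f_b row by row.
    return [[s * f for s, f in zip(row, f_b)] for row in S_b]


# Hypothetical subarray: 2 RF chains, 6 antennas; antenna 3 is deactivated.
S_b = [[1, 1, 1, 0, 0, 0],
       [0, 0, 0, 0, 1, 1]]
phases = [0.0, math.pi / 2, math.pi, 0.0, 0.0, math.pi / 2]
F_b = double_layer_block(S_b, phases)

# Size of the effective subsubarray controlled by each RF chain.
sizes = [sum(row) for row in S_b]
```

Here the two RF chains control effective subsubarrays of sizes 3 and 2, and changing only `S_b` resizes them without touching the physical links, which is the dynamic inner layer described above.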

The proposed double-layer architecture retains the advantages of easy synchronization and scalability of the subarray structure. Equally importantly, the hardware cost is greatly reduced compared with architecture vii. The insertion loss is substantially mitigated by using far fewer switches. This architecture can also harvest the full array gain by activating all antennas in a subarray simultaneously. Alternatively, in spatially nonstationary channel conditions, we can activate only the antennas on which the largest proportion of the channel power is concentrated. For the above-mentioned reasons, this is a potential architecture for extra large-aperture arrays.

Table III summarizes the hardware cost, advantages, and disadvantages of the nine architectures, including the two solutions of architecture vii and the proposed architecture viii. Considering the scalability, architectures ii, v, vi, and viii are promising in the deployment of an extra large-aperture active array with less RF chains.
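As a rough numerical companion to the hardware cost discussion, the sketch below tabulates the component counts that are stated in the prose of this section (it is not a reproduction of Table III, and combiners as well as RF chains are left out). The function name, the dictionary keys, and the example sizes $N=512$ , $N_{\textrm {RF}}=32$ , $B=8$ are assumptions; the sizes are chosen so that $({N_{\textrm {RF}}}/{B})=4$ and $({N}/{B})=64$ as in the example above.

```python
def component_counts(n, n_rf, b):
    """Counts of PSs, ON/OFF switches, and switches for selection per architecture.

    Read from the prose of this section; 'sel' means switch for selection.
    """
    if n % b or n_rf % b:
        raise ValueError("B must divide both N and N_RF")
    return {
        "i":     {"ps": 0,        "on_off": 0,             "sel": n_rf},
        "ii":    {"ps": 0,        "on_off": 0,             "sel": n_rf},
        "iii":   {"ps": n_rf * n, "on_off": 0,             "sel": 0},
        "iv":    {"ps": 0,        "on_off": n_rf * n,      "sel": 0},
        "v":     {"ps": n,        "on_off": 0,             "sel": 0},
        "vi":    {"ps": 0,        "on_off": n,             "sel": 0},
        "vii-1": {"ps": n,        "on_off": 0,             "sel": n},
        "vii-2": {"ps": n,        "on_off": n_rf * n,      "sel": 0},
        # B subarrays, each with (N_RF/B)(N/B) ON/OFF switches and N/B PSs.
        "viii":  {"ps": n,        "on_off": n_rf * n // b, "sel": 0},
    }


counts = component_counts(n=512, n_rf=32, b=8)
for name, c in counts.items():
    print(f"{name:>6}: PS={c['ps']:>6}  ON/OFF={c['on_off']:>6}  selection={c['sel']:>4}")
```

For these assumed sizes, architecture viii needs 2048 ON/OFF switches in total (256 per physical subarray), compared with 16384 for architecture vii-2, while both use 512 PSs.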

TABLE III Comparison of Architectures of Active Arrays With Less RF Chains

B. Reconfigurable Intelligent Surfaces

Another low-cost extra large-aperture array is the RIS [73], [74], [75], [76], [77], which is also known as a metasurface [75], [78], [79], or an intelligent reflecting surface (IRS) [80]. An RIS is composed of low-cost nearly passive unit cells, each with independently tunable EM responses controlled by external signals. An incident EM wave can be reflected or refracted by the RIS, or the reflection and the refraction can happen simultaneously [81], [82], [83]. An RIS flexibly adjusts the amplitude, phase, or polarization of the incident EM wave in real time. Then, a preferable EM propagation environment can be customized by properly controlling the RIS.

The widely studied category of RISs reflects the EM waves toward the desired directions by adjusting their phases. An RIS works as a controllable reflector in the wireless environment, providing an additional controllable link between the transmitter and the receiver to assist the wireless communication. Suppose that the transmitter and the receiver are each equipped with a single antenna. The number of unit cells in the RIS is $N$ . Then, the signal at the receiver can be modeled as follows: $\begin{equation*} r = gs + {\textbf {h}}_{2}^{T} {\boldsymbol{\Lambda }} {\textbf {h}}_{1} s + z\tag{91}\end{equation*}$ where $s$ is the transmitted signal, $g\in \mathbb {C}$ is the direct channel between the transmitter and receiver, ${\textbf {h}}_{1},{\textbf {h}}_{2} \in \mathbb {C}^{N\times 1}$ are the channel between the transmitter and RIS and the channel between RIS and the receiver, respectively, while $\begin{equation*} {\boldsymbol{\Lambda }} = {\textrm {diag}}\left \{{ {\textbf {v}} }\right \},\quad {\textbf {v}} = \left [{e^{j\phi _{1}},\ldots,e^{j\phi _{N}}}\right]^{T}\tag{92}\end{equation*}$ contains the phase shifts introduced by the RIS, $\phi _{n}$ is the phase shift on the $n$ th unit cell, and $z$ is the complex Gaussian noise. Apart from the direct link $g$ , an RIS link ${\textbf {h}}_{2}^{T} {\boldsymbol{\Lambda }} {\textbf {h}}_{1}$ is added. If the direct link is blocked by obstacles, then the RIS can reconstruct the wireless link and recover the communication service. The effective channel in an RIS-assisted wireless communication system is $\begin{equation*} g_{\textrm {eff}} = g + {\textbf {h}}_{2}^{T} {\mathbf {\boldsymbol \Lambda }} {\textbf {h}}_{1}.\tag{93}\end{equation*}$

1) Fully Passive RIS:

Most existing RISs that work in the reflection mode are fully passive, despite the low external control voltage. No signal processing module exists at the RIS, and, thus, the RIS is not able to transmit or receive wireless signals. Since the individual channels ${\textbf {h}}_{1}$ and ${\textbf {h}}_{2}$ are cascaded together, channel estimation can only be applied at the receiver side. Under this condition, it is convenient to directly estimate the effective channel $g_{\textrm {eff}}$ . Alternatively, by rewriting $\begin{equation*} {\textbf {h}}_{2}^{T} {\mathbf {\boldsymbol \Lambda }} {\textbf {h}}_{1} = {\textbf {h}}_{2}^{T} {\textrm {diag}}\left \{{ {\textbf {v}} }\right \} {\textbf {h}}_{1} = {\textbf {h}}_{2}^{T} {\textrm {diag}}\left \{{ {\textbf {h}}_{1} }\right \} {\textbf {v}}\end{equation*}$ it is feasible to estimate the cascaded channel ${\textbf {h}}_{2}^{T} {\textrm {diag}}\{ {\textbf {h}}_{1} \}$ . The estimate of ${\textbf {h}}_{2}^{T} {\textrm {diag}}\{ {\textbf {h}}_{1} \}$ can further guide the design of ${\textbf {v}}$ . However, the training overhead required to estimate ${\textbf {h}}_{2}^{T} {\textrm {diag}}\{ {\textbf {h}}_{1} \} \in \mathbb {C}^{1\times N}$ at the single-antenna receiver is large. Therefore, the fully passive RIS faces intrinsic difficulties in channel estimation.
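The rewriting above, which moves the RIS phase vector ${\textbf {v}}$ to the right of the cascaded channel, can be verified numerically on a toy example. The sketch below also evaluates the effective channel in (93). The function names and all channel values and phases are hypothetical.

```python
import cmath


def ris_link(h1, h2, phases):
    """h2^T diag(v) h1 with [v]_n = exp(j*phi_n), as in (91) and (92)."""
    return sum(b * cmath.exp(1j * p) * a for a, b, p in zip(h1, h2, phases))


def cascaded_channel(h1, h2):
    """Row vector h2^T diag(h1): entry n is [h2]_n [h1]_n."""
    return [b * a for a, b in zip(h1, h2)]


# Hypothetical channels for an RIS with N = 3 unit cells.
h1 = [0.5 + 0.2j, -0.3j, 0.1 - 0.4j]      # transmitter -> RIS
h2 = [0.2 - 0.1j, 0.6, -0.3 + 0.3j]       # RIS -> receiver
g = 0.05 + 0.02j                          # direct link
phases = [0.3, 1.2, -0.7]                 # RIS phase shifts

lhs = ris_link(h1, h2, phases)
v = [cmath.exp(1j * p) for p in phases]
rhs = sum(c * vn for c, vn in zip(cascaded_channel(h1, h2), v))

g_eff = g + lhs                           # effective channel in (93)
print(abs(lhs - rhs) < 1e-12)
```

Since `cascaded_channel` does not depend on the phases, the receiver only needs this $1\times N$ vector (together with $g$ ) to evaluate $g_{\textrm {eff}}$ for any candidate ${\textbf {v}}$ , which is why estimating the cascaded channel is sufficient for designing ${\textbf {v}}$ .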

2) Semi-Passive RISs:

To tackle the channel estimation problem, semi-passive RISs were proposed in [84], [85], [86], and [87]. As shown in Fig. 10, a semi-passive RIS introduces a few active sensors that can receive signals to enable channel estimation at the RIS. These active sensors are connected with RF chains and have two modes. One is the reflection mode, same as a common RIS unit cell. The other is the reception mode, in which the incident signals are received and conveyed to the signal processing module through RF chains. Suppose $\bar N$ unit cells are active sensors, satisfying $1\le \bar N \le N$ . Under this condition, the two individual channels from the transmitter and from the receiver to RIS, denoted by $\bar {\textbf {h}}_{1}\in \mathbb {C}^{\bar N\times 1}$ , and $\bar {\textbf {h}}_{2}\in \mathbb {C}^{\bar N\times 1}$ , respectively, can be estimated at the RIS. By leveraging the sparsity of channels and the correlation among different unit cells, the large-dimensional channels ${\textbf {h}}_{1}$ and ${\textbf {h}}_{2}$ can be extrapolated from their reduced dimensional versions $\bar {\textbf {h}}_{1}$ and $\bar {\textbf {h}}_{2}$ when the channel experiences spatial stationarity. In practice, one of the transmitter and receiver can be the BS or access point. Considering that the locations of BS/AP and RIS are fixed, the channel between them (denoted as ${\textbf {h}}_{1}$ ), remains unchanged within a long time period. Therefore, ${\textbf {h}}_{1}$ does not need to be frequently estimated, saving a great amount of training overhead. However, the channel extrapolation method may not work well in frequency-division duplexing systems, where reciprocity does not hold between the uplink and downlink channels.

Fig. 10.

Semi-passive RIS. Unit cells in blue are passive and only have the reflection mode with phase shift capability. Unit cells in red are active and have the reflection and receiving modes.


The above low-cost architectures enable the deployment of extra large-aperture arrays. Active antenna arrays and RISs can be jointly applied to satisfy specific service requirements in different application scenarios.

SECTION V.

Low-Complexity Processing and Computation

Apart from the problem of high cost, the implementation of an extra large-aperture array also requires high-complexity processing and computations. In multiantenna systems, the computational complexity of the widely used linear signal processing algorithms usually has an order of $\mathcal {O}(N)$, where $N$ is the number of antennas. If a matrix multiplication or inversion is further involved, then the order of computational complexity grows. When $N$ grows large, the complexity of these algorithms that jointly process signals across all antennas will grow explosively. The high-complexity processing and computations usually result in unacceptably high latency. The centralized control over the entire extra large-aperture array requires an extremely powerful central processing unit (CPU) as shown in Fig. 11 (a). Therefore, in extra large-aperture array systems, low-complexity processing and computation design is also a key objective.

Fig. 11.

Extra large-scale array is controlled by (a) single CPU, (b) multiple LPUs, or (c) CPU and multiple LPUs.


A. Complexity Reduction at CPU

One method is to directly reduce the complexity of some high-complexity algorithms for their simplified or scalable implementation in the CPU. Complexity reduction in massive MIMO systems is not a novel concept [88], [89], [90], [91], [92]. Some of these methods can be extended to fit in extra large-aperture array systems.

There have been studies focusing on the complexity reduction in the CPU of extra large-aperture array systems [36], [40], [93], [94], [95]. Most of these studies focus on zero-forcing (ZF), which is a widely used linear signal processing method in multiuser multiantenna systems. The ZF precoder and combiner can be applied at the transmitter and the receiver, respectively, to cancel out the interuser interference. For example, let us denote the downlink channel between the extra large-aperture array at the BS and the single-antenna user $k$ as ${\textbf {h}}_{k} \in \mathbb {C}^{N\times 1}$. The channels of $K$ users are stacked together as ${\textbf {H}} = [{\textbf {h}}_{1},\ldots, {\textbf {h}}_{K}] \in \mathbb {C}^{N\times K}$. Then, the ZF precoder is calculated as follows: $\begin{equation*} {\textbf {W}}_{\textrm {ZF}} = {\textbf {H}} \left ({{\textbf {H}}^{H}{\textbf {H}}}\right)^{-1} \in \mathbb {C}^{N\times K}.\tag{94}\end{equation*}$ Since matrix multiplication and inversion are involved, the computational complexity of calculating ${\textbf {W}}_{\textrm {ZF}}$ reaches $\mathcal {O}(N K^{2})+\mathcal {O}(K^{3})$. To reduce the complexity, Ribeiro et al. [93] proposed a double-layer precoding method. The inner-layer precoder is applied to a group of users that share a similar elevation angle. The outer-layer precoder decreases the interference among different user groups. For each user group, channels on a column of antennas are summed up for the calculation of the inner and outer layer precoders. Suppose $\bar {\textbf {h}}_{k} \in \mathbb {C}^{N_{h} \times 1}$, whose $i$th entry represents the sum of channels on the $i$th column of antennas. Then, complexity reduction is achieved by utilizing the low-dimensional $\bar {\textbf {h}}_{k}$ instead of the extra large-dimensional ${\textbf {h}}_{k}$. Another example in [40] focused on accelerating the calculation of the ZF combiner at the receiver. The acceleration problem was formulated as the solution of a system of linear equations and solved by the randomized Kaczmarz (RK) algorithm.
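To make the two ingredients concrete, the sketch below computes the ZF precoder in (94) and then runs a generic randomized Kaczmarz iteration on the Gram system $({\textbf {H}}^{H}{\textbf {H}}){\textbf {s}} = {\textbf {H}}^{H}{\textbf {y}}$ to obtain the ZF detection result without an explicit inverse. The i.i.d. Gaussian channel, the sizes, the iteration count, and the helper name `randomized_kaczmarz` are assumptions for illustration; the exact formulation in [40] may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 256, 8  # antennas and users (illustrative values)

# Assumed i.i.d. complex Gaussian channel matrix H in C^{N x K}.
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

# ZF precoder (94): W = H (H^H H)^{-1}, computed with a linear solve.
G = H.conj().T @ H                                # Gram matrix, K x K, Hermitian
W_zf = np.linalg.solve(G, H.conj().T).conj().T    # equals H G^{-1} because G is Hermitian

def randomized_kaczmarz(A, b, iters, rng):
    """Generic RK solver for a consistent system A x = b (rows sampled by squared norm)."""
    row_norms = np.sum(np.abs(A) ** 2, axis=1)
    probs = row_norms / row_norms.sum()
    x = np.zeros(A.shape[1], dtype=complex)
    for i in rng.choice(A.shape[0], size=iters, p=probs):
        # Project the current iterate onto the hyperplane defined by row i.
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i].conj()
    return x

# Uplink example: detect s from y = H s + n via the ZF normal equations.
s = rng.choice([-1.0, 1.0], size=K) + 0j
y = H @ s + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
s_zf = np.linalg.solve(G, H.conj().T @ y)                    # direct ZF solution
s_rk = randomized_kaczmarz(G, H.conj().T @ y, 3000, rng)     # iterative RK solution
print(np.max(np.abs(s_rk - s_zf)))
```

Each RK iteration touches a single row of the $K\times K$ system, so its cost is $\mathcal {O}(K)$, and the number of iterations can be traded against accuracy.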

In addition to the ZF receiver, variational message passing (VMP) is another widely used multiuser MIMO detector, which has lower complexity than ZF because no matrix inversion is involved. In this context, Amiri et al. [36] applied VMP in the extra large-aperture array system under spatial nonstationary channel conditions, and further utilized a maximal ratio combiner (MRC) for initialization. The complexity of VMP and MRC is linear with $N$ and $K$ , which is much smaller than that of ZF.

Some other works focused on the complexity reduction of antenna selection [94] and user scheduling [95] in extra large-aperture array systems. Given the number of antennas $N$ and the number of RF chains $N_{\textrm {RF}}$ , the exhaustive searching-based antenna selection method requires a search over $\mathcal {O}(N^{N_{\textrm {RF}}})$ combinations of antennas and RF chains, which is unacceptably high in extra large-aperture array systems. To reduce the complexity, [94] proposed a suboptimal method, which initially sets a coarse antenna selection result and then iteratively refines it based on a closed-form analytical expression of the energy efficiency, effectively avoiding the exhaustive search over a huge combination set. Moreover, when the EM wave experiences spherical propagation, then the channel is reconstructed by both the distance and the angle of the incident signal. Based on this channel feature, [95] introduced an effective distance between a user and the extra large-aperture array and then proposed a low-complexity user scheduling scheme that simply compares the effective distances of different users, making the scheduling problem simple and easy.

B. Distributed Processing and Computation

Assigning all the processing and computation tasks to a single CPU is not a reasonable choice in the extra large-aperture array system. An alternative is to partition the entire array into multiple subarrays and distribute the tasks to the subarrays [6], [34], [37], [38], [39], [41], [42], [43], [96]. This is a logical concept of subarray, different from the physical subarray above. A logical subarray may have a fully digital physical architecture, but it has its own local processing unit (LPU) as shown in Fig. 11 (b) and (c). Some processing and computation tasks of an individual logical subarray, such as channel estimation and antenna selection, can be handled by its own LPU. When LPUs exist, they can be arranged in two logical architectures.

1) Single Layer With LPUs:

This logical architecture is illustrated in Fig. 11 (b) and solely composed of LPUs. That is to say, all the processing and computation tasks are distributed and performed at the LPUs, without a centralized control over the LPUs. Since no CPU exists, this architecture can be easily scaled up.

Notably, some tasks are local tasks and can be handled by a single LPU. A typical example of a local task is channel estimation. The channel across the entire array can be uniformly partitioned into $B$ subchannels, i.e., ${\textbf {h}} = [{\textbf {h}}_{1}^{T},\ldots, {\textbf {h}}_{B}^{T}]^{T}$ , where ${\textbf {h}}_{b}\in \mathbb {C}^{ ({N}/{B})\times 1}$ is the subchannel on subarray $b$ . The estimation of ${\textbf {h}}_{b}$ can be independently performed by LPU $b$ based on the pilots received by subarray $b$ , without the cooperation with other LPUs [39]. When linear channel estimation is performed, the complexity of estimating ${\textbf {h}}_{b}$ is $\mathcal {O}({N}/{B})$ , significantly lower than $\mathcal {O}(N)$ of estimating ${\textbf {h}}$ . Denote the estimation result as $\hat {\textbf {h}}_{b}$ . The final channel estimation result across the entire array can be obtained by simply stacking the subchannel estimates together, that is, $\hat {\textbf {h}} = [\hat {\textbf {h}}_{1}^{T},\ldots, \hat {\textbf {h}}_{B}^{T}]^{T}$ .
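The local nature of this task can be illustrated with a small sketch. It assumes a single user sending one known unit-modulus pilot symbol and a least-squares estimator at each LPU; the pilot, the noise level, and the sizes are hypothetical and only serve to show that stacking the $B$ local estimates reproduces the estimate a single CPU would compute over the whole array.

```python
import numpy as np

rng = np.random.default_rng(2)
N, B = 512, 8                      # antennas and logical subarrays (illustrative values)
pilot = np.exp(1j * np.pi / 4)     # known unit-modulus pilot symbol (assumption)

# Assumed channel and noise; y is the pilot observation across the entire array.
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
noise = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = h * pilot + noise

# Each LPU b only sees its own N/B samples y_b and estimates h_b independently.
h_hat_blocks = [y_b * np.conj(pilot) / abs(pilot) ** 2 for y_b in np.split(y, B)]

# Stacking the local results gives the estimate over the entire array.
h_hat = np.concatenate(h_hat_blocks)

# Reference: the same least-squares estimator run centrally on all N samples.
h_hat_central = y * np.conj(pilot) / abs(pilot) ** 2
print(np.mean(np.abs(h_hat - h) ** 2))
```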

Most of the tasks are global tasks that require the cooperation among LPUs. A typical example is signal detection. We write the uplink signal model in a time-division duplexing system as follows: $\begin{equation*} {\textbf {y}} = \sum _{k=1}^{K} {\textbf {h}}_{k} s_{k} + {\textbf {n}}\tag{95}\end{equation*}$ where $s_{k}$ is the transmit signal from user $k$. The task of signal detection is to estimate ${\textbf {s}}=[s_{1},\ldots, s_{K}]^{T}$ from ${\textbf {y}}\in \mathbb {C}^{N\times 1}$, which is the signal received by the entire array. Denote ${\textbf {y}} = [{\textbf {y}}_{1}^{T},\ldots, {\textbf {y}}_{B}^{T}]^{T}$, where ${\textbf {y}}_{b}\in \mathbb {C}^{({N}/{B})\times 1}$ is the signal received by subarray $b$. If LPU $b$ independently performs signal detection based on ${\textbf {y}}_{b}$, then there is a high probability that different LPUs provide different estimates of ${\textbf {s}}$. This is because the channel vector ${\textbf {h}}_{b}$ and the random noise ${\textbf {n}}_{b}$ vary across different $b$, especially in multipath propagation scenarios and when spatial nonstationarity exists. Considering that only one final detection result is required, while the CPU that can make the final decision is absent, a serial detection method was proposed in [41]. VMP is normally combined with belief propagation for multiuser data detection. The output of LPU $b< B$ is the soft information of ${\textbf {s}}$ and serves as an input of LPU $b+1$. The outputs of LPU $B$ are the estimates of ${\textbf {s}}$ and serve as the final detection result. The serial cooperation among the LPUs brings the benefit of easy scalability, but still suffers from high processing latency. Moreover, the working procedure among the LPUs is fixed and cannot be flexibly adjusted according to practical channel conditions.

2) Double Layers With CPU and LPUs:

A more reasonable and widely studied logical architecture is the double-layer architecture with LPUs in the lower layer and CPU in the upper layer as shown in Fig. 11 (c). When spatial nonstationarity holds, different users have different VRs w.r.t. the array. If subarray $b$ is not in the VR of user $k$ , then the CPU can inform LPU $b$ to deactivate the processing and computation related to user $k$ . Therefore, a more efficient transceiver design can be deployed at the CPU, thereby enabling complexity reduction.

In this architecture, each LPU is connected with the CPU. Having completed the distributed processing and calculation, each LPU feeds its local result back to the CPU. Then, the CPU integrates the local results from all the LPUs and obtains the final global result by means of hard decision or data fusion [6]. At the receiver, [37] decentralizes the RK-ZF algorithm and applies it in multiuser signal detection in extra large-scale MIMO systems. LPU $b$ calculates its local linear combiner matrix ${\textbf {V}}_{b}\in \mathbb {C}^{K\times ({N}/{B})}$ , applies it on the received signal on subarray $b$ , and computes the estimate of $\bf s$ at LPU $b$ as follows: $\begin{equation*} \hat {\textbf {s}}_{b} = {\textbf {V}}_{b}{\textbf {y}}_{b}.\tag{96}\end{equation*}$ If the VR of user $k$ does not cover subarray $b$ , then the entries in the $k$ th row of ${\textbf {V}}_{b}$ are zero. Thereafter, $\hat {\textbf {s}}_{b}$ is sent to the CPU. Having received all the estimates from $B$ LPUs, the CPU integrates $\hat {\textbf {s}}_{1},\ldots, \hat {\textbf {s}}_{B}$ and makes the final decision through data fusion. Similarly, [42] and [47] applied VMP for multiuser signal detection, and LPU $b$ outputs the symbol probability $q_{b}(\bf s)$ instead of the estimate $\hat {\textbf {s}}_{b}$ . The estimates of multiuser signals are only obtained at the CPU.
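The two-layer flow around (96) can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm of [37]: each LPU uses a plain local ZF combiner (a pseudo-inverse restricted to the users visible to its subarray) instead of RK-ZF, the VRs are drawn at random, and the CPU fuses the local estimates by averaging over the subarrays that see each user. All sizes and the fusion rule are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, B, K = 256, 4, 6     # antennas, subarrays, users (illustrative values)
M = N // B              # antennas per subarray

# vis[b, k] is True when subarray b lies in the VR of user k (random VRs, assumption).
vis = rng.random((B, K)) < 0.6
vis[rng.integers(B, size=K), np.arange(K)] = True   # every user is seen by at least one subarray

# Spatially nonstationary channel: entries outside a user's VR are zero.
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
H = H * np.repeat(vis, M, axis=0)

s = (rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)) / np.sqrt(2)  # QPSK symbols
noise = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = H @ s + noise

s_sum = np.zeros(K, dtype=complex)
for b in range(B):
    rows = slice(b * M, (b + 1) * M)
    V_b = np.zeros((K, M), dtype=complex)   # rows of users outside the VR stay zero
    if vis[b].any():
        V_b[vis[b]] = np.linalg.pinv(H[rows][:, vis[b]])   # local ZF on visible users only
    s_sum += V_b @ y[rows]                  # local estimate (96), accumulated at the CPU

# Assumed fusion rule at the CPU: average over the subarrays that see each user.
s_hat = s_sum / vis.sum(axis=0)
print(np.max(np.abs(s_hat - s)))
```

Because each LPU inverts a Gram matrix no larger than the number of users in its VR, the local work shrinks as the VRs become smaller, which is the source of the complexity reduction described above.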

The concept of LPUs of subarrays can be extended to LPUs of users. In [38], transmit antenna selection and user mapping were studied. Considering that different users have unequal VRs, parallel user mapping convolutional neural networks (CNNs) were proposed to learn the selected antennas for each user independently. The $k$ th CNN outputs $N_{{\mathrm {max}}}$ antennas for user $k$ . The CPU further makes antenna selection from the $N_{{\mathrm {max}}}$ antennas for user $k$ by jointly considering the sum-rate of all $K$ users. In the above studies [6], [37], [38], [42], [47], the LPUs work independently in parallel, and information exchange between one LPU and the CPU occurs only once. Therefore, the working procedure has relatively low latency.

Some recent works proposed the information exchange among LPUs or iterations between CPU and LPUs to gradually improve the performance. Information exchange between two distinct LPUs can be achieved with the assistance of CPU, or, a direct connection can be further established between the two LPUs. At the receiver, the LPUs in [34] performed ZF-based signal detection on a per user basis, while the detection results of a certain user were shared by the LPUs for the detection of signal from the next user. This serial interference cancelation method was also applied in [47], where VMP is employed in each LPU. Notably, given the VR of each user, the operation order of LPUs as well as the detection order of user signals can be initially determined by CPU [34], which further improves the detection performance.

Apart from ZF and VMP, expectation propagation (EP) is another effective algorithm that has been utilized at the receiver for multiuser signal detection in extra large-aperture array systems [97], [98]. EP is a centralized processing strategy that has excellent performance and moderate complexity. In this context, [97] initially implemented EP in a decentralized manner and made efforts to reduce the computational complexity and the amount of information exchange, while [98] further refined the decentralized EP by approximating the matrix inversion at the CPU, whose complexity is $\mathcal {O}(K^{3})$, with a polynomial expansion. Given that EP is an iterative algorithm, the decentralized EP method also requires information exchange among the CPU and the LPUs.

In [96], antenna selection and resource allocation were considered at the downlink transmitter. Even though in this work the LPUs operate in parallel, back-and-forth information exchange between the CPU and the LPUs occurs since a genetic algorithm was adopted. Successive operation of LPUs and iterative optimization between two layers inevitably increase the latency.

Multilayer processing can be further applied in extra large-aperture RIS-assisted mobile communication systems [43]. The RIS can be uniformly partitioned into $B$ logical subarrays, corresponding to the lowest processing layer. In the design of the RIS reflection codebook, a reduced dimensional local subcodebook can be first designed for each subarray. Then, the subcodebooks in the second lowest layer are obtained from the ones in the lowest layer. Through this sequential design, the full-dimensional codebook can be finally derived in the highest layer. The multilayer processing reduces not only the complexity, but also the huge training overhead caused by the extra large-aperture RIS.

SECTION VI.

Low-Overhead Communication and Sensing

In this section, we focus on low-overhead design in extra large-aperture array systems. Training is an effective and reliable approach to acquire CSI. With the increase of user equipments and the diversification of device types that are connected to the extra large-aperture array system, the number of pilots required will grow prohibitively high if independent training is performed across them. Furthermore, for an extra large-aperture array with massive active antennas but fewer RF chains, estimation of the huge dimensional channel on each antenna inevitably involves a beam sweeping or antenna switching process, which will be time consuming if the number of RF chains is much smaller than the number of active antennas.

Fortunately, the directionality and sparsity of propagation channels create room for overhead reduction, which will be explained in detail in the following part of this section. Furthermore, the extra large-aperture array has an extremely high spatial resolution, and the high-dimensional channel contains the environment information, such as knowledge about the user location and surrounding obstacles. Therefore, sensing can be achieved together with communication during the training process [99]. In this section, we study the low-overhead communication and sensing paradigm.

A. Directionality and Channel Sparsity

In a traditional multiantenna system, the serving area of a BS is large, and users are in the far-field region of the array. The plane wave channel model (60) is then applied, and the plane wave is expressed by its AoA/AoD as shown in (61). Due to the high spatial resolution of the large-aperture array, and the much smaller number of propagation paths than the number of antennas, the channel shows significant sparsity and directionality in the angular domain. In an extra large-aperture array system, there is a high probability that the distance between a user and the BS is smaller than the Rayleigh distance. Under these conditions, the spherical wave channel model (54) should be introduced, and the spherical wave is expressed by the position of the source (28). Moreover, the VR kicks in when blockage exists, which means that the effective array size is reduced. Then, whether the channel sparsity and directionality hold becomes a question.

Assume the BS is equipped with an extra large-aperture uniform linear array (ULA) with $N$ elements lying on the $x$-axis, where $N$ is even for the simplification of analysis. Considering that the horizontal ULA has flexible control over only the $xz$ plane, we describe the positions through $(x,z)$ coordinates. The center of the ULA is at the origin of the coordinate system, and the position of antenna $n$ is $(-([{2n+1}]/{2})d,0)$, where $n=-({N}/{2}),\ldots,({N}/{2})-1$. User $k$ is located at ${\textbf {s}}_{k}=(x_{k},z_{k})$. By applying the limited dimensional channel model (78) and assuming that only the LoS path exists, the channel between the BS and user $k$ can be simplified as follows: $\begin{equation*} {\textbf {h}}_{k} = \beta _{k} {\textbf {a}}\left ({{\textbf {s}}_{k}}\right) \odot {\textbf {p}}\left ({\Phi _{k}}\right)\tag{97}\end{equation*}$ where ${\textbf {a}}({\textbf {s}})\in \mathbb {C}^{N\times 1}$ is the steering vector, satisfying $\begin{equation*} \left [{{\textbf {a}}\left ({{\textbf {s}}}\right)}\right]_{n} = \frac {\lambda }{4\pi d_{k,n} } e^{-j \frac {2\pi }{\lambda } d_{k,n} }.\tag{98}\end{equation*}$ Applying (12), $\begin{equation*} d_{k,n}=\sqrt {\left ({x_{k}+\frac {2n+1}{2}d}\right)^{2}+z_{k}^{2}}\tag{99}\end{equation*}$ is the distance between the source and antenna $n$, $\Phi _{k}$ is the VR of the user w.r.t. the array, and ${\textbf {p}}(\Phi)$ follows the structure in (79).

1) Angular Domain:

We start by investigating whether the directionality and sparsity hold for ${\textbf {h}}_{k}$ in the angular domain when the VR covers the entire array. The angular domain transformation is derived from the plane wave model where equal phase deviation is experienced by each pair of adjacent antennas as shown in (61). Therefore, the discrete Fourier transformation (DFT) matrix is usually adopted as the angular domain transformation matrix. Denote the $N$-dimensional DFT matrix as ${\textbf {U}}_{A}\in \mathbb {C}^{N\times N}$, where $[{\textbf {U}}_{A}]_{n_{1},n_{2}}=e^{-j2\pi ({n_{1}}/{N})n_{2}}$, $n_{1}, n_{2} = 0,\ldots,N-1$. The $n$th row corresponds to the direction with an angle of $\theta ={{\mathrm {arccos}}}({n}/{N})$. The rows of ${\textbf {U}}_{A}$ are orthogonal to each other. Then, the angular domain channel of user $k$ is written as $\tilde {\textbf {h}}_{A,k} = {\textbf {U}}_{A}{\textbf {h}}_{k}$, where $\tilde {\textbf {h}}_{A,k}\in \mathbb {C}^{N\times 1}$ has the same dimension as ${\textbf {h}}_{k}$. Under the plane wave model, the amplitude of $[\tilde {\textbf {h}}_{A,k}]_{n}$ will be large if the angle of the LoS path is close to ${{\mathrm {arccos}}}({n}/{N})$, and, thus, $\tilde {\textbf {h}}_{A,k}$ would have a sparse pattern. However, under the spherical wave model, the entire array does not experience a common angle, and a significant angular spread appears. As shown in Fig. 12 (a), $\tilde {\textbf {h}}_{A,k}$ shows directionality around $\cos \theta =0$ when $z_{k}=5000\lambda$. As $z_{k}$ decreases, i.e., as user $k$ moves toward the array, the angular spread increases, and $\tilde {\textbf {h}}_{A,1}$ has more continuous nonzero entries than $\tilde {\textbf {h}}_{A,2}$ and $\tilde {\textbf {h}}_{A,3}$. In the extreme but impractical case of $z_{k}=0$, the angular spread will cover the entire angle value region, and then directionality and sparsity no longer exist in $\tilde {\textbf {h}}_{A,k}$.
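The growth of the angular spread can be reproduced with a short numerical sketch. It builds the LoS steering vector from (98) and (99) for the three user distances used in Fig. 12, applies the DFT, and counts the angular bins whose amplitude exceeds 10% of the peak. The antenna spacing $d=\lambda /2$, the omission of $\beta _{k}$, the 10% threshold, and the function names are assumptions made only for this illustration.

```python
import numpy as np

lam = 0.01    # wavelength in meters, as in Fig. 12
N = 1024      # number of antennas, as in Fig. 12
d = lam / 2   # antenna spacing (assumption, not stated in this section)

def near_field_steering(x_k, z_k):
    """Steering vector of (98) with the distances of (99); VR covers the entire array."""
    n = np.arange(-N // 2, N // 2)
    dist = np.sqrt((x_k + (2 * n + 1) / 2 * d) ** 2 + z_k ** 2)
    return lam / (4 * np.pi * dist) * np.exp(-1j * 2 * np.pi / lam * dist)

def angular_support(z_k, rel_threshold=0.1):
    """Number of DFT bins whose amplitude is at least rel_threshold times the peak."""
    h_ang = np.fft.fft(near_field_steering(0.0, z_k))   # same kernel as U_A
    mag = np.abs(h_ang)
    return int(np.sum(mag >= rel_threshold * mag.max()))

# Users 1, 2, and 3 of Fig. 12 at (0, 50*lam), (0, 500*lam), and (0, 5000*lam).
support = {z: angular_support(z * lam) for z in (50, 500, 5000)}
print(support)
```

A larger count means that the energy leaks over more angular bins, so the closest user is the least sparse in the angular domain.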


Fig. 12.

Directionality and sparsity of channels in (a) angular, (b) Cartesian, and (c) polar domains, respectively, when $N=1024$ and $\lambda =0.01$ m. Users 1, 2, and 3 are located at $(0,50\lambda)$, $(0,500\lambda)$, and $(0,5000\lambda)$, respectively, and their VRs cover the entire array. (a) Illustrates the normalized amplitudes of vectors $\tilde {\textbf {h}}_{A,k}, k=1,2,3$. (b) and (c) Show the normalized amplitudes of matrices $\tilde {\textbf {H}}_{C,k}$ and $\tilde {\textbf {H}}_{P,k}, k=1,2,3$.


2) Cartesian Domain:

From (98), we see that $[{\textbf {a}}({\textbf {s}})]_{n}$ is determined by the 2-D Cartesian coordinate $(x_{k},z_{k})$ , instead of a 1-D angle $\theta$ . Therefore, under the spherical wave model, it is more reasonable to transform ${\textbf {h}}_{k}$ to a 2-D domain than to a 1-D domain. Paper [39] proposed to transform the radio channel to the Cartesian domain. The transformation matrix ${\textbf {U}}_{c}\in \mathbb {C}^{N_{c}\times N}$ is composed of $N_{c}$ row vectors of $([{{\textbf {a}}^{H}(\bar {x}, \bar {z})}]/{\|{\textbf {a}}(\bar {x}, \bar {z})\|})$ , where $\bar {x}$ and $\bar {z}$ are the samples of $x$ and $z$ , respectively, and $N_{c}$ is the number of sample pairs $(\bar {x},\bar {z})$ .

Let $N_{c}=N_{x}N_{z}$ , where $N_{x}$ and $N_{z}$ are the numbers of $x$ and $z$ samples, respectively, by uniformly and separately sampling $x$ and $z$ as follows: $\begin{align*}&\left \{{ \bar {x} = x_{{\mathrm {min}}}, x_{{\mathrm {min}}}+\Delta x,\ldots,x_{{\mathrm {max}}} }\right. \\&\quad \left.{ \bar {z} = z_{{\mathrm {min}}}, z_{{\mathrm {min}}}+\Delta z,\ldots,z_{{\mathrm {max}}} }\right \}\tag{100}\end{align*}$ where $x_{{\mathrm {min}}}, x_{{\mathrm {max}}}, z_{{\mathrm {min}}}$ , and $z_{{\mathrm {max}}}$ jointly define the rectangular region that users may appear in, while $\Delta x$ and $\Delta z$ are the sampling steps in the $x$ and $z$ axis, respectively. Different from ${\textbf {U}}_{A}$ , the orthogonality among the rows of ${\textbf {U}}_{c}$ cannot be guaranteed. The channel in the Cartesian domain is obtained by $\tilde {\textbf {h}}_{C,k} = {\textbf {U}}_{c}{\textbf {h}}_{k}$ , whose dimension is $N_{x}N_{z}$ , i.e., not equal to that of ${\textbf {h}}_{k}$ . The $N_{x}N_{z}$ -dimensional vector $\tilde {\textbf {h}}_{C,k}$ can be rearranged to an $N_{x} \times N_{z}$ -dimensional matrix $\tilde {\textbf {H}}_{C,k}$ . Fig. 12 (b) illustrates the normalized amplitudes of the 2-D matrices $\tilde {\textbf {H}}_{C,k}, k = 1,2,3$ . For the sample pair which satisfies $(\bar {x},\bar {z}) = (x_{k},z_{k})$ , the corresponding entry of $\tilde {\textbf {H}}_{C,k}$ has the largest amplitude as expected, demonstrating the directionality in the Cartesian domain. When $z_{k}=50\lambda$ , even though most entries of $\tilde {\textbf {H}}_{C,k}$ are nonzero, their amplitudes are still obviously lower than the maximal one. With the increase of $z_{k}$ , the number of nonzero entries decreases. The sparsity of $\tilde {\textbf {H}}_{C,k}$ gradually becomes significant and can be found solely in the $x$ -domain.
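The directionality in the Cartesian domain can be illustrated with a minimal sketch that builds ${\textbf {U}}_{c}$ on the grid of (100) and locates the largest entry of $\tilde {\textbf {H}}_{C,k}$. The array size, the spacing $d=\lambda /2$, the grid limits, and the user position (placed exactly on a grid point, with $\beta _{k}=1$ and a VR covering the whole array) are assumptions chosen to keep the example small.

```python
import numpy as np

lam = 0.01    # wavelength in meters
N = 256       # smaller array than in Fig. 12 to keep the sketch fast (assumption)
d = lam / 2   # antenna spacing (assumption)

def steering(x, z):
    """Steering vector a(x, z) following (98) and (99)."""
    n = np.arange(-N // 2, N // 2)
    dist = np.sqrt((x + (2 * n + 1) / 2 * d) ** 2 + z ** 2)
    return lam / (4 * np.pi * dist) * np.exp(-1j * 2 * np.pi / lam * dist)

# Uniform, separate sampling of x and z as in (100); limits are hypothetical.
xs = np.linspace(-1.0, 1.0, 21)   # N_x samples of x in meters
zs = np.linspace(0.5, 5.0, 19)    # N_z samples of z in meters

# Rows of U_c are a^H(x_bar, z_bar) / ||a(x_bar, z_bar)||, ordered x-major.
U_c = np.array([steering(x, z).conj() / np.linalg.norm(steering(x, z))
                for x in xs for z in zs])

# User placed on a grid point so that the exact match exists in the codebook.
ix_true, iz_true = 13, 6
h_k = steering(xs[ix_true], zs[iz_true])

# Cartesian-domain channel, rearranged into an N_x x N_z amplitude map.
H_C = np.abs(U_c @ h_k).reshape(len(xs), len(zs))
ix_hat, iz_hat = np.unravel_index(np.argmax(H_C), H_C.shape)
print((xs[ix_hat], zs[iz_hat]))
```

By the Cauchy-Schwarz inequality, the normalized correlation is maximized only by a row proportional to ${\textbf {a}}({\textbf {s}}_{k})$, which is why the peak falls on the true sample pair.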

3) Polar Domain:

The spherical wave channel is more frequently expressed by the polar coordinates $(D_{k},\theta _{k})$ , where $D_{k}$ and $\theta _{k}$ represent the distance and angle between the ULA’s center and user $k$ , respectively, satisfying $\begin{align*} D_{k}=&\sqrt {x_{k}^{2}+z_{k}^{2}},\quad \theta _{k} = \arcsin {\frac {x_{k}}{D_{k}}} \\ x_{k}=&D_{k}\sin \theta _{k},\quad z_{k} = D_{k}\cos \theta _{k}.\tag{101}\end{align*}$ Then, $d_{k,n}$ in (99) is calculated by $\begin{equation*} d_{k,n}=\sqrt {D_{k}^{2}+\frac {(2n+1)^{2}}{4}d^{2}+(2n+1)dD_{k}\sin \theta _{k}}.\tag{102}\end{equation*}$ The polar transformation matrix can be defined as ${\textbf {U}}_{P}\in \mathbb {C}^{N_{P} \times N}$ with row vectors of $([{{\textbf {a}}^{H}(\bar {D}\sin \bar {\theta }, \bar {D}\cos \bar {\theta })}]/ {\|{\textbf {a}}(\bar {D}\sin \bar {\theta }, \bar {D}\cos \bar {\theta })\|})$ , where $\bar {D}$ and $\bar {\theta }$ are samples of $D$ and $\theta$ , respectively, and $N_{P}$ is the number of sample pairs $(\bar {D},\bar {\theta })$ .

Similar to the Cartesian domain samples, we can let $N_{P}=N_{D} N_{\theta }$ by taking $N_{D}$ samples of $D$ and $N_{\theta }$ samples of $\theta$ independently, but choose $\begin{align*}&\left \{{ \lg \bar {D} = \lg D_{{\mathrm {min}}}, \lg D_{{\mathrm {min}}}+\Delta D,\ldots,\lg D_{{\mathrm {max}}} }\right. \\&\quad \left.{ \bar {\theta } = \theta _{{\mathrm {min}}}, \theta _{{\mathrm {min}}}+\Delta \theta,\ldots,\theta _{{\mathrm {max}}} }\right \}\tag{103}\end{align*}$ where $D_{{\mathrm {min}}}, D_{{\mathrm {max}}}, \theta _{{\mathrm {min}}}$, and $\theta _{{\mathrm {max}}}$ define the fan-shaped region that users may appear in, and $\Delta D$ and $\Delta \theta$ are the sampling steps of $\lg D$ and $\theta$, respectively. Here, $\lg D$ instead of $D$ is uniformly sampled. This is because with the increase of $D$, the spherical wave channel becomes less sensitive to $D$, and, thus, the sampling interval of $D$ can grow with $D$. Similar to ${\textbf {U}}_{c}$, the orthogonality among different rows of ${\textbf {U}}_{P}$ cannot be guaranteed either. Thereafter, we obtain the polar domain channel as $\tilde {\textbf {h}}_{P,k} = {\textbf {U}}_{P}{\textbf {h}}_{k}$, whose dimension is $N_{D} N_{\theta }$. Similarly, the $N_{D} N_{\theta }$-dimensional vector $\tilde {\textbf {h}}_{P,k}$ can be rearranged to an $N_{D} \times N_{\theta }$-dimensional matrix $\tilde {\textbf {H}}_{P,k}$. As shown in Fig. 12 (c), the entry corresponding to $(\bar {D},\bar {\theta }) = (D_{k},\theta _{k})$ has the maximal amplitude, verifying the directionality in the polar domain. Moreover, even though the sparsity of the channel in the polar domain is not obvious when $D$ is small, the amplitudes of the nonzero entries are much lower than the maximum one. The sparsity becomes apparent with the increase of $D$, and is shown only in the angular domain when $D=5000\lambda$.
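A small sketch of the sampling rule in (103) is given below, together with a numerical check that the polar expression (102) for $d_{k,n}$ agrees with the Cartesian expression (99) under the mapping (101). The region limits, the numbers of samples, and the spacing $d=\lambda /2$ are hypothetical values chosen for illustration.

```python
import numpy as np

lam = 0.01    # wavelength in meters
N = 1024      # number of antennas
d = lam / 2   # antenna spacing (assumption)

# Sampling (103): lg(D) is sampled uniformly, so the step in D grows with D.
D_min, D_max, N_D = 50 * lam, 5000 * lam, 9           # hypothetical region and grid size
D_bar = np.logspace(np.log10(D_min), np.log10(D_max), N_D)
theta_bar = np.linspace(-np.pi / 3, np.pi / 3, 13)    # uniform angle samples (hypothetical limits)

steps = np.diff(D_bar)   # distance between consecutive D samples
print(steps)

# Consistency of (99) and (102) for one sample pair, using the mapping (101).
n = np.arange(-N // 2, N // 2)
D_k, theta_k = D_bar[4], theta_bar[9]
x_k, z_k = D_k * np.sin(theta_k), D_k * np.cos(theta_k)
d_cart = np.sqrt((x_k + (2 * n + 1) / 2 * d) ** 2 + z_k ** 2)              # (99)
d_polar = np.sqrt(D_k ** 2 + (2 * n + 1) ** 2 / 4 * d ** 2
                  + (2 * n + 1) * d * D_k * np.sin(theta_k))               # (102)
print(np.max(np.abs(d_cart - d_polar)))
```

The printed steps increase monotonically, reflecting that fewer distance samples are spent where the channel is less sensitive to $D$.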

To decrease the correlation among rows of ${\textbf {U}}_{P}$ , a joint angle and distance sampling grid was proposed in [29], where $\theta$ is uniformly sampled with $N_{\theta }=N$ and $\Delta \theta =({\pi }/{N})$ . Specifically, the sampling of the distance depends on that of the angle. For a particular sample of angle $\bar {\theta }$ , we acquire a unique sample set of $\bar {D}$ , where the obtained vectors of $([{{\textbf {a}}^{H}(\bar {D}\sin \bar {\theta }, \bar {D}\cos \bar {\theta })}]/{\|{\textbf {a}}(\bar {D}\sin \bar {\theta }, \bar {D}\cos \bar {\theta })\|})$ are nearly orthogonal to each other. To achieve this near orthogonality, the size of the distance sample set varies with the value of $\bar {\theta }$ . When $\cos \bar {\theta }$ approaches 0, the sample set of the distance is expanded. Otherwise, the size of the sample set of distance decreases, resulting in an insufficient sampling grid of the entire space. Despite this drawback, the row vectors of ${\textbf {U}}_{P}$ are approximately orthogonal to each other under this setting, and the polar domain channel $\tilde {\textbf {h}}_{P,k}$ shows sparsity.

4) Antenna Domain:

When the user is very close to the array as the example in Fig. 5, or severe blockage happens as illustrated in Fig. 6, the VR of the user w.r.t. the array is a small-scale subset of antennas in the array. Then, the channel shows sparsity in the antenna domain. In the simplest case that the VR of user $k$ is a continuous subarray, the channel can be approximated as follows: $\begin{align*} {\textbf {h}}_{k} \approx \left [{ \begin{matrix} {\textbf {0}}\\ {\textbf {h}}_{k,VR}\\ {\textbf {0}} \end{matrix} }\right]\tag{104}\end{align*}$ where ${\textbf {h}}_{k,VR}$ is the subvector of ${\textbf {h}}_{k}$ corresponding to the entries within the VR.

B. Low-Overhead Design

Channel directionality and sparsity in the transformation domains provide room for overhead reduction. More particularly, channel directionality guarantees the accuracy of user localization, which further supports channel reconstruction and sensing. Channel sparsity enables the application of compressed sensing techniques in the estimation of channels and the orthogonal transceiver design among multiple users. Details are given as follows.

Consider an extra large-aperture array system with less RF chains than active antennas at the BS. The spatially nonstationary channel ${\textbf {h}}_{k}$ follows the limited dimensional model in (77) and can be rewritten as follows: $\begin{equation*} {\textbf {h}}_{k} = \sum _{l=1}^{L_{k}} \beta _{k,l} {\textbf {a}}\left ({{\textbf {s}}_{k,l}}\right) \odot {\textbf {p}}\left ({\Phi _{k,l}}\right)\tag{105}\end{equation*}$ View Source where $L_{k}$ is the number of paths in the channel of user $k$ , while ${\textbf {s}}_{k,l}$ and $\Phi _{k,l}$ are determined by the scatterers, reflectors, and obstacles in the environment. In the uplink training phase, user $k$ transmits a pilot sequence to the BS for channel estimation and sensing. The received pilot sequence at the BS at time instance $t$ is expressed as follows: $\begin{equation*} {\textbf {Y}}_{t} = \sqrt {P} {\textbf {F}}_{{\textrm {RF}},t}\sum _{k=1}^{K} {\textbf {h}}_{k} {\textbf {x}}_{k}^{H} + {\textbf {F}}_{{\textrm {RF}},t}{\textbf {N}}_{t}\tag{106}\end{equation*}$ View Source where ${\textbf {Y}}_{t}\in \mathbb {C}^{N_{\textrm {RF}}\times Q}$ is the received pilot sequence with length $Q$ on $N_{\textrm {RF}}$ RF chains at time instance $t$ , $P$ is the transmit power of each user, ${\textbf {F}}_{{\textrm {RF}},t}\in \mathbb {C}^{N_{\textrm {RF}}\times N}$ is the RF matrix at time instance $t$ , ${\textbf {x}}_{k}\in \mathbb {C}^{Q\times 1}$ is the pilot sequence of user $k$ satisfying ${\textbf {x}}_{k}^{H}{\textbf {x}}_{k}=1$ and ${\textbf {x}}_{k}^{H}{\textbf {x}}_{j}=0, j\ne k$ , ${\textbf {N}}_{t}\in \mathbb {C}^{N\times Q}$ is the noise matrix with i.i.d. entries, with entry following a complex Gaussian distribution with zero mean and unit variance. A total of $T$ time instances are used for uplink pilot transmission. 
By stacking ${\textbf {Y}}_{t}, t= 1,\ldots, T$ together and multiplying them with ${\textbf {x}}_{k}$ , we have $\begin{equation*} {\textbf {y}}_{k} = \sqrt {P} {\textbf {F}} {\textbf {h}}_{k} + {\textbf {n}}_{k}\tag{107}\end{equation*}$ where ${\textbf {y}}_{k}\in \mathbb {C}^{N_{\textrm {RF}}T\times 1}$ , ${\textbf {F}}=[{\textbf {F}}_{{\textrm {RF}},1}^{H},\ldots,{\textbf {F}}_{{\textrm {RF}},T}^{H}]^{H}$ , and ${\textbf {n}}_{k}=[({\textbf {F}}_{{\textrm {RF}},1}{\textbf {N}}_{1}{\textbf {x}}_{k})^{H},\ldots,({\textbf {F}}_{{\textrm {RF}},T}{\textbf {N}}_{T}{\textbf {x}}_{k})^{H}]^{H}$ . The independent linear estimation of the channel on each antenna requires $T=({N}/{N_{\textrm {RF}}})$ . Then, the value of $T$ will be large if $N\gg N_{\textrm {RF}}$ , resulting in a huge amount of training overhead.
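
The step from (106) to (107) and the resulting overhead count can be checked numerically. The following is a minimal sketch in plain Python, not the authors' code: the toy sizes, the channel values, the RF matrix `F_t`, and the helper names `min_training_slots` and `matvec` are all assumptions made for illustration, and noise is omitted.

```python
import math

def min_training_slots(num_antennas, num_rf_chains):
    """Smallest T with N_RF * T >= N, so that F has at least N rows."""
    return math.ceil(num_antennas / num_rf_chains)

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Toy setup (assumed values): N = 4 antennas, N_RF = 2 RF chains, K = 2 users, Q = 2.
P = 4.0
h = [[1 + 1j, 2 + 0j, -1j, 0.5 + 0j],        # h_1
     [2j, -1 + 0j, 1 + 1j, 3 + 0j]]          # h_2
s = 1 / math.sqrt(2)
x = [[s, s], [s, -s]]                        # orthonormal pilots x_1, x_2
F_t = [[1, 1j, -1, -1j], [1, -1, 1, -1]]     # RF matrix at one time instance

# sum_k h_k x_k^H as an N x Q matrix (the pilots are real, so x^H is the transpose).
mix = [[sum(h_k[n] * x_k[q] for h_k, x_k in zip(h, x)) for q in range(2)]
       for n in range(4)]

# Noiseless version of (106): Y_t = sqrt(P) F_t sum_k h_k x_k^H.
Y_t = [[math.sqrt(P) * sum(F_t[r][n] * mix[n][q] for n in range(4))
        for q in range(2)] for r in range(2)]

# Multiplying by x_1 removes user 2 and leaves sqrt(P) F_t h_1, as in (107).
y_1 = matvec(Y_t, x[0])
expected = [math.sqrt(P) * v for v in matvec(F_t, h[0])]
```

For an assumed array with $N=1024$ and $N_{\textrm {RF}}=16$, `min_training_slots(1024, 16)` gives 64 time instances, which is the overhead that the designs below try to avoid.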

1) Localization Based on Directionality:

When an LoS path exists between user $k$ and the BS, it is usually set as $l=1$ in (105), and then ${\textbf {s}}_{k,1}$ is the position of user $k$ . The LoS path has stronger power than other NLoS components due to the smallest pathloss. Given the directionality of the near-field channel in Cartesian and polar domains, the matching method of [39] can be applied to find the position ${\textbf {s}}_{k,1}$ from ${\textbf {y}}_{k}$ . Applying (105) in (107), the received pilot can be rewritten as follows: $\begin{equation*} {\textbf {y}}_{k} = \sum _{l=1}^{L_{k}} \sqrt {P}\beta _{k,l} {\textbf {F}}\left ({{\textbf {a}}\left ({{\textbf {s}}_{k,l}}\right) \odot {\textbf {p}}\left ({\Phi _{k,l}}\right)}\right) + {\textbf {n}}_{k}.\tag{108}\end{equation*}$ We now assume that $\Phi _{k,1}$ has been successfully identified. Then, the codebook for matching can be defined as $\bar {\textbf {c}}(\bar {x},\bar {z})={\textbf {F}}({\textbf {a}}(\bar {x},\bar {z}) \odot {\textbf {p}}(\Phi _{k,1}))$ , where $(\bar {x},\bar {z})$ are in (100), or $\bar {\textbf {c}}(\bar {D},\bar {\theta })={\textbf {F}}({\textbf {a}}(\bar {D}\sin \bar {\theta }, \bar {D}\cos \bar {\theta }) \odot {\textbf {p}}(\Phi _{k,1}))$ , where $(\bar {D},\bar {\theta })$ are in (103). 
Utilizing the directionality, we obtain $\begin{equation*} \left ({\hat {x}_{k,1},\hat {z}_{k,1}}\right) = \arg \max _{\left ({\bar {x},\bar {z}}\right)\in (100) } \frac {\bar {\textbf {c}}\left ({\bar {x},\bar {z}}\right)^{H} {\textbf {y}}_{k}}{\|\bar {\textbf {c}}\left ({\bar {x},\bar {z}}\right)\|}\tag{109}\end{equation*}$ or $\begin{equation*} \left ({\hat {D}_{k,1},\hat {\theta }_{k,1}}\right) = \arg \max _{\left ({\bar {D},\bar {\theta }}\right)\in (103) } \frac {\bar {\textbf {c}}\left ({\bar {D},\bar {\theta }}\right)^{H} {\textbf {y}}_{k}}{\|\bar {\textbf {c}}\left ({\bar {D},\bar {\theta }}\right)\|}\tag{110}\end{equation*}$ and the localization result is $\hat {\textbf {s}}_{k,1}=(\hat {x}_{k,1},\hat {z}_{k,1})$ or $(\hat {D}_{k,1}\sin {\hat {\theta }_{k,1}},\hat {D}_{k,1}\cos {\hat {\theta }_{k,1}})$ . This localization method can work well when $T \ll ({N}/{N_{\textrm {RF}}})$ .
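
The grid search in (109) can be sketched as follows. This is an illustration under stated assumptions rather than the method of [39]: the steering vector `steering` (a unit-amplitude spherical-wave response of a half-wavelength uniform linear array on the x-axis), the binary VR mask, the random phase-shifter matrix `F`, the grid, and all numerical values are hypothetical, the magnitude of the correlation is used as the matching score, and noise and NLoS paths are omitted.

```python
import cmath
import math
import random

WAVELENGTH = 0.01          # assumed carrier wavelength in metres
N, N_RF, T = 64, 4, 4      # antennas, RF chains, time instances (T < N / N_RF = 16)

def steering(x, z):
    """Assumed spherical-wave response of a half-wavelength ULA on the x-axis."""
    k = 2 * math.pi / WAVELENGTH
    xs = [(n - (N - 1) / 2) * WAVELENGTH / 2 for n in range(N)]
    return [cmath.exp(-1j * k * math.hypot(x - xn, z)) for xn in xs]

def codeword(F, x, z, vr_mask):
    """c(x, z) = F (a(x, z) masked elementwise by the VR indicator p(Phi))."""
    masked = [a * p for a, p in zip(steering(x, z), vr_mask)]
    return [sum(f * m for f, m in zip(row, masked)) for row in F]

def locate(y, F, grid, vr_mask):
    """Grid search of (109), scoring candidates by |c^H y| / ||c||."""
    def score(pos):
        c = codeword(F, pos[0], pos[1], vr_mask)
        corr = sum(ci.conjugate() * yi for ci, yi in zip(c, y))
        return abs(corr) / math.sqrt(sum(abs(ci) ** 2 for ci in c))
    return max(grid, key=score)

random.seed(0)
# Stacked RF matrix F (N_RF * T rows) with random unit-modulus entries.
F = [[cmath.exp(2j * math.pi * random.random()) for _ in range(N)]
     for _ in range(N_RF * T)]
vr_mask = [1] * 48 + [0] * 16      # assumed VR: the first 48 antennas are visible
grid = [(0.1 * i, 0.5 + 0.25 * j) for i in range(-5, 6) for j in range(8)]

true_pos = grid[37]
beta = 0.8 * cmath.exp(0.3j)       # assumed LoS gain, with sqrt(P) absorbed
y = [beta * c for c in codeword(F, true_pos[0], true_pos[1], vr_mask)]
estimate = locate(y, F, grid, vr_mask)
```

In this noiseless, on-grid setting the score is maximized at the true position by the Cauchy-Schwarz inequality, so `estimate` coincides with `true_pos` even though only 16 of the 64 dimensions are observed. With noise, off-grid users, or additional paths, the estimate is only approximate.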

For sensing, given the estimates of position and VR, we can generally decide where the obstacle is. With more paths interacting with a common obstacle, the location, size, and even shape of the obstacle can be more accurately determined from the positions and VRs of these paths. Then, the environment can be identified.

2) Channel Estimation Based on Sparsity:

In practical environments, when the system works in higher frequency bands, the NLoS paths become fewer due to the severe pathloss and blockage. In an extra large-aperture array system, we usually have $L_{k}\ll N$ . Therefore, the large dimensional channel ${\textbf {h}}_{k}$ can be expressed by a limited number of paths. In the Cartesian or polar domain, most of the channel power is concentrated on nearly $L_{k}$ entries. Based on whether the orthogonality holds among the rows of the transformation matrix, there are two categories of low-cost channel estimation methods. One is channel reconstruction, and the other is compressed sensing. Channel reconstruction focuses on the estimation of the limited number of path parameters instead of the large-dimensional channel [39]. The parameters to be estimated include $\beta _{k,l}$ , ${\textbf {s}}_{k,l}$ , and $\Phi _{k,l}$ . When an LoS path exists, the user position can be obtained by the above matching method in (109) or (110). If the LoS component $\sqrt {P}\beta _{k,1} {\textbf {F}}({\textbf {a}}({\textbf {s}}_{k,1}) \odot {\textbf {p}}(\Phi _{k,1}))$ is extracted from ${\textbf {y}}_{k}$ in (108), then the second largest path component can be extracted from the residual of ${\textbf {y}}_{k}$ through the same matching method. The $L_{k}$ paths can be iteratively extracted from their mixture. Finally, the large-dimensional channel ${\textbf {h}}_{k}$ can be reconstructed by applying the estimates of $\beta _{k,l}$ , ${\textbf {s}}_{k,l}$ and $\Phi _{k,l}$ into (105). The training overhead of channel reconstruction is comparable to that of localization based on directionality.
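
The iterative extraction described above can be sketched as a greedy loop: match, fit the path gain by projection, subtract, and repeat on the residual. This is a simplified illustration and not the algorithm of [39]. It assumes a known VR mask shared by all paths, on-grid path positions, no noise, and the same hypothetical near-field steering vector and random RF matrix as before; `extract_paths` and `reconstruct` are illustrative names.

```python
import cmath
import math
import random

WAVELENGTH = 0.01               # assumed wavelength in metres
N, N_RF, T, P = 64, 4, 4, 1.0   # assumed antennas, RF chains, time instances, power

def steering(x, z):
    """Assumed spherical-wave response of a half-wavelength ULA on the x-axis."""
    k = 2 * math.pi / WAVELENGTH
    return [cmath.exp(-1j * k * math.hypot(x - (n - (N - 1) / 2) * WAVELENGTH / 2, z))
            for n in range(N)]

def codeword(F, pos, vr_mask):
    """F (a(s) masked elementwise by p(Phi)) for a candidate position s = (x, z)."""
    masked = [a * p for a, p in zip(steering(*pos), vr_mask)]
    return [sum(f * m for f, m in zip(row, masked)) for row in F]

def norm(v):
    return math.sqrt(sum(abs(vi) ** 2 for vi in v))

def extract_paths(y, F, grid, vr_mask, num_paths):
    """Greedy path extraction: match as in (109), fit the gain, subtract, repeat."""
    codebook = {pos: codeword(F, pos, vr_mask) for pos in grid}
    residual, paths, norms = list(y), [], [norm(y)]
    for _ in range(num_paths):
        def corr(pos):
            return sum(c.conjugate() * r for c, r in zip(codebook[pos], residual))
        best = max(grid, key=lambda pos: abs(corr(pos)) / norm(codebook[pos]))
        c = codebook[best]
        gain = corr(best) / norm(c) ** 2          # projection estimate of sqrt(P) * beta
        residual = [r - gain * ci for r, ci in zip(residual, c)]
        paths.append((best, gain / math.sqrt(P)))
        norms.append(norm(residual))
    return paths, norms

def reconstruct(paths, vr_mask):
    """Rebuild the N-dimensional channel from the path estimates, as in (105)."""
    h = [0j] * N
    for pos, beta in paths:
        h = [hi + beta * a * p for hi, a, p in zip(h, steering(*pos), vr_mask)]
    return h

random.seed(1)
F = [[cmath.exp(2j * math.pi * random.random()) for _ in range(N)]
     for _ in range(N_RF * T)]
vr_mask = [1] * N                                  # assumed: paths visible on the whole array
grid = [(0.1 * i, 0.5 + 0.25 * j) for i in range(-5, 6) for j in range(8)]
true_paths = [(grid[37], 1.0 + 0j), (grid[70], 0.3 * cmath.exp(1.1j))]

y = [0j] * (N_RF * T)
for pos, beta in true_paths:                       # noiseless version of (108)
    y = [yi + math.sqrt(P) * beta * ci for yi, ci in zip(y, codeword(F, pos, vr_mask))]

paths, norms = extract_paths(y, F, grid, vr_mask, num_paths=2)
h_hat = reconstruct(paths, vr_mask)
```

Each subtraction removes the projection of the residual onto the selected codeword, so the entries of `norms` never increase. Because the codewords of different positions are generally not orthogonal, the extracted positions and gains are not guaranteed to equal the true ones when several paths overlap.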

Compressed sensing aims to estimate the reduced dimensional sparse channel in a transformation domain. The precondition is that the row vectors of the transformation matrix maintain the orthogonality between them, which can be achieved by the polar domain transformation in [29]. For $\tilde {\textbf {h}}_{P,k}\in \mathbb {C}^{N_{P}\times 1}$ , we denote the indices of its nonzero entries as $\Upsilon _{k}=\{n_{k,1},\ldots,n_{k,\tilde {N}_{k}}\}, \tilde {N}_{k}\ll N_{P}$ . Then, the reduced dimension subchannel $[\tilde {\textbf {h}}_{P,k}]_{\Upsilon _{k}}$ contains almost all the information in ${\textbf {h}}_{k}$ . When ${\textbf {U}}_{P}$ and $[{\textbf {U}}_{P}]_{\Upsilon _{k},:}$ have full ranks, (107) can be further written as follows: $\begin{equation*} {\textbf {y}}_{k} = \sqrt {P} {\textbf {F}} {\textbf {U}}_{P}^{\dagger } \tilde {\textbf {h}}_{P,k} + {\textbf {n}}_{k}\approx \sqrt {P} {\textbf {F}} \left [{{\textbf {U}}_{P}}\right]_{\Upsilon _{k},:}^{\dagger } \left [{\tilde {\textbf {h}}_{P,k}}\right]_{\Upsilon _{k}} + {\textbf {n}}_{k}.\tag{111}\end{equation*}$ Then, the objective becomes to estimate the reduced dimensional channel $[\tilde {\textbf {h}}_{P,k}]_{\Upsilon _{k}}$ , which can be realized through compressed sensing. The key point lies in the identification of $\Upsilon _{k}$ from $\{1,\ldots,N_{P}\}$ . Following the compressed sensing-based channel estimation methods in millimeter wave hybrid beamforming systems, the estimates $\hat \Upsilon _{k}$ and $[\hat {\tilde {\textbf {h}}}_{P,k}]_{\hat \Upsilon _{k}}$ can be obtained through the orthogonal matching pursuit (OMP) algorithm, where the matching step is the same as (109). 
Then, the large-dimensional channel can be obtained by $\begin{equation*} \hat {\textbf {h}}_{k} = \left [{{\textbf {U}}_{P}}\right]_{\hat \Upsilon _{k},:}^{\dagger } \left [{\hat {\tilde {\textbf {h}}}_{P,k}}\right]_{\hat \Upsilon _{k}}.\tag{112}\end{equation*}$ Notably, since the sampling grid cannot cover the entire space, there is a high probability that the positions estimated by OMP are not the real positions, and a further refinement of the estimated positions toward the real positions is required [29] if localization needs to be achieved simultaneously.
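
A generic OMP loop for (111) and (112) can be sketched as below. It is a textbook-style illustration, not the polar-domain method of [29]: the dictionary columns are hypothetical near-field steering vectors on an assumed polar grid, standing in for the columns of $[{\textbf {U}}_{P}]^{\dagger }$, the RF matrix is random, the sparsity level is assumed known, and noise is omitted. The helpers `inner`, `solve`, and `omp` are illustrative names.

```python
import cmath
import math
import random

WAVELENGTH = 0.01            # assumed wavelength in metres
N, M, P = 64, 16, 1.0        # antennas, measurements (N_RF * T), pilot power (assumed)

def steering(dist, theta):
    """Assumed near-field ULA response at the polar point (D, theta)."""
    x, z = dist * math.sin(theta), dist * math.cos(theta)
    k = 2 * math.pi / WAVELENGTH
    return [cmath.exp(-1j * k * math.hypot(x - (n - (N - 1) / 2) * WAVELENGTH / 2, z))
            for n in range(N)]

def inner(u, v):
    """Return u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def solve(G, b):
    """Solve the small square system G x = b by Gaussian elimination with pivoting."""
    n = len(b)
    A = [list(row) + [bi] for row, bi in zip(G, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [ar - factor * ac for ar, ac in zip(A[r], A[col])]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def omp(y, atoms, sparsity):
    """Orthogonal matching pursuit: greedy selection plus a joint least-squares refit."""
    residual, support, coeffs = list(y), [], []
    for _ in range(sparsity):
        scores = [abs(inner(a, residual)) / math.sqrt(inner(a, a).real) for a in atoms]
        for i in support:
            scores[i] = -1.0                      # never reselect an atom
        support.append(max(range(len(atoms)), key=scores.__getitem__))
        chosen = [atoms[i] for i in support]
        gram = [[inner(u, v) for v in chosen] for u in chosen]
        coeffs = solve(gram, [inner(u, y) for u in chosen])
        residual = [yi - sum(c * a[m] for c, a in zip(coeffs, chosen))
                    for m, yi in enumerate(y)]
    return support, coeffs, residual

random.seed(2)
F = [[cmath.exp(2j * math.pi * random.random()) for _ in range(N)] for _ in range(M)]
polar_grid = [(d, math.radians(-60 + 8 * i)) for d in (0.5, 1.0, 2.0, 4.0)
              for i in range(16)]
columns = [steering(d, th) for d, th in polar_grid]        # stand-in dictionary columns
atoms = [[sum(f * c for f, c in zip(row, col)) for row in F] for col in columns]

true_entries = {10: 1.0 + 0j, 41: 0.5j}                    # assumed sparse channel
y = [math.sqrt(P) * sum(g * atoms[i][m] for i, g in true_entries.items())
     for m in range(M)]

support, coeffs, residual = omp(y, atoms, sparsity=2)
# Counterpart of (112): map the sparse estimate back to the antenna domain.
h_hat = [sum(c / math.sqrt(P) * columns[i][n] for c, i in zip(coeffs, support))
         for n in range(N)]
```

The least-squares refit makes the residual orthogonal to every selected atom, which is what distinguishes OMP from plain matching pursuit. Support recovery is guaranteed here only in the single-entry case; with several entries and correlated atoms the selected indices may differ from the true ones.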

3) Multiuser Pilot Transmission Based on Sparsity:

The nonoverlapping sparsity of different users’ antenna-domain channels enables the simultaneous transmission of pilots from or to these users. A common pilot sequence can be shared among users that have nonoverlapping VRs, and orthogonal pilot sequences are assigned to users with overlapping VRs. Due to the limited number of orthogonal pilot sequences, the nonoverlapping sparsity among users creates potential for the reduction of the overall training time. By knowing the VR of user $k$ , i.e., $\Phi _{{\textrm {UA}},k}$ , the BS directly transmits or receives the pilot of user $k$ through $\Phi _{{\textrm {UA}},k}$ . For instance, suppose $\Phi _{{\textrm {UA}},1},\ldots,\Phi _{{\textrm {UA}},B}$ cover subarrays $1,\ldots,B$ , respectively, and do not overlap with each other, while $\Phi _{{\textrm {UA}},B+1}$ covers the entire array. Then, pilot sequences ${\textbf {x}}_{1}$ and ${\textbf {x}}_{2}$ are assigned to users $1,\ldots,B$ and user $B+1$ , respectively. In the uplink, the received pilots at the BS from all users can be expressed as follows: $\begin{equation*} {\textbf {Y}} = \sqrt {P}{\textbf {F}}_{{\textrm {RF}}}\left ({\sum _{b=1}^{B} {\textbf {h}}_{b} {\textbf {x}}_{1}^{H} + {\textbf {h}}_{B+1} {\textbf {x}}_{2}^{H}}\right) + {\textbf {F}}_{{\textrm {RF}}}{\textbf {N}}.\tag{113}\end{equation*}$ By multiplying ${\textbf {Y}}$ with ${\textbf {x}}_{1}$ , the pilots from users $1,\ldots,B$ are extracted: $\begin{equation*} {\textbf {y}} = \sqrt {P} {\textbf {F}}_{{\textrm {RF}}} \sum _{b=1}^{B} {\textbf {h}}_{b} + {\textbf {n}}\tag{114}\end{equation*}$ where ${\textbf {y}}=[y_{1},\ldots,y_{B}]^{T}$ contains the received pilot on each subarray, and ${\textbf {n}}=[n_{1},\ldots,n_{B}]^{T}={\textbf {F}}_{{\textrm {RF}}}{\textbf {N}}{\textbf {x}}_{1}$ . 
By further recalling (88) and (104), we can rewrite $y_{b}$ as follows: $\begin{equation*} y_{b} = \sqrt {P}{\textbf {f}}_{{\textrm {RF}},b}^{T} {\textbf {h}}_{b,VR} + n_{b}\tag{115}\end{equation*}$ which involves only the channel of user $b$ . That is to say, only two instead of $B+1$ orthogonal pilot sequences are required for multiuser training without introducing interference among them. The nonoverlapping sparsity in the antenna domain has been utilized in [45], where the overhead for random access was greatly reduced and the efficiency was enhanced.
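
The pilot-sharing argument in (113) to (115) can be checked on a toy example. The sketch below is illustrative only: the subarray sizes, channel values, and RF weights are made up, the RF matrix is assumed to connect RF chain $b$ to subarray $b$ only, and noise is omitted.

```python
import math

P = 4.0
B, SUB = 3, 2                    # B subarrays of SUB antennas each (assumed sizes)
N = B * SUB

# Users 1..B are each visible on one subarray only; user B+1 sees the whole array.
h_vr = [[1 + 1j, 2 - 1j], [0.5j, -1 + 0j], [3 + 0j, 1 - 2j]]    # h_{b,VR}
h = []
for b in range(B):
    full = [0j] * N
    full[b * SUB:(b + 1) * SUB] = h_vr[b]
    h.append(full)
h_wide = [1 - 1j, 2j, -1 + 0j, 0.5 + 0j, 1j, 2 + 2j]            # h_{B+1}

s = 1 / math.sqrt(2)
x1, x2 = [s, s], [s, -s]         # two orthonormal real pilots of length Q = 2

# Assumed RF matrix: RF chain b combines only subarray b with weights f_b.
f = [[1, 1j], [1, -1], [1j, 1]]
F_rf = []
for b in range(B):
    row = [0j] * N
    row[b * SUB:(b + 1) * SUB] = f[b]
    F_rf.append(row)

# Noiseless (113); the pilots are real, so x^H reduces to the transpose.
mix = [[sum(hb[n] for hb in h) * x1[q] + h_wide[n] * x2[q] for q in range(2)]
       for n in range(N)]
Y = [[math.sqrt(P) * sum(F_rf[b][n] * mix[n][q] for n in range(N)) for q in range(2)]
     for b in range(B)]

# (114): multiply by x1. (115): entry b then depends on user b alone.
y = [sum(Y[b][q] * x1[q] for q in range(2)) for b in range(B)]
y_expected = [math.sqrt(P) * sum(fi * hi for fi, hi in zip(f[b], h_vr[b]))
              for b in range(B)]
```

Here `y` matches `y_expected` up to rounding, so the $B=3$ users sharing ${\textbf {x}}_{1}$ do not interfere with each other or with user $B+1$, and two pilot sequences suffice instead of four.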

In extra large-aperture RIS-assisted systems, directionality and channel sparsity still hold in the angular, Cartesian, polar, and RIS unit domains at the RIS side. Therefore, the low-cost designs are also applicable in RIS-assisted systems. Notably, when applying the multiuser pilot transmission scheme, the RIS should be equipped with signal reception capabilities.

SECTION VII.

Conclusion

We investigated the new channel properties of spatial nonstationarity, including the spherical wave propagation and the VR, and surveyed existing works in the context of hardware cost, processing and computation complexity, and training overhead for extra large-scale MIMO systems. We also studied the origins of spatial nonstationarity and illustrated the modifications of channel modeling when spatial nonstationarity was considered. This new property paves the way for low-cost hardware architectures. Through a detailed comparison, we proposed a double-layer architecture and the RIS as the most promising implementation architectures of an extra large-aperture array. Then, the complexity reduction problem was investigated and the distributed solution with one CPU and multiple LPUs demonstrated the most promising potential. Finally, the low-overhead communication and sensing strategies were investigated, which can be realized given the directionality and sparsity of the channel in the Cartesian, polar, and antenna domains. In summary, this article reviewed the early stage research efforts of extra large-scale MIMO, and highlighted the importance of low-cost designs in future practical implementations.

