On Tracking the Physicality of Wi-Fi: A Subspace Approach

FIGURE 2.

Good wireless SNR does not necessarily translate into good sensing sensitivity. For sensing, there is more to designating signal and noise subspaces than meets the eye. (a) Strong reflection. (b) Weak reflection.

For each transmitter-receiver pair, the superposition of multipaths in the time domain is described by an $N_{sc}$-dimensional frequency-domain CSI $H$ corresponding to a sampling of OFDM subcarriers across the bandwidth.²

As such, the transmitted signal $X$ can be related to the received signal $Y$ through this input-output channel response relationship according to $Y = H X$ . A MIMO system generalizes this input-output relationship for $N_{tx}$ transmitters and $N_{rx}$ receivers. For instance, if we have 3 transmitters and 3 receivers, the channel is described as a $3 \times 3 \times 30$ tensor.
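As a minimal sketch of this input-output relationship, the NumPy snippet below applies $Y = HX$ independently on each reported subcarrier of a $3 \times 3 \times 30$ CSI tensor. The axis ordering (receiver, transmitter, subcarrier), the variable names, and the random data are illustrative assumptions; they are not prescribed by the text.

```python
import numpy as np

# Assumed axis order (not stated in the text): receiver, transmitter, subcarrier.
N_RX, N_TX, N_SC = 3, 3, 30

rng = np.random.default_rng(0)

# Synthetic complex CSI tensor standing in for a measured channel.
H = rng.standard_normal((N_RX, N_TX, N_SC)) + 1j * rng.standard_normal((N_RX, N_TX, N_SC))

# One transmitted symbol per transmit antenna and subcarrier.
X = rng.standard_normal((N_TX, N_SC)) + 1j * rng.standard_normal((N_TX, N_SC))

# Y = H X evaluated separately on every subcarrier s:
# Y[r, s] = sum over t of H[r, t, s] * X[t, s]
Y = np.einsum("rts,ts->rs", H, X)
```

Each slice `H[:, :, s]` is then an ordinary $3 \times 3$ MIMO channel matrix for subcarrier `s`.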

We ask some basic questions:

How can we characterize the human modulated subspace of the channel?
How do the dimensionality and direction of the subspace vary in time as a result of human movement?

We take a first step towards providing a formal treatment of these key questions, and present a semi-analytical analysis of the projected signal subspace.

We discuss the theoretical underpinnings of our approach in Section III-B, particularly with a view towards contrasting it with seminal prior work in Wi-Fi sensing. We then study data on uncontrolled human movement in Section III-D.

B. Background on Subspace Tracking for Wireless Signals

In classic signal processing, estimating the relevant subspace of variation in data is a basic building block of a data processing pipeline [15], [16]. In the context of an indoor wireless channel, the human modulated portion of the correlation data (cf. Equation (3)) is unknown with complex temporal dynamics.

Wang et al. [5] obtain good sensing results using an ad hoc pipeline starting with the full wideband covariance matrix (see [6]). We believe that this choice necessitates the use of excessive time-averaging of the CSI data. Furthermore, the resulting signal subspace is not easy to interpret. In contrast, the Rx, Tx and Dy correlations defined in Equation (3) are interpretable low dimensional representations. Despite the pioneering sensing approach, two drawbacks come to mind:

the spatial and temporal behavior of the channel are not easily exposed, and
the temporally highly averaged subspaces are less reactive to human activities.

The good sensing results aside, the approach of [5] does not conform to wireless theory, according to which human modulation should be quantifiable using subspace tracking. Correlative MIMO subspace-based channel models have been shown to estimate capacity [6]–[8], and therefore the physicality of the medium. Intuitively, a model able to conform with a universal information-theoretic measure such as capacity is bound to convey fundamental information about the state of the channel irrespective of what modulates the channel. Further, recent theoretical results suggest that the rate of change of a MIMO OFDM channel can be inferred from the statistical analysis of its first and last eigenvectors [9], which can be viewed as canonical representatives of the signal and noise subspaces, respectively.

To elaborate on the dynamic nature of the signal subspace, consider a multipath component whose phase adds destructively to a main cluster of multipath, as depicted in Figure 2a. If the single multipath were to be shadowed as a result of a transient movement as in Figure 2b, it is clear that SNR would increase momentarily, commensurate with the gain in the total energy of the multipath arrivals. However, the sensing scene could have further nuances that are not captured by this simple SNR enhancement. As a further thought experiment, let the single multipath component probe a spatial sector in the environment in which a physical activity is unfolding (denoted by a spiral in Figure 2). That is, the single multipath component disproportionately delivers added movement sensitivity over that delivered by the main cluster of multipath. Despite the transient shadowing effect resulting in a boost in SNR, the instantaneous combined channel response is rendered less sensitive to activities occurring in the aforementioned spatial sector. The reduced motion modulation is manifested in reduced correlation structure in regions of the covariance matrix. Consequently, and perhaps counter-intuitively given the SNR gain, the signal subspace would necessarily “shrink” and the noise subspace would “expand” momentarily. Therefore, robust sensing requires that the signal and noise subspaces be tracked explicitly in order to account for nuanced instantaneous channel effects.

The above contrived discussion suggests that a sensing system is required to adapt to dynamic channel effects in order to sustain optimal performance. Until provision for such adaptation is made in CSI-based sensing systems, we argue that models will fall short at being generalizable with guaranteed performance bounds irrespective of the nuances encountered in real-world deployment environments.

Stationarity Period

The evolution of the signal subspace can be monitored at different granularities depending on the end-user application. An example of this scenario may be seen in activity recognition applications. Activity recognition requires deriving channel signatures of sufficient discriminatory power to allow for the unambiguous separation of activities potentially similar in their broad nature, e.g. walking versus running. Besides the application, the stationarity period is also affected by the sensor configuration, such as the sampling rate.

For example, while 25 ms may be necessary for responsive activity recognition applications, 100 ms or more may suffice for the much coarser presence detection. Note that sensing models may also be possible to realize even with “aliased” channel statistics akin to compressive sensing. However, we will not discuss this further here.

C. Sensing Complexity

The trade-off between sensing sensitivity and generality is a key question when it comes to designing any data processing pipeline. Generality implies flexibility for applying techniques from one sensing application to another. Sensitivity refers to optimality for a fixed sensing task. These are affected mainly by

sensing pipeline configuration alongside its parameters, and
the dimensionality of the signal subspace of the data, as it travels through the pipeline.

The latter is of particular importance because the size of the signal subspace allows for a controlled grading of sensing sophistication from the simplest (i.e. a one-dimensional subspace) to the most general (i.e. the entire signal subspace). The simplest extreme is particularly useful when out-of-the-box flexibility and ease of realization are desirable. When optimal performance and sensitivity are required, more elaborate and intricate sensing models can be used on a larger portion of the signal subspace.

We next shed light on the complexity of the human-modulated Wi-Fi signal subspace by way of an empirical study. The aim is to establish that there is more to designating signal and noise subspaces than meets the eye. Future research ought to take this complexity into consideration if Wi-Fi sensing is to be transitioned from controlled setups into the wild.

D. Empirical Study of Uncontrolled Human Movement

We proceed to study the statistical effects of human activities on the channel covariances. Specifically, we study the projected signal subspaces, our putative proxies for the signal subspace for human modulation. To this end, we first quantify the information about the physical environment contained in the covariance matrix. This information is dynamic in nature and needs to be quantified instantaneously. One approach to gauging the information content in a series of covariances is to monitor the distortion contributed by the constituent eigenvectors. That is, by successively nulling the respective eigenvectors and measuring the fidelity of the covariance matrix reconstruction, we can quantify, in the mean squared error (MSE) sense, the signal and noise boundaries at a given target distortion level (e.g. −12 dB).

Concretely, let $\mathbf {R} = \mathbf {U} \boldsymbol {\Delta } \mathbf {U}^{H}$ be the eigendecomposition of one of the channel correlation matrices in equation (4). Define $\mathbf {R}_{i}^\prime = \mathbf {U} \boldsymbol {\Delta }_{i}^\prime \mathbf {U}^{H}$ as a reconstructed channel correlation matrix whose modified diagonal eigenvalue matrix $\boldsymbol {\Delta }_{i}^\prime$ nullifies all diagonal entries beyond index $i$, i.e. $\begin{equation*} \boldsymbol {\Delta }^{\prime }_{i} = \begin{bmatrix} \ddots & \quad 0 & \quad 0 & \quad \ldots & \quad \ldots & \quad 0 \\ 0 & \quad \delta _{i-1} & \quad 0 & \quad 0 & \quad \ldots & \quad 0 \\ 0 & \quad 0 & \quad \delta _{i} & \quad \ddots & \quad & \quad \vdots \\ \vdots & \quad & \quad \ddots & \quad 0 & \quad \ddots & \quad \vdots \\ \vdots & \quad & \quad & \quad \ddots & \quad \ddots & \quad \vdots \\ 0 & \quad 0 & \quad \ldots & \quad \ldots & \quad \ldots & \quad 0 \end{bmatrix}\tag{10}\end{equation*}$ The reconstruction error matrix is $\mathbf {R}^\epsilon = \mathbf {R} - \mathbf {R}_{i}^\prime = {[r_{kl}^\epsilon]}$ . The reconstruction MSE error can then be described as $\begin{equation*} \mathrm {MSE} = \frac {1}{N^{2}} {\sum _{k,l} (r^{\epsilon }_{kl})^{2}}\tag{11}\end{equation*}$
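The truncated reconstruction and its MSE can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation. The function names are hypothetical and the eigenvalues are assumed to be sorted in descending order. For complex-valued correlation entries the squared magnitude $|r^{\epsilon}_{kl}|^{2}$ is used. The dB normalisation (MSE relative to the mean squared entry of $\mathbf{R}$) is also an assumption, since the reference level is not stated here.

```python
import numpy as np

def sorted_eigh(R):
    """Eigendecomposition of a Hermitian matrix, eigenvalues in descending order."""
    w, U = np.linalg.eigh(R)            # eigh returns eigenvalues in ascending order
    order = np.argsort(w)[::-1]
    return w[order], U[:, order]

def truncated_reconstruction(R, i):
    """R'_i: keep eigenvalues with index 0..i (inclusive) and null the rest."""
    w, U = sorted_eigh(R)
    w_trunc = np.where(np.arange(w.size) <= i, w, 0.0)
    return (U * w_trunc) @ U.conj().T   # U diag(w_trunc) U^H

def reconstruction_mse(R, i):
    """Mean squared magnitude of the entries of R - R'_i (cf. Equation (11))."""
    err = R - truncated_reconstruction(R, i)
    return float(np.mean(np.abs(err) ** 2))

def reconstruction_mse_db(R, i):
    """Assumed normalisation: MSE relative to the mean squared entry of R, in dB."""
    ref = np.mean(np.abs(R) ** 2)
    return 10.0 * np.log10(reconstruction_mse(R, i) / ref)

# Synthetic Hermitian correlation matrix standing in for a measured one.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 40)) + 1j * rng.standard_normal((6, 40))
R = A @ A.conj().T / A.shape[1]

# Distortion as progressively more eigenvectors are retained.
mse_per_index = [reconstruction_mse(R, i) for i in range(R.shape[0])]
```

Because the eigenvectors are orthonormal, the error after keeping indices `0..i` depends only on the discarded eigenvalues, so `mse_per_index` is non-increasing in `i`.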

The above MSE search allows us to build a time-series picture of the dynamic partitioning of the covariance into signal and noise subspaces. This evolution of signal and noise subspaces is indicative of the evolution in the corresponding propagation conditions and therefore, necessarily, of human movement. Intuitively, the harsher the dynamics of wireless propagation conditions, the more the boundary between signal and noise subspaces fluctuates.

Having arrived at a statistical picture of the subspaces boundary, we can use this knowledge to examine how the fractional signal subspace energy changes throughout human movement. We define the fractional signal subspace energy as the ratio of the energy in the signal subspace to the total energy contained in the channel. Thus, the fractional energy can be written as $E_{s} = \mathop {\mathrm {Tr}}\nolimits (\boldsymbol {\Delta }^{s}_{\text {x}})/ \mathop {\mathrm {Tr}}\nolimits (\boldsymbol {\Delta }_{\text {x}})$ , where $\mathop {\mathrm {Tr}}\nolimits$ is the trace operator, $\boldsymbol {\Delta }$ is the diagonal eigenvalue matrix (cf. Equation (9)), and $\text {x} \in [\text {Rx}, \text {Tx}, \text {Dy}]$ . As such, $E_{s}$ conveys information about optimum sensing SNR dynamics. A parsimonious, suboptimal sensing system that instantaneously uses less of the available $E_{s}[k]$ at the $k$ th time is effectively throwing away information.
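As a small worked illustration of $E_{s}$, the sketch below computes the ratio of the leading eigenvalues to the trace for a Hermitian correlation matrix. The helper name `fractional_signal_energy` and the convention that the signal subspace consists of the `n_signal` largest eigenvalues are assumptions made for this example.

```python
import numpy as np

def fractional_signal_energy(R, n_signal):
    """E_s = Tr(Delta^s) / Tr(Delta), taking the n_signal largest eigenvalues as signal."""
    w = np.linalg.eigvalsh(R)[::-1]     # eigenvalues, descending
    return float(np.sum(w[:n_signal]) / np.sum(w))

# Toy correlation matrix with eigenvalues 4, 3, 2, 1.
R_toy = np.diag([4.0, 3.0, 2.0, 1.0])
E_s = fractional_signal_energy(R_toy, n_signal=2)   # (4 + 3) / (4 + 3 + 2 + 1)
```

In this toy case a two-dimensional signal subspace captures seven tenths of the total energy.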

The following discourse considers uncontrolled indoor human movement. This is perhaps the most generic form of activities likely to occur indoors. Naturally, uncoordinated motion components superimpose to modulate the signal subspace in random ways. Stronger motion components could also mask much weaker ones.

We begin by examining what effect increased human movement has on the signal and noise subspaces. We conduct an experiment in which participants were asked to walk randomly in a room. The number of moving people present was varied from 0 (i.e. empty) to 8. The duration of movement per session was 5 to 10 minutes. An 802.11n $3\times 3$ MIMO transmitter node was placed outside the room and a receiver node was placed inside. The CSI was sampled at a nominal sampling rate of 500 Hz using a 5 GHz carrier and 40 MHz channel bandwidth. The reported CSI is 30-dimensional for each transmitter-receiver pair, sampling the available 40 MHz bandwidth coarsely but equidistantly. That is, 1-in-4 OFDM subcarriers are reported, resulting in a measured MIMO CSI $3 \times 3 \times 30$ tensor.

We investigate the effect of increased human movement on signal and noise subspaces by searching for the subspaces boundary yielding an objective MSE distortion as outlined earlier. Eigenvectors contributing less to the fidelity of covariance reconstruction will fall within the noise subspace. Conversely, eigenvectors impacting the fidelity of reconstruction more pronouncedly belong to the signal subspace. The MSE-guided search finds the subspaces boundary that satisfies a desired distortion level in the MSE sense. Owing to the finite subspace resolution of a practical system, we interpolate between two MSE distortion levels produced at adjacent eigenvectors in order to simulate the effect of a smoothly varying MSE distortion and its respective “fractional” subspace index.
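The interpolation step can be sketched as below. The text does not specify the interpolation rule, so linear interpolation in dB between the two adjacent indices is an assumption, as are the function name `fractional_boundary` and the convention that `distortion_db[i]` is the distortion after keeping eigenvectors `0..i`.

```python
def fractional_boundary(distortion_db, target_db):
    """Fractional subspace index at which the distortion first meets target_db.

    distortion_db[i] is the reconstruction distortion (dB) after keeping
    eigenvectors 0..i; it is assumed to be non-increasing in i.
    Returns None if the target is never reached.
    """
    for i, d in enumerate(distortion_db):
        if d <= target_db:
            if i == 0:
                return 0.0
            prev = distortion_db[i - 1]
            # Assumed rule: linear interpolation between indices i-1 and i.
            return (i - 1) + (prev - target_db) / (prev - d)
    return None

# Distortion falls from -9 dB (index 1) to -15 dB (index 2); the -12 dB
# target lies halfway between, so the boundary falls between indices 1 and 2.
boundary = fractional_boundary([-3.0, -9.0, -15.0], target_db=-12.0)
```

A fractional index of this kind lets the boundary vary smoothly over time even though the underlying eigenvector count is an integer.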

Figure 3a illustrates the variability in the fractional subspace energy $E_{s}$ within the signal subspace extent and across movement scenarios (as denoted by the vertical scatter points). It is evident that the variability increases towards the lower end of the subspace extent, reflecting the poor SNR contributed. Further, the variability increases markedly with the number of moving people, i.e. fractional energy is more diffused in higher occupancy classes. Figure 3c shows the result of the MSE search procedure on the demarcation of the boundary between the signal and noise subspaces. Note, however, the statistical variability corroborating the earlier hypothesis; namely, that dynamic stresses on the wireless channel would result in equivalent shrinkage or expansion of the signal subspace as needed to satisfy the target reconstruction distortion level. Similar dynamic subspace behavior can be seen in figure 3b when the objective MSE distortion is raised from −12 dB to −6 dB, i.e. the dynamic behavior of the subspaces boundary demarcation is insensitive to the chosen MSE level. It is further interesting to observe the accompanying effects in figure 3d on the fractional energy at the very same instantaneous demarcations of the signal and noise boundary established by the MSE search. The fractional energy at the true³ instantaneous subspaces boundary is unable to provide a faithful statistical account of the expansion/shrinkage of the signal subspace, at least for scenarios 2, 4, and 6, as evident from their density overlap. That is, the fractional energy cannot be called upon to optimally partition the covariance matrix.

FIGURE 3.

Characterizing signal and noise subspaces through an MSE search procedure for 5 uncontrolled human movement scenarios. The numbers in legend 0, 2, 4, 6, and 8 denote how many moving people are present. (a) Fractional energy $E_{s}$ evaluated across the signal dimensionality. (b) MSE-based subspace boundary at −6 dB reconstruction objective. (c) MSE-based subspace boundary at −12 dB reconstruction objective. (d) Fractional energy at subspace boundary.

We conclude this section by qualifying our MSE-search methodology using mutual information (MI). The instantaneous subspaces boundary is used to agglomerate a series of reconstructions of the covariance matrix so as to compare against the ground-truth covariance distribution. We sweep the objective MSE distortion between −24 dB and −3 dB in 3 dB increments. We then measure the normalized mutual information between $V_{\text {Dy}}[k]$ and $R_{\text {Dy}}[k]$ for different occupancy cases as illustrated in figure 4a. Figure 4b shows that, for all human presence scenarios, the normalized MI at the instantaneous subspaces boundary steadily approaches unity as MSE reconstruction fidelity increases towards −24 dB. We observe that, in terms of mutual information, our MSE-reconstruction based methodology is consistent under different channel conditions. In order to corroborate this observation, we compute the same normalized mutual information metric for the static (i.e. truncated) subspace extent across occupancy cases. Figure 4c depicts such MI between $R_{\text {Dy}}[k]$ on the one hand, and $V_{0} \subset V_{1} \subset \cdots \subset V = \mathbb {C}^{N}$ on the other hand.⁴ A “waterfall” effect can be seen whereby a larger truncated static subspace is needed at higher occupancy classes in order for MI to approach unity. Such an MI waterfall effect is equivalent to the MSE-based subspace boundary shown earlier in figures 3b & 3c, reaffirming the notion of instantaneous subspace expansion and shrinkage as a function of the intensity of human movements.

FIGURE 4.

Normalized mutual information between covariances and their imperfect reconstructions over time and across 5 uncontrolled human presence scenarios, highlighting that the signal subspace is dynamic in nature. Legend denotes how many moving people are present. (a) Instantaneous subspace boundary. (b) Normalized MI versus MSE reconstruction objective. (c) Normalized MI versus subspace extent.

SECTION IV.

Subspace Tracking

In Section III, we established and characterised the notions of signal and noise subspaces within the context of human-induced channel perturbations. We now turn to examples of how to derive features for sensing tasks. Our approach is to track the evolution of the projected signal subspaces (cf. Section II-D).

The subspace-based human sensing we advocate is in line with foundational work in wireless channels [6]–[9], which is in contrast to prior work on Wi-Fi sensing (see [5]). We show that, with a good enough instantaneous estimate of the covariances described in Section II, this tracking can be used to capture the effects of human modulation. We present our analysis of the Dy-projected signal subspace, but the same can be easily repeated for the Rx dimension.

A. A Geometric View of Subspace Evolution

As an example of subspace tracking, we present the trajectories of the eigenvectors of the covariance matrices (cf. Equation (3)).

Consider the time evolution of subspaces spanned by the first two (unnormalized) eigenvectors $\delta _{0}[k]$ and $\delta _{1}[k]$ of $\mathbf {R}_{\text {Dy}}[k]$ . Let $\mathbb {S}_{0}$ and $\mathbb {S}_{1}$ be subspaces spanned by $\delta _{0}[k]$ and $\delta _{1}[k]$ for $k=1,2,3,\ldots$ —depicted in local coordinate systems—respectively. See figure 5 for a geometric interpretation.

FIGURE 5.

Geometric interpretation of subspace evolution. (a) Subspace component 0. (b) Subspace component 1.

That is, recalling equation (9), each of these subspace components at the $k$ th discrete time would correspond to (1) an $i$ th eigenvector $\mathbf {u}_{{\text {Dy}},i}[k] \in \mathbf {U}^{s}_{\text {Dy}}[k]$ (i.e. belonging to the signal subspace), and (2) a scaling eigenvalue $\delta _{i}[k] \in \hat {\boldsymbol {\Delta }}^{s}_{\text {Dy}}[k]$ . The empirical signal and noise characterization study reported in Section III concluded that the fractional energy $E_{s}$ evaluated at the subspaces boundary is less able to reveal increased multi-user channel variations. That is, when considering the movement of the signal subspace as a result of human-induced channel stresses, less stock should be put in the eigenvalues $\delta _{i}$ . This is also intuitive to communications practitioners because phase modulation, when combined with amplitude modulation, is what really allows for packing more information efficiently within a finite stretch of bandwidth. The unreliability of power (i.e. eigenvalues) has also been echoed in prior art; namely, that “wireless internal state transitions result in high amplitude impulse and burst noises in CSI streams” [5]. As an example of this noisy state transition, note the bimodal nature of the 0 occupancy density of the fractional energy $E_{s}$ in figure 3d, as indicated by the transparent underlay behind the 8 occupancy case density. Another example is recent work on channel charting in the context of urban CSI measurements from basestations, wherein Studer et al. [17] propose CSI scaling as part of their feature mapping procedure.

Therefore, referring to figure 5 again, a critical insight emerges: human effects on the wireless channel can be “demodulated” by observing the corresponding angular movements of the signal subspace.

B. Differential Subspace Evolution

The time dependency of the angular movements of the subspace is visualized in figure 5. The (complex) angles, which can be computed from the real part of the Hermitian inner product, $\psi [{1}] = \angle (\mathbf {u}_{\text {Dy},i}[{0}], \mathbf {u}_{{\text {Dy}},i}[{1}])$ , ${\dots }$ , and $\psi [{3}] = \angle (\mathbf {u}_{{\text {Dy}},i}[{2}], \mathbf {u}_{{\text {Dy}},i}[{3}])$ are depicted for both subspace components 0 and 1.

These angles signify the differential movement of a certain signal subspace component between the $k-1$ and $k$ discrete times. Incidentally, these angles also have another interpretation. Note that the diagonalization of the covariance matrix of equation (9) will produce eigenvectors which are by construction unitary (i.e. of unit norm), so that $\mathbf {u}_{\text {Dy},i}^{H} \,\,\mathbf {u}_{\text {Dy},i} = 1 = \cos (0)$ . However, a human movement will cause the channel’s signal subspace to evolve out of its “rest” condition. The resultant deviation in the subspace will be manifested in an equivalent deviation in the unitarity of its constituent, evolved eigenvectors w.r.t. their original “rest” conditions. Thus, the successive change in unitarity for the $i$ th subspace component between time $k-1$ and $k$ is quantified by $\mathbf {u}^{H}_{\text {Dy},i}[k] \,\,\mathbf {u}_{\text {Dy},i}[k-1] = \cos (\psi [k])$ , which coincides with the angular movement of the subspace. Hence we term this angular metric the differential unitarity.

In general, our proposed differential unitarity feature for tracking human-modulated signal subspaces is applicable to any channel eigendecomposition formulation commonly encountered in the literature. Denote by $\mathbf {u}_{\text {Dy},i}[k]$ the $i$ th delayspread eigenvector at time $k$ . Then the differential unitarity $\hat {u}_{\text {Dy},i}[k] = \cos (\psi _{\text {Dy},i}[k])$ between time $k$ and $k-1$ is formulated as $\begin{equation*} \hat {u}_{\text {Dy},i}[k] = \mathbf {u}_{\text {Dy},i}^{H}[k] ~\mathbf {u}_{\text {Dy},i}[k-1] \tag{12}\end{equation*}$

Similarly, for the receive-side eigenbasis $\begin{equation*} \hat {u}_{\text {Rx},i}[k] = \mathbf {u}_{\text {Rx},i}^{H}[k] ~\mathbf {u}_{\text {Rx},i}[k-1] \tag{13}\end{equation*}$

Equations (12) & (13) represent two degrees of freedom through which we can measure the volatility in the wireless channel as a result of human stressors: (1) spatial from multiple antennae and (2) temporal across the delayspread (or equivalently bandwidth). We next build intuition for these complementary differential unitarity metrics by presenting a series of concrete numerical examples.
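A minimal NumPy sketch of Equations (12) & (13) is given below. The function names are hypothetical, and eigenvectors are assumed to be ordered by descending eigenvalue so that index 0 is the dominant component. Note that an eigenvector is only defined up to a unit-modulus scalar, so the magnitude of the inner product is independent of the eigen-solver's phase convention, whereas its phase is not.

```python
import numpy as np

def eigvec(R, i=0):
    """i-th eigenvector of a Hermitian matrix, ordered by descending eigenvalue."""
    w, U = np.linalg.eigh(R)
    return U[:, np.argsort(w)[::-1][i]]

def differential_unitarity(R_now, R_prev, i=0):
    """u_i^H[k] u_i[k-1] for correlation matrices at times k and k-1."""
    u_now = eigvec(R_now, i)
    u_prev = eigvec(R_prev, i)
    return np.vdot(u_now, u_prev)       # vdot conjugates its first argument

# Toy example: an unchanged channel versus one whose dominant direction flips.
R_a = np.diag([3.0, 1.0]).astype(complex)
R_b = np.diag([1.0, 3.0]).astype(complex)

static_case = abs(differential_unitarity(R_a, R_a))    # dominant eigenvector unchanged
flipped_case = abs(differential_unitarity(R_b, R_a))   # dominant eigenvectors orthogonal
```

The same function applies to either the Rx or the Dy correlation matrix; only the input changes.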

We return to the uncontrolled movement dataset reported in Section III. Let us examine the behavior of the differential unitarity for the 1st subspace component of both the receive-side and delayspread subspaces, i.e. $\hat {u}_{\text {Rx},1}$ and $\hat {u}_{\text {Dy},1}$ , respectively. Figure 6 plots the cumulative distribution functions (CDFs) for $\hat {u}_{\text {Rx},1}$ and $\hat {u}_{\text {Dy},1}$ for all 9 occupancy cases. Specifically, note the dispersive nature of the metric in figures 6a & 6c as a function of increased human-induced channel perturbations. It is clear that the dispersion in the statistics of the magnitude of differential unitarity (corresponding to the 1st subspace components) generally increases with increased human movement.

FIGURE 6.

Differential unitarity cumulative distribution functions across 9 occupancy scenarios for Rx and Dy. Legend denotes how many moving people are present. (a) and (c) correspond to the successive pairwise correlations, while (b) and (d) measure the rate of change. The rate of change is referred to by the $^\prime$ operator.

The diagram in figure 7b shows that the subspace bases relating to $N$ time periods are buffered so that comparison can be made across a wider time window. Thus, for example, the eigenbases at $t=0$ can be compared with those at $t=N-1$ , the eigenbases at $t=1$ can be compared with the eigenbases at $t=N-2$ and the eigenbases at $t=2$ can be compared with the eigenbases at $t=N-3$ . Such an arrangement may enable the changes in channel statistics to be viewed across a wider time period and may enable the rate of change of eigenvector unitarity to be determined.
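The buffered pairing can be sketched as follows. Only the pairing of indices (0 with N−1, 1 with N−2, and so on) comes from the description above. Reducing the resulting correlations to a single rate-of-change value by fitting a least-squares line against the time separation is an assumption made purely for illustration, as are the function names.

```python
import numpy as np

def slope_pairs(n):
    """Index pairs (0, n-1), (1, n-2), ... with successively decreasing separation."""
    return [(a, n - 1 - a) for a in range(n // 2)]

def slope_feature(U_buffer):
    """U_buffer: sequence of n >= 4 unit-norm eigenvectors, oldest first.

    Returns the time separations, the correlation magnitudes of the paired
    eigenvectors, and an assumed least-squares slope of magnitude vs. separation.
    """
    pairs = slope_pairs(len(U_buffer))
    sep = np.array([b - a for a, b in pairs], dtype=float)
    corr = np.array([abs(np.vdot(U_buffer[b], U_buffer[a])) for a, b in pairs])
    slope = np.polyfit(sep, corr, 1)[0]   # assumption: linear fit as rate of change
    return sep, corr, slope

pairs_for_six = slope_pairs(6)            # pairs for a buffer of N = 6 time periods
```

For a perfectly static channel every paired correlation magnitude equals one, so the fitted slope is zero; movement that decorrelates distant eigenvectors faster than nearby ones yields a non-zero slope.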

FIGURE 7.

Interrogating rate of change of differential unitarity by correlating farther apart eigenvectors with successively decreasing distance. (a) Pairwise. (b) Slope.

Deriving a measure of the rate of change of differential unitarity has the advantage of increasing the separation of the CDFs of figures 6a & 6c. To this end, we apply the scheme depicted in the lower diagram of figure 7 (termed “slope”) to the same experiment and obtain the CDFs shown in figures 6b & 6d. It is readily evident that the CDFs corresponding to the rate of change in the differential unitarity extracted over a window of time experience increased dispersion as a result of human occupancy. This may allow for learning and/or calibrating better discrimination boundaries in the inference logic.

1) Subspace Sampling

Recalling equation (8), we note that the expectation operator implies an averaging effect. Earlier we have elaborated on the notion of stationarity period and its connection to CSI sampling and application granularity requirements. Yet another pertinent aspect for consideration lies in how to realize the expectation. Broadly, there are two methods often employed in classic signal processing literature for updating the signal subspace: (i) stochastic approximation, and (ii) batch averaging. These two variants have implications on signal subspace tracking, which we discuss next.

An (asymptotically) unbiased stochastic expectation estimator is given by $\begin{equation*} \mathbf {R}_{\text {x}}[k] = (1-\lambda) \sum _{n=0}^{k} \lambda ^{k-n} \mathcal {H}_{(m)}[n] \mathcal {H}^{H}_{(m)}[n] \tag{14}\end{equation*}$ where $\text {x} \in [\text {Rx}, \text {Tx}, \text {Dy}]$ and $m \in [{1, 2, 3}]$ , respectively. This estimator reduces to the recursive expression $\begin{equation*} \mathbf {R}_{\text {x}}[k] = \lambda \mathbf {R}_{\text {x}}[k-1] + (1-\lambda) \mathcal {H}_{(m)}[k] \mathcal {H}^{H}_{(m)}[k] \tag{15}\end{equation*}$ where $\lambda \in {[0, 1)}$ is a forgetting factor often chosen close to 1. This stochastic estimator accounts for a long channel history, albeit while de-emphasizing far away events. Such a subspace update tends to “dampen” the effect of abrupt channel changes on the signal subspace. Alternatively, these abrupt changes can also be preserved and admitted into the subspace using the sliding window (a.k.a. batch) approach given by $\begin{equation*} \mathbf {R}_{\text {x}}[k] = \frac {1}{L} \sum _{n=k-L+1}^{k} \mathcal {H}_{(m)}[n] \mathcal {H}^{H}_{(m)}[n] \tag{16}\end{equation*}$ where $L$ is the window size determined by the assumed stationarity period.
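The two update rules can be sketched in NumPy as below. The function names and the matrix shapes are illustrative assumptions. The mode-$m$ unfolding $\mathcal{H}_{(m)}[k]$ is taken as a given 2-D complex array, and the recursion is started from a zero matrix, which is what Equation (14) implies for $k = 0$.

```python
import numpy as np

def stochastic_update(R_prev, H_k, lam=0.99):
    """Equation (15): exponentially weighted recursive correlation update."""
    return lam * R_prev + (1.0 - lam) * (H_k @ H_k.conj().T)

def batch_update(H_window):
    """Equation (16): plain average over the L most recent unfoldings."""
    return sum(H @ H.conj().T for H in H_window) / len(H_window)

# Synthetic unfoldings standing in for measured CSI (shape is an assumption).
rng = np.random.default_rng(1)
Hs = [rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6)) for _ in range(5)]

lam = 0.9
R_stochastic = np.zeros((4, 4), dtype=complex)   # zero start, as implied by Equation (14)
for H_k in Hs:
    R_stochastic = stochastic_update(R_stochastic, H_k, lam)

R_batch = batch_update(Hs[-2:])                  # sliding window of length L = 2
```

With a forgetting factor close to 1 the stochastic estimate changes slowly from one step to the next, whereas the batch estimate forgets everything older than $L$ samples at once.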

We now compare and contrast these two subspace update variants. A publicly available activity recognition dataset is used [1]. The dataset comprises 6 single-user activities; namely, standing up, sitting down, lying down, falling, walking, and running. SIMO CSI data from three receivers is sampled at a rate of 1 ksps. For added tracking responsiveness and resolution, we choose a stationarity period of 25 ms and proceed to update the covariance matrix with 95% CSI overlap from the previous stationarity period. This results in a subspace update rate of around 800 Hz.
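The quoted update rate follows from simple arithmetic, reproduced below with hypothetical variable names. Note that the resulting hop is not an integer number of samples; how it is rounded in practice is not specified in the text.

```python
fs_hz = 1000.0        # CSI sampling rate (1 ksps)
window_s = 0.025      # stationarity period (25 ms)
overlap = 0.95        # fraction of CSI shared with the previous period

window_samples = fs_hz * window_s                # about 25 samples per period
hop_samples = window_samples * (1.0 - overlap)   # about 1.25 new samples per update
update_rate_hz = fs_hz / hop_samples             # approximately 800 updates per second
```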

In Figure 8, we perform time-frequency localization on the pairwise differential unitarity subspace tracking metric. The localization uses a window of 1.28 seconds with 95% content overlap between two windows for finer time-frequency resolution. In the interest of space, only four single-user activities are shown corresponding to falling, lying down, walking, and running. The spectrograms of the upper row of Figure 8 were generated using the batch subspace update variant; while those in the lower row utilized the stochastic variant with a forgetting factor $\lambda = 0.99$ . The color coding of the spectrograms in each row was group-harmonized in order to convey correct information about the differential intensity of the time-frequency bins across activities. We therefore safely omit the color maps from the spectrograms. As touched upon previously, the batch subspace update is more responsive to background disturbances in the channel and would admit these into signal subspace. We can readily observe more background variations across all activities in the upper series of spectrograms. Despite this, we can still see distinctly individual behavior across these activities—falling being the most concentrated in time-frequency and running being the most dispersed. However, it is interesting to see how the stochastic update was able to filter out much of the background channel disturbances while preserving the discriminative features of the four activities; namely, the increased time-frequency dispersion from falling, lying down, through to walking and running—again the latter being the most dispersed.
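A hedged sketch of such a time-frequency localization is shown below. It assumes that the tracked metric is a real-valued series (for instance the magnitude of the differential unitarity) sampled at the subspace update rate of roughly 800 Hz, in which case a 1.28 s window spans 1024 samples. The Hann taper, the rounding of the hop, and the function name are assumptions; the text specifies only the window length and the 95% overlap.

```python
import numpy as np

def spectrogram(x, fs_hz, window_s=1.28, overlap=0.95):
    """Short-time power spectrum of a real-valued series x sampled at fs_hz."""
    x = np.asarray(x, dtype=float)
    n_win = int(round(window_s * fs_hz))
    hop = max(1, int(round(n_win * (1.0 - overlap))))
    taper = np.hanning(n_win)                      # assumption: Hann taper
    starts = list(range(0, len(x) - n_win + 1, hop))
    frames = np.stack([x[s:s + n_win] * taper for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs_hz)
    times = (np.array(starts) + n_win / 2) / fs_hz  # frame centres in seconds
    return freqs, times, power

# Synthetic stand-in for a tracker output: a 100 Hz tone sampled at 800 Hz.
fs = 800.0
t = np.arange(4096) / fs
freqs, times, power = spectrogram(np.sin(2 * np.pi * 100.0 * t), fs)
```

Each row of `power` is one time slice of the spectrogram and each column corresponds to one entry of `freqs`.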

FIGURE 8.

Spectrograms for 4 activities (falling, lying down, walking, and running) using two subspace update variants: batch and stochastic. It is readily noticeable that stochastic has a filtering effect on the time-frequency localization of the pairwise differential unitarity subspace tracking metric. (a) Falling – batch. (b) Lying down – batch. (c) Walking – batch. (d) Running – batch. (e) Falling – stochastic. (f) Lying down – stochastic. (g) Walking – stochastic. (h) Running – stochastic.

We have opted to conduct time-frequency localization on the pairwise subspace tracker owing to its more intuitive association with speed i.e. 1st-order derivative of subspace evolution. A justification for the correspondence between the rate of change in CSI and speed can be found in [5]. Our 1st-order differentiation of the subspace can be viewed as a generalized fusion method for extracting information embedded in all subcarriers simultaneously. This fusion is a data-level fusion, rather than feature-level approaches involving ad hoc subcarrier selection strategies [4]. Some reported Wi-Fi sensing systems resort to selecting subcarriers of better SNR since frequency selectivity of wideband Wi-Fi channels causes some subcarriers to fall within the channel nulls—with obvious consequences for their reliability. Our subspace approach systematically fuses information contained in all subcarriers without the need to perform preconditioning. However, unlike the PCA-based approach [5], this fusion is principled, interpretable, and has its roots in formal wireless channel concepts [6]–[9].

As illustrated in Figure 7b, we use a robust sampling technique to obtain clean statistics from the differential unitarity measurements. In Figure 9, we illustrate the effect of this choice. Six single-user activities (standing up, sitting down, lying down, falling, walking, and running) are depicted for the slope tracker. Quick inspection of these plots corroborates the earlier findings of the spectrogram analysis; namely, that stochastic filters high-frequency channel perturbations compared to batch. That is, stochastic tracks the envelope of the activity rather than the high-frequency fluctuations of the activity and/or the channel background. We have alluded to this tunable channel detail in the signal subspace, be it channel background- or activity-related, by the hat accent in Equation (9). The abrupt activity of falling has an impulse-like acceleration content, while running is the richest in such 2nd-order rate of change moments.

FIGURE 9.

Waveforms for 6 activities (standing up, sitting down, lying down, falling, walking, and running) using two subspace update variants: batch and stochastic. It is readily noticeable that the stochastic variant has a filtering effect on the slope differential unitarity subspace tracking metric. (a) Lying down. (b) Sitting down. (c) Standing up. (d) Walking.

2) Duality

For completeness, we provide commentary on the pertinent issue of choosing a channel representation: time- versus frequency-domain. The structured model we introduced in Section II-C has been validated with empirical channel impulse response (CIR) measurements, i.e. in the time-domain. An identical eigenspace formulation has been applied in the frequency-domain for CSI instead [18], and also validated with empirical capacity measurements. Since our subspace trackers are differential in nature, tracking is insensitive to the representation of the channel, be it time- or frequency-domain. That said, a salient point in relation to the phase behavior of the trackers is worth making for completeness of treatment. The numerical perturbations experienced in the time-domain, as a function of human motion, differ from those experienced in the frequency-domain. Classic work on the stability of subspaces provides bounds on their trigonometric (i.e. angular) behavior as a function of technical mathematical issues ranging from eigenvalue spectral gap to numerical residuals [19].

To highlight this point, we revisit the waveform of the slope subspace tracker for the running activity depicted in Figure 9f. We perform channel decomposition through to differential unitarity calculations for both the CIR and the CSI versions of the measurements (i.e. time and frequency domains). The results are shown in Figure 10. As illustrated in Figures 10a & 10b, the differential tracker performs identically in the time and frequency domains. This is intuitive: after all, a linear operator (i.e. the [I]DFT) translates from one domain to the other. The occasional polarity switch in the phase of the differential tracker (Figures 10c & 10d) can be explained by the effects studied in [19]. However, it is interesting to note the increased phase instabilities when running the differential metric on CIR measurements compared with CSI measurements. This phenomenon can be readily seen in Figures 10c & 10d. We conjecture that the sparsity of the CIR measurements (i.e. their impulse-like nature), compared to the smoother CSI measurements, causes numerical instabilities which give rise to added phase instabilities in the subspace. The scatter plot of Figure 10d supports this hypothesis, as can be seen from the tighter clustering in the CSI case. However, further investigation is needed to fully illuminate this issue before solid conclusions can be drawn.
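The magnitude agreement between the two domains is what one would expect from the unitarity of the normalized DFT. The following is a minimal sketch under stated assumptions: it uses a plain Hermitian inner product between two channel snapshots as a stand-in for the differential metric (whose exact form is defined earlier in the paper), and the vectors `h_prev` and `h_curr` are made-up values rather than measurements.

```python
import cmath
import math

def unitary_dft(x):
    """Normalized DFT (scaled by 1/sqrt(N)), which makes the transform unitary."""
    n = len(x)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inner(u, v):
    """Hermitian inner product <u, v> = sum(conj(u_i) * v_i)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

# Hypothetical time-domain (CIR-like) snapshots at two consecutive instants.
h_prev = [1.0 + 0.0j, 0.5 - 0.2j, 0.0 + 0.1j, 0.0 + 0.0j]
h_curr = [0.9 + 0.1j, 0.4 - 0.3j, 0.1 + 0.1j, 0.0 + 0.05j]

# Correlation computed in the time domain and in the frequency domain.
time_corr = inner(h_prev, h_curr)
freq_corr = inner(unitary_dft(h_prev), unitary_dft(h_curr))

print(abs(time_corr), abs(freq_corr))
```

In exact arithmetic the two correlations coincide. The sketch therefore does not reproduce the phase instabilities discussed above, which the text attributes to numerical effects in the subspace decomposition rather than to the transform itself.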

FIGURE 10.

Waveforms for slope tracker corresponding to the running activity. The differential metric is identical in magnitude, but subspace phase stabilities exhibit interesting variations that differ depending on whether subspace decomposition is performed in the time-domain or frequency-domain. (a) Magnitude. (b) Error power. (c) Scatter.

SECTION V.

Evaluation

In what follows, we showcase how specialized occupancy and activity sensing can be built atop our featurization.

A. Occupancy Detection

1) Experimental Setup

We evaluate the performance of subspace tracking in terms of the robustness of occupancy detection. To evaluate the robustness, we investigate the accuracy of the classification model in new environments. More specifically, we trained the classification model using CSI data obtained from a certain placement and tested its accuracy on different placements.

a: Data

We collected the CSI data in 8 places and on 41 placements in total. As depicted in Figure 11, the places include six rooms, one lobby, and one lounge, and have different characteristics such as room layout and furniture position. We collected the CSI data while varying the number of moving people from zero to 2 (P4, P5, P6) and from zero to 3 (the rest). Each session lasted five minutes and participants were asked to move freely during the session. Figure 11 shows room layouts and device placements. The purpose of multiple placements is to investigate what is a realistic upper-bound on the classification performance of a single device under different training and testing conditions. MIMO CSI data were sampled at a nominal 500 Hz rate. A stationarity period of 50 ms was used and the subspace update was performed in a sliding window fashion with no overlap as in Equation (16).
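The windowing figures quoted above can be checked with a short calculation. This is only a sketch: `split_windows` is a hypothetical helper, integer indices stand in for CSI snapshots, and the sampling rate is treated as exactly 500 Hz even though the text describes it as nominal.

```python
def split_windows(samples, rate_hz, period_s):
    """Split a sample sequence into consecutive non-overlapping windows."""
    size = round(rate_hz * period_s)
    n_windows = len(samples) // size
    return [samples[i * size:(i + 1) * size] for i in range(n_windows)]

RATE_HZ = 500        # nominal CSI sampling rate (from the text)
PERIOD_S = 0.050     # stationarity period (from the text)
SESSION_S = 5 * 60   # session length in seconds (from the text)

# Placeholder sample indices standing in for CSI snapshots.
session = list(range(RATE_HZ * SESSION_S))
windows = split_windows(session, RATE_HZ, PERIOD_S)

samples_per_window = len(windows[0])
updates_per_second = RATE_HZ / samples_per_window

print(samples_per_window, updates_per_second, len(windows))
```

Under these assumptions each subspace update draws on 25 CSI samples, giving 20 updates per second and 6000 updates per five-minute session.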

FIGURE 11.

Device placements.

b: Pipeline

For the occupancy detection, we developed an inference pipeline using a long short-term memory (LSTM) classifier. We chose an LSTM classifier to leverage the spatio-temporal variation of our differential unitarity features from subspace tracking. In the current implementation, we adopted two hidden LSTM layers, each of which has 50 nodes. Some prior presence detection work dwells on the signal much longer with a distribution-based approach while using a diversity of frequency channels [20]. In contrast, we define a short 5-second inference window with no channel frequency diversity. In this paper, our objective is to showcase how to specialize various subspace tracking-based applications rather than demonstrate best-in-class performance.

c: Comparison

For comparison, we implemented the baseline pipeline from [21]. It takes temporal variations of CSI data as feature values and uses linear discriminant analysis as a classifier.

d: Training and Test

For training, we selected a receiver located diagonally across from the transmitter, thereby maximizing the RF coverage. Accordingly, we have 11 different models. For the evaluation, we considered three environment variations: same, minor, and major. Same refers to the case where data from the same receiver, i.e., the same placement, is used for both training and test. Minor and major use the CSI data from different receivers placed in the same room and in a different room, respectively. Same represents the upper bound of the performance that the inference logic can achieve in a specific environment. Minor and major show how robust the inference pipeline is in unseen environments.
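The three variations can be restated as a simple rule on the training and test placements. The sketch below is illustrative only: the `(room, receiver)` tuple encoding and the identifiers such as `"rx_a"` are hypothetical and do not correspond to the labels used in Figure 11.

```python
def variation(train, test):
    """Classify a train/test pair of (room, receiver) tuples as same, minor, or major."""
    train_room, train_rx = train
    test_room, test_rx = test
    if train_room != test_room:
        return "major"   # different room
    if train_rx != test_rx:
        return "minor"   # same room, different receiver placement
    return "same"        # identical placement

# Hypothetical placement identifiers, for illustration only.
train_placement = ("room_1", "rx_a")
labels = [
    variation(train_placement, ("room_1", "rx_a")),
    variation(train_placement, ("room_1", "rx_b")),
    variation(train_placement, ("room_2", "rx_a")),
]
print(labels)
```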

2) Experimental Results

We investigate how effectively the subspace tracking mitigates the environmental effect of CSI on the occupancy detection. Figure 12a shows the box plots of the accuracy of the 11 models for the different variations. Although the accuracy of both pipelines is similar in the same variation, the subspace tracking retains more competitive accuracy than the baseline as we introduce minor and major environmental changes. The accuracy in the same variation is 89% and 88% for the subspace tracking and the baseline, respectively. However, in the minor and major variations, the subspace tracking decreases to 82% and 78%, whereas the baseline drops to 73% and 62%.

FIGURE 12.

Occupancy performance. (a) Environment variations. (b) Number of classes.

We further investigate the effect of the number of classes on the occupancy detection under the major variation. Figure 12b shows the box plots of the accuracy while varying the number of classes. 2 classes represent presence detection, i.e., empty or occupied. 3 and 4 classes are for the number of people as [0, 1, 2+] and [0, 1, 2, 3], respectively.⁵ The results show that the subspace tracking achieves reasonable performance even with a higher number of classes. Our pipeline achieves 85%, 70% and 65% for 2, 3, and 4 classes, respectively, whereas the baseline achieves 62%, 49%, and 43%.
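The three labelling schemes differ only in where the head count is capped. As a minimal sketch (the function name `occupancy_label` is hypothetical and not taken from the paper), the mapping can be written as:

```python
def occupancy_label(num_people, num_classes):
    """Map a head count to a class index by capping it at num_classes - 1."""
    if num_classes not in (2, 3, 4):
        raise ValueError("only the 2-, 3-, and 4-class schemes are described")
    if num_people < 0:
        raise ValueError("head count cannot be negative")
    return min(num_people, num_classes - 1)

# Class index for 0, 1, 2, and 3 moving people under each scheme.
schemes = {c: [occupancy_label(n, c) for n in range(4)] for c in (2, 3, 4)}
print(schemes)
```

With 2 classes every non-empty session collapses to a single "occupied" label, with 3 classes sessions of two or three people share the 2+ label, and with 4 classes each head count keeps its own label.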

B. Physical Activity

We use the activity recognition dataset made publicly available by Yousefi et al. [1] to demonstrate the applicability of our subspace tracking technique to the problem domain of activity classification. The dataset comprises 6 single-user activities; namely, standing up, sitting down, lying down, falling, walking, and running. SIMO CSI data from three receive antennae is sampled at a rate of 1 ksps. We choose a stationarity period of 25 ms and proceed to update the covariance matrix with 95% CSI overlap from the previous stationarity period, with $\lambda = 0.99$ for the recursive subspace update as in Equation (15). This gives a subspace update rate of around 800 Hz. As illustrated previously in Figure 9, recursive subspace tracking filters background channel activity and/or subspace noise. This unwanted channel activity has been alluded to in Equations (8) & (9).
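The quoted update rate follows from the window length and the overlap. The short calculation below is a sketch of that arithmetic only. The last two quantities use the common rule of thumb that an exponential forgetting factor $\lambda$ has an effective memory of roughly $1/(1-\lambda)$ updates, which is an assumption added here for intuition and is not a statement about Equation (15).

```python
SAMPLE_RATE_HZ = 1000   # 1 ksps (from the text)
PERIOD_S = 0.025        # stationarity period (from the text)
OVERLAP = 0.95          # CSI overlap between consecutive periods (from the text)
FORGETTING = 0.99       # lambda for the recursive update (from the text)

window_samples = SAMPLE_RATE_HZ * PERIOD_S        # samples per stationarity period
hop_samples = window_samples * (1.0 - OVERLAP)    # new samples per update
update_rate_hz = SAMPLE_RATE_HZ / hop_samples     # subspace updates per second

# Rule-of-thumb memory of exponential forgetting (assumption, see lead-in).
effective_memory_updates = 1.0 / (1.0 - FORGETTING)
effective_memory_s = effective_memory_updates / update_rate_hz

print(round(window_samples), round(hop_samples, 2), round(update_rate_hz))
print(round(effective_memory_updates), round(effective_memory_s, 3))
```

The hop works out to a fractional 1.25 samples, which is consistent with the "around 800 Hz" wording; how the hop is rounded in practice is not stated in the text.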

In a preliminary evaluation, we build a simple classifier based around dynamic time warping (DTW) and K-nearest neighbors. This is applied to a single-dimensional Dy slope differential unitarity (see Figure 7b). We evaluate our classifier against the authors’ mid-range hidden Markov model (HMM), which uses a combination of PCA and short-time Fourier transform (STFT) time-frequency localization as pre-processing. The results are shown in Figure 13. Capability-wise, there is an asymmetry in that featurization based around 2D STFT + HMM is in principle far stronger than our 1D DTW + K-nearest. Nonetheless, on the whole, the performance of our simple classifier is not far from that reported by Yousefi et al., albeit with different characteristics. For instance, while 2D STFT + HMM outperforms our 1D DTW + K-nearest in nearly all activities, our fall activity performance is substantially better. We attribute this to the high acceleration content of falling, which our slope metric is able to capture easily, as shown in Figure 9a, due to native acceleration sensing. Perhaps our pairwise metric with 2D time-frequency localization would perform much better. Since our focus in this paper is only to showcase a generic formal featurization suited for many applications, we leave improved classification for future work.
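To make the classifier structure concrete, the following is a minimal sketch of a 1D DTW distance combined with a K-nearest-neighbor majority vote. It is not the implementation evaluated in Figure 13: the absolute-difference local cost, the unconstrained warping path, the value of `k`, and the toy waveforms are all assumptions chosen for illustration.

```python
from collections import Counter

def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def knn_predict(query, train, k=1):
    """Majority vote among the k training waveforms closest to query under DTW."""
    nearest = sorted(train, key=lambda item: dtw_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 1D waveforms (made up): impulse-like "fall" versus slowly varying "sit".
train = [
    ([0, 0, 5, 0, 0, 0], "fall"),
    ([0, 0, 0, 6, 0, 0], "fall"),
    ([0, 1, 2, 2, 1, 0], "sit"),
    ([0, 1, 1, 2, 2, 1, 0], "sit"),
]
query = [0, 5, 0, 0, 0]  # an impulse that occurs earlier than in the training set

prediction = knn_predict(query, train, k=3)
print(prediction)
```

DTW absorbs the timing offset between the query impulse and the training impulses, which is the property that makes it a natural match for activities of varying duration and onset.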

FIGURE 13.

Activity recognition performance. (a) Ours: $\hat{u}_{\textrm{Dy},1}$ + DTW + K-nearest neighbours. (b) Yousefi et al.: PCA + STFT + HMM.

SECTION VI.

Discussion

In this section, we provide commentary on the limitations of our work and discuss relevance to other wireless systems, thereby exposing items of future research.

A. Applicability to Other 802.11 Standards

Physical propagation behavior will differ depending on the frequency band. Such behavior will be mirrored when viewed through the lens of the signal and noise subspaces. Our proposed featurization provides sensing primitives to track the variations in propagation dynamics that are induced by human motion. However, it is the role of the machine learning (ML) component to capture such behavior in a robust sensing model. Thus, when operating within different frequency bands, it is important to ensure that the back-end ML component is trained for the respective human-modulated propagation behavior corresponding to that specific band. Our experimental results in this paper are for the 5 GHz Wi-Fi band with 40 MHz bandwidth. Nonetheless, other wireless standards, such as 802.11ah operating in the sub-1 GHz band and 802.11ad/ay operating in the 60 GHz band, could benefit from identical featurization, albeit after specializing the back-end ML component to capture their individual propagation characteristics as a function of human motion. Moreover, we have shown in Section IV that the magnitude of our differential subspace tracking behaves identically irrespective of the representation of the channel response, be it in time or frequency. This means that both the single-carrier and OFDM variants of WiGig would benefit from our subspace-based featurization. It is also worth pointing out that, in relation to WiGig, 60 GHz frequencies are quasi-optical and are less able to diffract around objects. The subspace will mirror this behavior; however, increased coverage of the environment may be possible by considering the beam training procedure that 802.11ad/ay implements. Specifically, recent work has shown that such a beam training procedure from infrastructure access points can be used to localize a mobile user [22]. It would be interesting in this particular example to see if tracking the subspace would allow for inferring finer-grained details on the nature of the mobile node’s movement. OFDMA systems such as 802.11ax can also benefit from the proposed subspace tracking; however, care should be taken to handle instances of transition in user-assigned subcarriers and their implications on the subspace.

B. ML Model Coverage and Vectors of Variation

There are many variables that impact the robustness of the back-end ML model. We call these the vectors of variation of the ML model. Exhaustive training across these vectors of variation is needed for sufficient coverage of the sampling space in order to ensure the ML model generalizes in the real world. One such vector of variation is that arising from the individualized way in which different users perform activities. Broadly, there are two methods in prior art for dealing with such variations: design-based and learning-based. In design-based methods, features hand-crafted by an expert designer, such as careful frequency binning in [23] and coarser wavelet spectral bins in [5], are engineered to absorb the expected variations in the real world. In contrast, learning-based approaches rely on automatic coverage of these natural variations by the inference component through the sheer amount of empirical data used for training. In this paper, we focused on a formal and interpretable low-dimensional featurization of the wireless channel, with our evaluation (cf. Figure 12) falling under the latter learning-based approach.

C. Axes of Resolution

The performance of sensing applications built atop channel tracking is fundamentally limited by the spatio-temporal resolutions of channel measurements. Specifically, the utilized bandwidth and number of antennae have a large bearing on what can be perceived unambiguously in the environment i.e. without over-fitting inference. To see this, consider the environmental imaging capability of the covariance $\mathbf {R}_{\text {x}}$ through its beamspace representation $\mathbf {F} \mathbf {R}_{\text {x}} \mathbf {F}^{H}$ , where $\mathbf {F}$ is the Fourier transform matrix [6], [7], [24]. Clearly, for meaningful imaging, the number of antennae needs to be high in order to resolve environmental spatial scatterers. Similarly, bandwidth delivers the temporal resolution necessary for measuring the channel’s delayspread (or frequency selectivity) more accurately. It is customary to see in related literature prolonged signal dwell times in order to compensate for the lack of spatio-temporal resolution as supplied by current research testbeds e.g. of the order of minutes dwell time to estimate occupancy in [3] and [20]. To put it in wireless terms, clearly the “coherence” time of crowd movement indoors is much shorter than 5 minutes. We, therefore, would argue that practical indoor channel sensing systems are likely to appear once we begin to see the roll-out of wireless infrastructure of enhanced spatio-temporal resolutions such as indoor massive MIMO in the millimeter-wave band.
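The beamspace representation $\mathbf{F} \mathbf{R}_{\text{x}} \mathbf{F}^{H}$ can be illustrated with a toy example. The sketch below rests on several assumptions that are not taken from the paper: a uniform linear array of `N = 4` antennae, a single noise-free plane wave whose spatial frequency falls exactly on DFT bin `BIN`, and a unitary scaling of $\mathbf{F}$.

```python
import cmath
import math

N = 4  # number of receive antennae (illustrative value)

# Unitary DFT matrix with entries F[k][n] = exp(-j*2*pi*k*n/N) / sqrt(N).
F = [[cmath.exp(-2j * math.pi * k * n / N) / math.sqrt(N) for n in range(N)]
     for k in range(N)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def hermitian(A):
    """Conjugate transpose of a nested-list matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

# Assumed toy scenario: one plane wave aligned with DFT bin 1.
BIN = 1
a = [cmath.exp(2j * math.pi * BIN * n / N) for n in range(N)]

# Rank-one receive covariance R_x = a a^H for the single path.
R_x = [[a[i] * a[j].conjugate() for j in range(N)] for i in range(N)]

# Beamspace covariance F R_x F^H; its diagonal holds the power per beam.
beamspace = matmul(matmul(F, R_x), hermitian(F))
beam_power = [beamspace[k][k].real for k in range(N)]

strongest_beam = max(range(N), key=lambda k: beam_power[k])
print(strongest_beam)
```

With N antennae the beamspace offers only N beams, so two scatterers that fall within the same beam cannot be separated. This is one way to read the argument above that a higher antenna count is needed for meaningful imaging.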

SECTION VII.

Conclusion

In this paper, we formalize the problem of Wi-Fi-based human sensing and cast it as a channel signal subspace tracking task. We demonstrate the equivalence of the two problems. We posit the optimality of such a formulation, citing prior established work from the wireless literature. We conclude by providing evidence for the applicability of our subspace tracking across two usage scenarios, presence detection and activity recognition, with promising early results. Future work will focus on machine learning classification using our subspace-based featurization.

ACKNOWLEDGMENT

The authors would like to thank Howard Huang for the helpful comments on this manuscript.
