
A Variable Regularized Recursive Subspace Model Identification Algorithm With Extended Instrumental Variable and Variable Forgetting Factor


Abstract:

This paper proposes a new smoothly clipped absolute deviation (SCAD) regularized recursive subspace model identification algorithm with square root(SR) extended instrumental variable (EIV) and locally optimal variable forgetting factor (LOFF). The proposed algorithm is based on a new local polynomial modeling (LPM)-based variable FF EIV projection approximation subspace tracking (PAST) and multivariate recursive least squares algorithms for estimating respectively the subspace of the extended observability matrix, and the input matrix and feedthrough matrix. The SREIV-PAST algorithm offers improved resilience to additive noise over the PAST algorithm and it is implemented using the more stable square-root form. The asymptotic mean square error of the corresponding LPM-based model is derived and minimized at each time instant to obtain the proposed LOFF for improving the convergence speed and steady state error. A recursive bi-iteration singular value decomposition (SVD) algorithm is also proposed for recursive computation of the pseudo-inverse of the state transition matrix and its eigenvalues. This facilitates online estimation of its model order using classical model selection criteria. Moreover, a new criterion based on the percentage of explained variance is also proposed with improved performance. The SCAD regularization is proposed for automatic model selection of the input and feedthrough matrices, as it is asymptotically unbiased. Efficient techniques for incorporating SCAD and estimating other required quantities are also developed. The proposed algorithms are evaluated using computer simulations under stationary and nonstationary environments and a real dataset on wing flutter data. Results show that the proposed algorithms offer improved convergence speed and steady state performance over the conventional algorithms and it also provides an online estimate of the system model order.
Published in: IEEE Access ( Volume: 8)
Page(s): 43520 - 43536
Date of Publication: 10 March 2020
Electronic ISSN: 2169-3536

CC BY: IEEE is not the copyright holder of this material. See https://creativecommons.org/licenses/by/4.0/.
SECTION I.

Introduction

Subspace-based approaches play an important role in system identification [1]–[4] due to their theoretical and practical convenience for building state-space models directly from the input-output data of systems. Offline subspace identification techniques are characterized by the use of robust numerical tools such as the RQ factorization and the singular value decomposition (SVD) of the data matrix [5]. However, batch subspace model identification (SMI) algorithms are computationally intensive for online implementation due to the high computational complexity of the SVD. Consequently, much research on recursive subspace model identification (RSI) [6]–[9] has been devoted to updating the model parameters over time with a reduced computational cost. Most RSI algorithms estimate the unknown state matrix \boldsymbol {A} and the output matrix \boldsymbol {C} from the “data equation” consisting of the input and output samples of the systems. This requires the estimation of the subspace of the extended observability matrix, which in turn requires the SVD [10] or the use of more efficient subspace tracking algorithms [11], [12]. The computation of the state matrix \boldsymbol {A} also requires the evaluation of a pseudo-inverse, and few works have discussed the online estimation of the model order, since it involves the computation of its eigenvalues. On the other hand, the input matrix \boldsymbol {B} and feedthrough matrix \boldsymbol {D} can be estimated using the RLS algorithm with multiple outputs. L1 regularization has been proposed for automatic selection of their orders by shrinkage [13].

In the context of recursive subspace tracking, the projection approximation subspace tracking (PAST) algorithm [14] has been shown to yield good performance and efficient implementation for tracking the subspace spanned by the observability matrix. The PAST algorithm is based on the recursive least squares (RLS) algorithm and the projection approximation. It assumes that the subspace is slowly time-varying, and a fixed forgetting factor (FF) is usually used. Therefore, its tracking performance under time-varying systems or sudden system changes may be degraded. Furthermore, a drawback of classical recursive SMI algorithms [15]–[17] is that the disturbances acting on the system output are required to be spatially and temporally white. Biased estimates will result if this assumption is violated. The instrumental variable (IV) method [18]–[21] is an effective approach to tackle these types of noise problems by exploiting the cross-correlation between the instrumental variables and the input variables rather than the covariance of the inputs as in the conventional least squares method. The extension of IV to the PAST algorithm, namely the extended IV PAST (EIV-PAST) method, has been proposed in [22] to address the problem of biased estimates in RSI algorithms when the output disturbances are not spatially and temporally white.

It is known that the EIV-PAST algorithm is able to achieve faster convergence and a smaller mean square error (MSE) in slowly time-varying environments if a large FF is utilized. On the other hand, a relatively small FF should be used to facilitate the tracking of fast time-varying parameters. Thus, in the context of RLS algorithms, a number of variable FF (VFF) algorithms have been proposed [23]–[26] to achieve satisfactory performance in stationary, slowly or fast time-varying environments. In particular, a local polynomial modeling (LPM)-based VFF-RLS algorithm was recently developed in [26] by minimizing the MSE, and its performance compares favorably with other conventional VFF-RLS algorithms tested. Note that these algorithms are applicable to the case of a single output and rely on the Wiener solution involving the covariance of the inputs. On the other hand, the EIV-PAST involves multiple outputs and a system equation involving the cross-correlation between the input signal and the instrumental variables. Therefore, it is highly desirable to develop new VFF algorithms for better adaptation of the EIV-PAST algorithm in time-varying and noisy environments. Furthermore, it is important to be able to incorporate variable regularization in the EIV-PAST algorithm to improve its numerical behavior at low input signal levels.

In this paper, we propose a new RSI algorithm where the subspace of the extended observability matrix (and hence the state matrix \boldsymbol {A} ) is estimated by a new LPM-based VFF square-root (SR) EIV-PAST algorithm, while the input matrix \boldsymbol {B} and feedthrough matrix \boldsymbol {D} are estimated by a VFF multivariate RLS algorithm. A recursive bi-iteration SVD algorithm is also proposed for the recursive computation of the pseudo-inverse of \boldsymbol {A} and its eigenvalues. This allows us to estimate online the model order of \boldsymbol {A} using the classical Akaike information criterion (AIC) and minimum description length (MDL) criteria. Moreover, a new model order selection criterion based on the percentage of explained variance is proposed. On the other hand, the smoothly clipped absolute deviation (SCAD) regularization (SC) is proposed for automatic model selection of the input matrix \boldsymbol {B} and feedthrough matrix \boldsymbol {D} , as it is asymptotically unbiased, unlike the L1 regularization [27].

The proposed VFF-SREIV-PAST algorithm utilizes the LPM channel model [26] of time-varying channels so as to derive the MSE of the EIV linear model as a function of the forgetting factor. Moreover, the resultant MSE contains a bias term which increases with the FF and a variance term which decreases with the FF. Therefore, the asymptotic MSE can be minimized at each time instant to obtain the proposed locally optimized FF for improving the convergence speed and steady state error. Techniques for estimating the various quantities required will also be introduced. The resultant SREIV-PAST algorithm offers improved resilience to additive noise over the PAST algorithm and it is implemented using the more stable square-root form.

The proposed VFF multivariate RLS algorithm is an extension of the VFF-RLS algorithm for a single output [26] to multiple outputs with SCAD regularization. The VFF speeds up the convergence of the multivariate RLS algorithm, whilst the SCAD regularization helps to achieve automatic model selection and to reduce the estimation variance arising from a possibly ill-conditioned covariance matrix, especially at low input signal levels. This is very useful during signal fading, where the covariance matrix may be significantly ill-conditioned. Efficient techniques for incorporating SCAD in the numerically more stable QR decomposition (QRD) and estimating other required quantities are also developed.

Different from the LPM-based VFF-RLS algorithm in [26], which deals with a single output and the autocovariance matrix in the Wiener filter, the proposed algorithms are applicable to complex inputs, single and multiple outputs, and EIV systems involving a cross-correlation matrix. The proposed algorithms are evaluated using computer simulations under stationary and nonstationary environments and a real dataset on wing flutter data. Experimental results show that the proposed algorithm provides faster convergence and more accurate estimates than the conventional algorithms tested. Moreover, it also provides the model order for the state transition matrix and automatic model order selection for the input and feedthrough matrices. The convergence of the VFF-SREIV-PAST and RLS algorithms is also analyzed.

In summary, the novelties of this paper are i) the development of a new RSI algorithm with automatic order estimation, ii) the algorithm is based on efficient LPM-based VFF EIV-PAST and multivariate RLS algorithms for improving the tracking speed and steady state error, and they can be realized in efficient square-root form based on hyperbolic rotations and conventional QR decomposition, respectively, with improved numerical stability, iii) a recursive bi-iteration singular value decomposition (SVD) algorithm is proposed for extracting the singular values of the state transition matrix for automatic order selection, iv) a new criterion based on the percentage of explained variance is proposed to facilitate online estimation of the model order of A, and v) the SCAD regularization is incorporated for automatic model selection of the input and feedthrough matrices B and D, as it is asymptotically unbiased. The regularized EIV algorithm also helps to stabilize the subspace, especially during signal fading.

The rest of the paper is organized as follows. Section II briefly reviews the system model, the problem formulation and related works. In Section III, the proposed algorithm is introduced. Numerical results and comparisons with conventional algorithms are presented in Section IV, which suggest that the proposed algorithm performs better than conventional and state-of-the-art algorithms. Finally, Section V concludes the paper.

SECTION II.

Problem Formulation

A. System Model

Consider the following N -th order discrete-time linear time-invariant state-space model with M inputs and p outputs in innovation form \begin{align*} {\boldsymbol {x}}[n+1]=&{\boldsymbol {A}}{\boldsymbol {x}}[n]+{\boldsymbol {B}}{\boldsymbol {u}}[n],\tag{1a}\\ {\boldsymbol {y}}[n]=&{\boldsymbol {C}}{\boldsymbol {x}}[n]+{\boldsymbol {D}}{\boldsymbol {u}}[n]+{\boldsymbol {v}}[n]\tag{1b}\end{align*} where {\boldsymbol {x}}[n]\in {\boldsymbol {R}}^{N} , {\boldsymbol {u}}[n]\in {\boldsymbol {R}}^{M} and {\boldsymbol {y}}[n]\in {\boldsymbol {R}}^{p} denote respectively the finite-dimensional state vector, input vector and output vector at time instant n . The unknown state matrix \boldsymbol {A} , input matrix \boldsymbol {B} , output matrix \boldsymbol {C} and feedthrough matrix \boldsymbol {D} have appropriate dimensions. The noise term {\boldsymbol {v}}[n] is assumed to be a zero-mean white process uncorrelated with the input {\boldsymbol {u}}[n] . The system model is summarized in Fig. 1.

FIGURE 1. Recursive subspace model identification.

For notational convenience, let us define the following stacked input, output and noise vectors \begin{align*} {\boldsymbol {u}}_\alpha [n]=&\left [{{\boldsymbol {u}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {u}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2a}\\ {\boldsymbol {y}}_\alpha [n]=&\left [{{\boldsymbol {y}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {y}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2b}\\ {\boldsymbol {v}}_\alpha [n]=&\left [{{\boldsymbol {v}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {v}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2c}\end{align*} where \alpha is a user-defined integer larger than the model order, so that {\boldsymbol {u}}_\alpha [n] is of dimension M\alpha \times 1 while {\boldsymbol {y}}_\alpha [n] and {\boldsymbol {v}}_\alpha [n] are of dimension p\alpha \times 1 . Consequently, the input and output samples can be written in the following vector form, called the “data equation” [28] \begin{equation*} {\boldsymbol {y}}_\alpha [n]=\Gamma _\alpha {\boldsymbol {x}}[n]+\Phi _\alpha {\boldsymbol {u}}_\alpha [n]+{\boldsymbol {v}}_\alpha [n],\tag{3}\end{equation*} where \Gamma _\alpha is the extended observability matrix defined as \begin{equation*} \Gamma _\alpha =\left [{\Gamma _{0}^{\mathsf {T}}, \Gamma _{1}^{\mathsf {T}}, \ldots, \Gamma _{\alpha -1}^{\mathsf {T}}}\right]^{\mathsf {T}},\tag{4}\end{equation*} \Gamma _{k}={\boldsymbol {C}}{\boldsymbol {A}}^{k} , and \Phi _\alpha is a block lower triangular Toeplitz matrix consisting of the impulse responses from the input to the output, given by \begin{equation*} \Phi _\alpha = \left [{ \begin{matrix} {\boldsymbol {D}}& {\boldsymbol {0}}& \cdots & {\boldsymbol {0}}\\ {\boldsymbol {C}}{\boldsymbol {B}}& {\boldsymbol {D}}& \cdots & {\boldsymbol {0}}\\ \vdots & \ddots & \ddots & \vdots \\ {\boldsymbol {C}}{\boldsymbol {A}}^{\alpha -2}{\boldsymbol {B}}& \cdots & {\boldsymbol {C}}{\boldsymbol {B}}& {\boldsymbol {D}}\end{matrix} }\right].\tag{5}\end{equation*} It should be noted that \Gamma _\alpha spans an N -dimensional signal subspace, which contains the part of the output that is due to the state vector.
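To make the structure of (4) and (5) concrete, the following numpy sketch builds \Gamma _\alpha and \Phi _\alpha from given (A, B, C, D); the function names and the dense loop-based construction are illustrative only and are not part of the algorithm proposed in this paper.

```python
import numpy as np

def extended_observability(A, C, alpha):
    """Stack Gamma_k = C A^k, k = 0, ..., alpha-1, as in (4)."""
    blocks, Gk = [], C.copy()
    for _ in range(alpha):
        blocks.append(Gk)
        Gk = Gk @ A                              # next block row C A^(k+1)
    return np.vstack(blocks)                     # (p*alpha) x N

def toeplitz_phi(A, B, C, D, alpha):
    """Block lower-triangular Toeplitz matrix of impulse responses, as in (5)."""
    p, M = D.shape
    # Markov parameters: h[0] = D, h[k] = C A^(k-1) B for k >= 1
    h = [D] + [C @ np.linalg.matrix_power(A, k - 1) @ B for k in range(1, alpha)]
    Phi = np.zeros((p * alpha, M * alpha))
    for i in range(alpha):
        for j in range(i + 1):
            Phi[i * p:(i + 1) * p, j * M:(j + 1) * M] = h[i - j]
    return Phi
```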

B. Recursive Estimation of A and C

One can notice from (4) that the matrix \boldsymbol {C} can be obtained from the first p\times N submatrix \Gamma _{0} of \Gamma _\alpha . On the other hand, two consecutive submatrices \Gamma _{k+1} and \Gamma _{k} of \Gamma _\alpha differ only by a factor \boldsymbol {A} on the right-hand side: \Gamma _{k+1}=\Gamma _{k}{\boldsymbol {A}} . Therefore, given \Gamma _\alpha , one can solve for \boldsymbol {A} as follows \begin{equation*} \hat {{\boldsymbol {A}}}[n]=\Gamma _{a}^\dagger \Gamma _{b}\tag{6}\end{equation*} where \Gamma _{a}=[\Gamma _{0}^{\mathsf {T}}, \Gamma _{1}^{\mathsf {T}}, \ldots, \Gamma _{\alpha -2}^{\mathsf {T}}]^{\mathsf {T}} , \Gamma _{b}=[\Gamma _{1}^{\mathsf {T}}, \Gamma _{2}^{\mathsf {T}}, \ldots, \Gamma _{\alpha -1}^{\mathsf {T}}]^{\mathsf {T}} and the superscript (\cdot)^\dagger denotes the Moore-Penrose pseudo-inverse.
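As a quick illustration of the shift-invariance relations (6) and (17), the sketch below extracts A and C from a matrix W whose columns span the estimated extended observability subspace; it is a plain batch computation, not the recursive bi-iteration update described later.

```python
import numpy as np

def extract_A_C(W, p):
    """Estimate A and C from the subspace matrix W (p*alpha rows) via (17a)-(17b)."""
    Gamma_a = W[:-p, :]                     # rows 1 .. p*(alpha-1), cf. Gamma_a in (6)
    Gamma_b = W[p:, :]                      # rows p+1 .. p*alpha,   cf. Gamma_b in (6)
    A_hat = np.linalg.pinv(Gamma_a) @ Gamma_b
    C_hat = W[:p, :]                        # first block row gives C
    return A_hat, C_hat
```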

Since the state-space matrices are uniquely identifiable only up to an orthogonal transformation, it is sufficient to obtain a subspace that spans the column space of \Gamma _\alpha . This problem is very similar to the following formulation of the direction of arrival (DOA) problem in array signal processing [29] \begin{equation*} {\boldsymbol {z}}[n]=\Gamma [\theta]{\boldsymbol {x}}[n]+{\boldsymbol {e}}[n],\tag{7}\end{equation*} where {\boldsymbol {z}}[n]\in {\boldsymbol {C}}^{m} , {\boldsymbol {x}}[n]\in {\boldsymbol {C}}^{n} and {\boldsymbol {e}}[n]\in {\boldsymbol {C}}^{m} represent respectively the sensor signal vector from the m sensors, the n source signals impinging on the array, and the additive noise. \Gamma [\theta] is the m\times n steering vector matrix with the j -th column containing the steering vector (array response) of the j -th source. By writing {\boldsymbol {z}}[n]={\boldsymbol {y}}_\alpha [n]-\Phi _\alpha {\boldsymbol {u}}_\alpha [n] , one can see that the two models are identical, with \Gamma [\theta] equal to \Gamma _\alpha and {\boldsymbol {e}}[n] equal to {\boldsymbol {v}}_\alpha [n] . This yields \begin{equation*} {\boldsymbol {z}}[n]=\Gamma _\alpha {\boldsymbol {x}}[n]+{\boldsymbol {v}}_\alpha [n],\tag{8}\end{equation*} where {\boldsymbol {z}}[n]={\boldsymbol {y}}_\alpha [n]-\Phi _\alpha {\boldsymbol {u}}_\alpha [n] . Since \Phi _\alpha is not available at snapshot n , it can be approximated by \Phi _\alpha [n-1] constructed from the previously estimated {\boldsymbol {A}}[n-1] , {\boldsymbol {B}}[n-1] , {\boldsymbol {C}}[n-1] and {\boldsymbol {D}}[n-1] . Another approach, which is called the past inputs (PI) scheme, is to make use of classical Givens rotations to annihilate the input {\boldsymbol {u}}_\alpha [n] from the output {\boldsymbol {y}}_\alpha [n] via the RQ (or QR) factorization \begin{align*}&\left [{ \begin{matrix} {\boldsymbol {R}}_{11}[n+1] & {\boldsymbol {0}}& {\boldsymbol {0}}\\ {\boldsymbol {R}}_{21}[n+1] & {\boldsymbol {R}}_{22}[n+1] & {\boldsymbol {z}}[n] \\ \end{matrix} }\right] ={\boldsymbol {G}}[n] \left [{ \begin{matrix} {\boldsymbol {R}}_{11}[n] & {\boldsymbol {0}}& {\boldsymbol {u}}_\alpha [n] \\ {\boldsymbol {R}}_{21}[n] & {\boldsymbol {R}}_{22}[n] & {\boldsymbol {y}}_\alpha [n] \\ \end{matrix} }\right],\tag{9}\end{align*} where {\boldsymbol {G}}[n] is an appropriate orthonormal rotor, {\boldsymbol {R}}_{11}\in {\boldsymbol {C}}^{(M\alpha)\times (M\alpha)} and {\boldsymbol {R}}_{22}\in {\boldsymbol {C}}^{(p\alpha)\times (p\alpha)} are upper triangular matrices, and {\boldsymbol {R}}_{21}\in {\boldsymbol {C}}^{(p\alpha)\times (M\alpha)} .

In array signal processing and related applications, very efficient subspace tracking algorithms [14], [22], [30]–[32] have been developed to recursively update the column subspace of \Gamma [\theta] for each input signal vector. Therefore, [33], [34] have proposed to estimate the column space of \Gamma _\alpha using the PAST subspace tracking algorithm and its extensions. To this end, one usually considers the following model for {\boldsymbol {z}}[n] \begin{equation*} {\boldsymbol {z}}[n]={\boldsymbol {W}}[n]{\boldsymbol s}[n]+{{\eta}}[n],\tag{10}\end{equation*} where {\boldsymbol {W}}[n]\in {\boldsymbol {C}}^{(p\alpha)\times K} is the subspace matrix with rank K , {\boldsymbol s}[n]={\boldsymbol {W}}^{\mathsf {H}}[n]{\boldsymbol {z}}[n] is the projection of {\boldsymbol {z}}[n] onto the subspace and {{\eta}}[n] is the additive noise. Most subspace tracking algorithms such as PAST aim to find a matrix {\boldsymbol {W}}[n] which spans the signal subspace by minimizing the following LS error of the reconstruction {\boldsymbol {z}}[n]-{\boldsymbol {W}}[n]{\boldsymbol s}[n]={{\eta}}[n] over time:\begin{equation*} \mathbb {E}_{LS}({\boldsymbol {W}}[n])=\sum \nolimits _{m=1}^{n} \lambda ^{n-m} \Vert {\boldsymbol {z}}[m]-{\boldsymbol {W}}[n]{\boldsymbol {g}}[m]\Vert _{2}^{2}\tag{11}\end{equation*} where \lambda \in (0,1] is a positive forgetting factor. Moreover, the projection approximation, {\boldsymbol {g}}[m]={\boldsymbol {W}}^{\mathsf {H}}[n-1]{\boldsymbol {z}}[m] , is frequently used to simplify the minimization of (11), since (11) can then be reduced to a conventional recursive least squares (RLS) problem in {\boldsymbol {W}}[n] . Consequently, {\boldsymbol {W}}[n] can be used to estimate the column space of \Gamma _\alpha .
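For reference, a minimal sketch of one step of the standard PAST recursion that results from applying RLS to (11) under the projection approximation is given below; it is the spatially white LS version, not the EIV or square-root variant developed in this paper, and the variable names are illustrative.

```python
import numpy as np

def past_step(W, P, z, lam=0.98):
    """One PAST update: W tracks the signal subspace, P is the inverse
       correlation matrix of the projected data g = W^H z."""
    g = W.conj().T @ z                     # projection approximation g[n] = W^H[n-1] z[n]
    h = P @ g
    k = h / (lam + g.conj().T @ h)         # RLS gain vector
    P = (P - np.outer(k, h.conj())) / lam  # update inverse correlation matrix
    e = z - W @ g                          # reconstruction error
    W = W + np.outer(e, k.conj())          # subspace update
    return W, P
```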

The use of the LS criterion in (11) implicitly assumes that the additive noise {{\eta}}[n] is spatially white. However, in the context of system identification, the additive noise {\boldsymbol {v}}_\alpha [n] may not be spatially white. Consequently, the estimate tends to be biased if the noise is not spatially white [33]. In [33], a novel subspace tracking algorithm based on the concept of instrumental variables (IV) is proposed. Instead of minimizing the LS error of {{\eta}}[n] , an appropriate IV vector which is uncorrelated with {\boldsymbol {v}}_\alpha [n] and highly correlated with {\boldsymbol {x}}[n] is used to eliminate the effects of the additive noise. The selection of the instrumental variable is problem dependent. Usually, it is chosen as a neighboring signal vector in time to ensure that the noise samples are uncorrelated.

Assume that there exists an IV vector {{\xi}}[n]\in {\boldsymbol {C}}^{l} with l\geq N such that

  (C1) \mathbb {E}[{\boldsymbol {v}}_\alpha [n]{{\xi}}^{\mathsf {H}}[n]]={\boldsymbol {0}} ,

  (C2) \text {Rank}({\boldsymbol {C}}_{x\xi })=\text {Rank}\{\mathbb {E}[{\boldsymbol {x}}[n]{{\xi}}^{\mathsf {H}}[n]]\}=N ,

one can then multiply both sides of (8) by {{\xi}}^{\mathsf {H}}[n] . After taking expectations and using the above two assumptions, one gets \begin{align*} \mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]]=&\Gamma _\alpha \mathbb {E}[{\boldsymbol {x}}[n]{{\xi}}^{\mathsf {H}}[n]]+\mathbb {E}[{\boldsymbol {v}}_\alpha [n]{{\xi}}^{\mathsf {H}}[n]] \\=&\Gamma _\alpha \mathbb {E}[{\boldsymbol {x}}[n]{{\xi}}^{\mathsf {H}}[n]].\tag{12}\end{align*}

It can be seen that assumption (C1) ensures that the IV vector {{\xi}}[n] is uncorrelated with the noise vector {\boldsymbol {v}}_\alpha [n] so that the noise term on the right-hand side is eliminated. This mitigates the effect of the noise (which may also be correlated with {\boldsymbol {x}}[n] ) and gives rise to a consistent, unbiased estimate even for spatially colored disturbances. On the other hand, (C2) is required so that the rank of \Gamma _\alpha is preserved in \mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]] . Therefore, the space spanned by the columns of \Gamma _\alpha {\boldsymbol {C}}_{x\xi } is equal to that spanned by the columns of \Gamma _\alpha , i.e. \text {span}(\Gamma _\alpha {\boldsymbol {C}}_{x\xi })=\text {span}(\Gamma _\alpha) . If {\boldsymbol {x}}[n] and {{\xi}}[n] have the same dimension, then \mathbb {E}[{\boldsymbol {x}}[n]{{\xi}}^{\mathsf {H}}[n]] is nonsingular and we have \Gamma _\alpha =\mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]]\mathbb {E}[{\boldsymbol {x}}[n] {{\xi}}^{\mathsf {H}}[n]]^{-1} . Since {\boldsymbol {x}}[n] is not accessible, we resort to its alternative form in (10), where {\boldsymbol s}[n] is now a rotated form of {\boldsymbol {x}}[n] . In general, there are many ways to construct instrumental variables in the IV method. For instance, additional prefiltering may be performed to derive the IV for system identification and other applications [35]. The optimal instruments and prefilters have been studied in [36]. It turns out that the optimal instruments and prefiltering depend on the true system dynamics and the statistical properties of the noise. A multistep algorithm has been proposed in [36]. In this paper, we mainly focus on the temporal IV implementation, in which the IV is chosen as time-delayed measurements. Moreover, to avoid further iterations and filtering over the temporal snapshots, we do not consider prefiltering the snapshots, in exchange for a simple recursive implementation. Using the same approach above with the projection approximation {\boldsymbol {g}}[m]={\boldsymbol {W}}^{\mathsf {H}}[n-1]{\boldsymbol {z}}[m] , one gets \begin{align*} {\boldsymbol {C}}_{z\xi }=&\mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]] \\=&{\boldsymbol {W}}\cdot \mathbb {E}[{\boldsymbol {g}}[n]{{\xi}}^{\mathsf {H}}[n]]+\mathbb {E}[{{\eta}}[n]{{\xi}}^{\mathsf {H}}[n]] \\=&{\boldsymbol {W}}\cdot {\boldsymbol {C}}_{g\xi },\tag{13}\end{align*} and an estimate of \Gamma _\alpha is given by {\boldsymbol {W}}={\boldsymbol {C}}_{z\xi }{\boldsymbol {C}}_{g\xi }^{-1} , where {\boldsymbol {C}}_{z\xi }=\mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]] and {\boldsymbol {C}}_{g\xi }=\mathbb {E}[{\boldsymbol {g}}[n]{{\xi}}^{\mathsf {H}}[n]] . This is called the simple IV method. In the extended IV method, the size of {{\xi}}[n] is larger than that of {\boldsymbol {x}}[n] , which leads to an over-determined system. One can then resort to the LS solution, which can be obtained by multiplying both sides of (13) by {\boldsymbol {C}}_{g\xi }^{\mathsf {H}} to obtain the following normal equation for solving \boldsymbol {W} :\begin{equation*} ({\boldsymbol {C}}_{z\xi }{\boldsymbol {C}}_{g\xi }^{\mathsf {H}})={\boldsymbol {W}}\cdot ({\boldsymbol {C}}_{g\xi }{\boldsymbol {C}}_{g\xi }^{\mathsf {H}}).\tag{14}\end{equation*}
Note that the system matrix ({\boldsymbol {C}}_{g\xi }{\boldsymbol {C}}_{g\xi }^{\mathsf {H}}) is Hermitian and also nonsingular if (C2) is met under persistent excitation.

In practice, {\boldsymbol {C}}_{z\xi } and {\boldsymbol {C}}_{\xi g} need to be recursively estimated and tracked from the data using an exponential weighting scheme as follows \begin{align*} {\boldsymbol {C}}_{Z\Xi }[n]=&{\boldsymbol {Z}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n] \\=&\sum \nolimits _{i=1}^{n}\lambda ^{n-i}{\boldsymbol {z}}[i]{{\xi}}^{\mathsf {H}}[i] \\=&\lambda {\boldsymbol {C}}_{Z\Xi }[n-1]+{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n], \tag{15a}\\ {\boldsymbol {C}}_{\Xi G}[n]=&{\boldsymbol {C}}_{G\Xi }^{\mathsf {H}}[n] ={{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {G}}[n] \\=&\sum \nolimits _{i=1}^{n}\lambda ^{n-i}{{\xi}}[i]{\boldsymbol {g}}^{\mathsf {H}}[i] \\=&\lambda {\boldsymbol {C}}_{\Xi G}[n-1]+{{\xi}}[n]{\boldsymbol {g}}^{\mathsf {H}}[n],\tag{15b}\end{align*} where {\boldsymbol {Z}}[n]=\left [{{\boldsymbol {z}}[n],{\boldsymbol {z}}[n-1],\ldots,{\boldsymbol {z}}[{1}]}\right]^{\mathsf {H}} , {\boldsymbol {D}}_\lambda [n]=\text {diag}(1,\lambda,\ldots,\lambda ^{n-1}) , {{\Xi}}[n]=\left [{{{\xi}}[n],{{\xi}}[n-1],\ldots,{{\xi}}[{1}]}\right]^{\mathsf {H}} , {\boldsymbol {G}}[n]=\left [{{\boldsymbol {g}}[n],{\boldsymbol {g}}[n-1],\ldots,{\boldsymbol {g}}[{1}]}\right]^{\mathsf {H}} , and \lambda \in (0,1] is a positive forgetting factor (FF).
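The rank-one updates in (15a)-(15b) and the simple IV solution (16) translate directly into the following sketch; the explicit matrix inverse stands in for the matrix-inversion-lemma update mentioned in the text, and all names are illustrative.

```python
import numpy as np

def eiv_covariance_update(C_zxi, C_xig, z, xi, g, lam):
    """Exponentially weighted rank-one updates of C_{Z Xi} and C_{Xi G}, cf. (15a)-(15b)."""
    C_zxi = lam * C_zxi + np.outer(z, xi.conj())
    C_xig = lam * C_xig + np.outer(xi, g.conj())
    return C_zxi, C_xig

def simple_iv_subspace(C_zxi, C_xig):
    """Simple IV estimate (16): W = C_{Z Xi} C_{Xi G}^{-H} (square C_{Xi G})."""
    return C_zxi @ np.linalg.inv(C_xig.conj().T)
```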

For the simple IV method, the inverse of {\boldsymbol {C}}_{\Xi G}[n] can be recursively updated from (15b) using the matrix inversion lemma. Hence, the solution of \boldsymbol {W} at time n , {\boldsymbol {W}}[n] , can be obtained as \begin{equation*} {\boldsymbol {W}}[n]={\boldsymbol {C}}_{Z\Xi }[n]{\boldsymbol {C}}_{\Xi G}^{-\mathsf {H}}[n].\tag{16}\end{equation*} For the EIV method, (14) can be similarly solved using the RLS algorithm, as shown in Table 1 [22], [33], [37]. However, since the conditioning of the system matrix \hat {{\boldsymbol {C}}}_{G\Xi }[n]\hat {{\boldsymbol {C}}}_{\Xi G}[n] is deteriorated by the squaring operation, the accuracy of the resulting RLS solution may be compromised. A recursive EIV algorithm based on the factorization of \hat {{\boldsymbol {C}}}_{G\Xi }[n]\hat {{\boldsymbol {C}}}_{\Xi G}[n] and hyperbolic transformations was proposed in [38]. Due to the square-root (SR) operation, the condition number of the system is considerably improved. Therefore, we utilize this algorithm for estimating {\boldsymbol {W}}[n] in the proposed subspace-based system identification algorithm. The resultant algorithm based on the SREIV-PAST is summarized in Table 2. Interested readers are referred to [38] for the detailed derivations of the algorithm. Moreover, in Section III, we shall introduce a new VFF SREIV-PAST algorithm for estimating {\boldsymbol {W}}[n] in a time-varying subspace, in which the FF is chosen to minimize the mean square deviation (MSD) of the estimator.

TABLE 1 The ECF-ICF-RSI Algorithm [22]
TABLE 2 The SR-ECF-ICF-RSI Algorithm [38]

Similarly to (6), {\boldsymbol {A}}[n] and {\boldsymbol {C}}[n] can be extracted from {\boldsymbol {W}}[n] as follows \begin{align*} \hat {{\boldsymbol {A}}}[n]=&{\boldsymbol {W}}_{1:p(\alpha -1)}^\dagger [n]{\boldsymbol {W}}_{(p+1):p\alpha }[n], \tag{17a}\\ \hat {{\boldsymbol {C}}}[n]=&{\boldsymbol {W}}_{1:p}[n], \tag{17b}\end{align*} where the subscript (\cdot)_{1:p} represents the first row to the p -th row. Alternatively, the pseudo-inverse (\cdot)^\dagger can be recursively estimated with lower complexity by applying the bi-iteration singular value decomposition (SVD) [30] to the matrix {\boldsymbol {W}}_{1:p(\alpha -1)} . More specifically, the iteration reads as follows.

For l=1,2,\ldots until convergence, iterate:\begin{align*} {\boldsymbol {X}}_{B}^{(l)}=&{\boldsymbol {W}}_{1:p(\alpha -1)}{\boldsymbol {Q}}_{A}^{(l)} \\ {\boldsymbol {X}}_{B}^{(l)}=&{\boldsymbol {Q}}_{B}^{(l)}{\boldsymbol {R}}_{B}^{(l)} \qquad (\text {QR factorization}) \\ {\boldsymbol {X}}_{A}^{(l)}=&{\boldsymbol {W}}_{1:p(\alpha -1)}^{\mathsf {H}}{\boldsymbol {Q}}_{B}^{(l)} \\ {\boldsymbol {X}}_{A}^{(l)}=&{\boldsymbol {Q}}_{A}^{(l+1)}{\boldsymbol {R}}_{A}^{(l)} \quad (\text {QR factorization})\tag{18}\end{align*} In this recurrence, {\boldsymbol {Q}}_{A}^{(l)} and {\boldsymbol {Q}}_{B}^{(l)} will converge toward the N dominant left and right singular vectors of {\boldsymbol {W}}_{1:p(\alpha -1)} , respectively. Furthermore, both triangular matrices {\boldsymbol {R}}_{A}^{(l)} and {\boldsymbol {R}}_{B}^{(l)} will approach the N\times N diagonal matrix composed of the dominant singular values of {\boldsymbol {W}}_{1:p(\alpha -1)} . Instead of using the conventional identity matrix, the converged {\boldsymbol {Q}}_{A} at time instant n can be used as the initial value {\boldsymbol {Q}}_{A}^{(0)} for the next time instant n+1 . Simulation results in Section IV show that the number of iterations can be significantly reduced compared with the conventional bi-iteration SVD algorithm. Usually, the iteration converges in around five iterations.
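A direct numpy transcription of the recursion in (18) is sketched below; warm-starting Q_A with the value converged at the previous time instant, as described above, usually makes one or two sweeps sufficient, and the diagonal of R_A can then be read off as the dominant singular value estimates.

```python
import numpy as np

def bi_iteration_svd(W, Q_A, n_iter=5):
    """Bi-iteration of (18) applied to W (here W_{1:p(alpha-1)}).
       Q_A, Q_B track the dominant singular subspaces of W, and R_A, R_B
       approach diagonal matrices holding the dominant singular values."""
    for _ in range(n_iter):
        X_B = W @ Q_A
        Q_B, R_B = np.linalg.qr(X_B)          # QR factorization
        X_A = W.conj().T @ Q_B
        Q_A, R_A = np.linalg.qr(X_A)          # QR factorization
    return Q_A, Q_B, R_A, R_B

# Warm start at time n+1 with the Q_A obtained at time n, e.g.
# Q_A, Q_B, R_A, R_B = bi_iteration_svd(W_new, Q_A, n_iter=1)
```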

(17a) can be recursively solved with the help of (18) using the multivariate complex RLS algorithm [32]. In Section III-C, we shall discuss how the model order of {\boldsymbol {A}}[n] and {\boldsymbol {C}}[n] can be recursively estimated.

C. Recursive Estimation of B and D

Given \hat {{\boldsymbol {A}}}[n] and \hat {{\boldsymbol {C}}}[n] , the remaining system matrices \boldsymbol {B} and \boldsymbol {D} can be estimated from the following linear regression according to [39] \begin{equation*} {\boldsymbol {y}}[n]=\Phi ^{\mathsf {H}}[n]{{\theta}}[n]+{\boldsymbol {v}}[n],\tag{19}\end{equation*} where \begin{align*} {{\theta}}[n]=&\left [{\text {vec}^{\mathsf {T}}(\boldsymbol{D}) \quad \text {vec}^{\mathsf {T}}(\boldsymbol{B})}\right]^{\mathsf {T}}, \tag{20a}\\ \Phi ^{\mathsf {H}}[n]=&\left[{({\boldsymbol {u}}^{\mathsf {H}}[n]\otimes {\boldsymbol {I}}_{p}) \quad \left({\sum \nolimits _{s=1}^{n-1}{\boldsymbol {u}}^{\mathsf {H}}[s]\otimes \hat {{\boldsymbol {C}}}\hat {{\boldsymbol {A}}}^{n-s-1}}\right)}\right],\tag{20b}\end{align*} \text {vec}(\cdot) denotes the vectorization operator, and \otimes is the Kronecker matrix product. The optimal weight vector therefore satisfies the following normal equation \begin{equation*} {\boldsymbol {R}}_{\Phi \Phi }[n]{{\theta}}_{opt}[n]={\boldsymbol {r}}_{\Phi Y}[n],\tag{21}\end{equation*} where {\boldsymbol {R}}_{\Phi \Phi }[n]=\sum \nolimits _{i=1}^{n}\lambda ^{n-i}\Phi [i]\Phi ^{\mathsf {H}}[i] and {\boldsymbol {r}}_{\Phi Y}[n]=\sum \nolimits _{i=1}^{n}\lambda ^{n-i}\Phi [i]{\boldsymbol {y}}[i] . Hence \hat {{{\theta}}}[n] , and thus \hat {{\boldsymbol {B}}}[n] and \hat {{\boldsymbol {D}}}[n] , can be recursively updated by the multivariate RLS algorithm [40].
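The second block of the regressor in (20b) contains a running sum that can be propagated recursively via the Kronecker mixed-product rule, ({\boldsymbol {u}}^{\mathsf {H}}[s]\otimes \hat {{\boldsymbol {C}}}\hat {{\boldsymbol {A}}}^{n-s-1})({\boldsymbol {I}}_{M}\otimes \hat {{\boldsymbol {A}}})={\boldsymbol {u}}^{\mathsf {H}}[s]\otimes \hat {{\boldsymbol {C}}}\hat {{\boldsymbol {A}}}^{n-s} . The sketch below uses this identity under the simplifying assumption that the current estimates of A and C are held fixed between updates; the helper names are illustrative and not part of the referenced algorithms.

```python
import numpy as np

def regressor_sum_update(S, u, A_hat, C_hat):
    """Propagate S[n+1] = S[n] (I_M kron A_hat) + u^H[n] kron C_hat,
       i.e. the second block of Phi^H in (20b), assuming A_hat, C_hat fixed."""
    M = u.shape[0]
    return S @ np.kron(np.eye(M), A_hat) + np.kron(u.conj().reshape(1, -1), C_hat)

def build_phi_H(S, u, p):
    """Assemble Phi^H[n] = [ u^H[n] kron I_p , S[n] ] as in (20b)."""
    return np.hstack([np.kron(u.conj().reshape(1, -1), np.eye(p)), S])
```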

In this work, we shall utilize the complex VFF-RLS algorithm with multiple outputs in [32] to estimate the matrices \hat {{\boldsymbol {B}}}[n] and \hat {{\boldsymbol {D}}}[n] . It is based on the QR decomposition and the single-output VFF-RLS algorithm proposed in [26] for real-valued inputs. The algorithm is summarized in Table 3.

TABLE 3 The QRD-Based SC-ELF-ILF-RSI Algorithm

SECTION III.

The Proposed SC-VFF-SREIV and -RLS RSI Algorithm

A. The Proposed VFF-SREIV-PAST Algorithm

In the proposed VFF-SREIV-PAST algorithm for tracking the subspace of \Gamma _\alpha , the FF is chosen to optimize the MSD of the estimator. Therefore, we need to obtain a mathematical expression of the MSD in terms of the FF used. More precisely, we wish to analyze the MSD of the SREIV-PAST algorithm for a time-varying subspace with coefficients modelled by a local polynomial [26], [32]. We shall first show that the MSD consists of a bias term which increases with the FF and a variance term which decreases with the FF. Therefore, it is possible to select an appropriate FF which minimizes the MSD. Then, we shall discuss the estimation of the quantities required for computing the FF. Finally, the SCAD regularization is incorporated into the VFF-SREIV-PAST algorithm to reduce the estimation variance of the subspace during signal fading.

1) Local Polynomial Modeling

From (11), we can observe that the k -th element of {\boldsymbol {z}}[m] , z_{k}[m] , is equal to {\boldsymbol {w}}_{k}^{\mathsf {T}}[m]{\boldsymbol s}[m] , where {\boldsymbol {w}}_{k}[m] is the k -th row of {\boldsymbol {W}}[m] . To simplify the analysis and derivation, we assume that the underlying time-varying vector {\boldsymbol {h}}_{k} that {\boldsymbol {w}}_{k}[n] is tracking admits a local first-order polynomial expansion at time index t_{n} as follows [32] \begin{equation*} {\boldsymbol {h}}_{k}(t_{m})={\boldsymbol {h}}_{k}(t_{n})+\frac {1}{1!}{\boldsymbol {h}}_{k}^{(1)}(t_{n})(t_{m}-t_{n})+{\boldsymbol {r}}_{k}(t_{m}-t_{n})\tag{22}\end{equation*} where t_{m} belongs to an appropriately close neighborhood B(t_{n}) of t_{n} , {\boldsymbol {h}}_{k}^{(1)}(t_{n}) is the derivative of {\boldsymbol {h}}_{k}(t) at time t_{n} and {\boldsymbol {r}}_{k}(t_{m}-t_{n}) is the remainder. Both {\boldsymbol {h}}_{k}^{(1)}(t_{n}) and {\boldsymbol {r}}_{k}(t_{m}-t_{n}) can be considered as random vectors, and the latter is of order \mathcal {O}(t_{m}-t_{n}) , which implies that \Vert {\boldsymbol {r}}_{k}(t_{m}-t_{n})(t_{m}-t_{n})^{-1}\Vert _{2} tends to zero as \vert t_{m}-t_{n}\vert \to 0 . For simplicity, we also assume that {\boldsymbol {h}}_{k}^{(1)}(t_{n}) and {\boldsymbol {r}}_{k}(t_{m}-t_{n}) are stationary processes inside B(t_{n}) .

Substituting (22) into (8) with t_{n}=nT_{s} , where T_{s} is the sampling period, the k -th component of the vector \boldsymbol {z} observed at the m -th sampling instant can be written as \begin{align*} z_{k}^{*}[m]=&{\boldsymbol {h}}^{\mathsf {H}}[m]{\boldsymbol {h}}_{k}[m]+\eta _{k}[m] \\=&{\boldsymbol {h}}^{\mathsf {H}}[m]({\boldsymbol {h}}_{k}[n]+{\boldsymbol {h}}_{k}^{(1)}[n](m-n))+\eta _{k}[m]+\varsigma _{k}[m]\tag{23}\end{align*} where {\boldsymbol {h}}_{k}[n]={\boldsymbol {h}}_{k}(nT_{s}) , {\boldsymbol {h}}_{k}^{(1)}[n]={\boldsymbol {h}}_{k}^{(1)}(nT_{s})T_{s} , and \varsigma _{k}[m]={\boldsymbol {h}}^{\mathsf {H}}[m]{\boldsymbol {r}}_{k}[m-n] with {\boldsymbol {r}}_{k}[n]={\boldsymbol {r}}_{k}(nT_{s}) . \eta _{k}[m] denotes the additive noise, which may arise from the projection approximation or other sources.

Stacking the observed signal samples z_{k}[m] , m=n,\ldots,1 , we obtain the following model for the observation signal vector \begin{align*} {\boldsymbol {z}}_{k}^{*}[n]={\boldsymbol {G}}[n]{\boldsymbol {h}}_{k}[n]+{\boldsymbol {D}}_\tau [n]{\boldsymbol {G}}[n]{\boldsymbol {h}}_{k}^{(1)}[n]+{{\eta}}_{k}[n]+{{\varsigma}}_{k}[n], \tag{24}\end{align*} where {\boldsymbol {D}}_\tau [n]=\text {diag}({{\tau}}[n]) with {{\tau}}[n]=[0,-1,\ldots, -(n-1)]^{\mathsf {T}} , {{\varsigma}}_{k}[n]=\left [{\varsigma _{k}[n],\varsigma _{k}[n-1],\ldots,\varsigma _{k}[{1}]}\right]^{\mathsf {T}} is the residual component vector due to the higher order terms, and {{\eta}}_{k}[n]=\left [{\eta _{k}[n],\eta _{k}[n-1],\ldots,\eta _{k}[{1}]}\right]^{\mathsf {T}} is the additive noise vector.

For simplicity of analysis, we assume that

  1. {\boldsymbol {r}}_{k}[m-n] , {{\eta}}_{k}[n] for k=1,\ldots,K , and {\boldsymbol {G}}[n] are independent.

  2. {\boldsymbol {r}}_{k}[m-n] and {{\eta}}_{k}[n] for k=1,\ldots,K , are zero mean with covariances given respectively by \mathbb{E}\left [{{{\eta}}_{k}[n]{{\eta}}_{k}^{\mathsf {H}}[m]}\right]=\sigma _{\eta _{k}}^{2}\delta [n-m]{\boldsymbol {I}} and \mathbb{E}\left [{{\boldsymbol {r}}_{k}[n]{\boldsymbol {r}}_{k}^{\mathsf {H}}[m]}\right]=\sigma _{r_{k}}^{2}\delta [n-m]{\boldsymbol {I}} , where \delta [n] is the unit impulse function and \boldsymbol {I} is the identity matrix with appropriate dimension.

  3. {\boldsymbol {h}}_{k}^{(1)}[n] has mean \mathbb{E}\left [{{\boldsymbol {h}}_{k}^{(1)}[n]}\right]=\bar {{\boldsymbol {h}}}_{k}^{(1)}[n] and covariance \mathbb{E}\left [{\delta {\boldsymbol {h}}_{k}^{(1)}[n]\delta {\boldsymbol {h}}_{k}^{(1)H}[n]}\right]=\sigma _{h_{k}}^{2}[n]{\boldsymbol {I}} , where {\boldsymbol {h}}_{k}^{(1)}[n]=\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]+\delta {\boldsymbol {h}}_{k}^{(1)}[n] .

For remarks regarding the motivation behind these assumptions, interested readers are referred to [32] and the supplementary materials for more details. From the simulation results to be presented in Section IV, we found that the proposed VFF-SREIV-PAST algorithm, which is derived under the above assumptions, performs satisfactorily on both synthetic and real data. This suggests that the above assumptions are reasonable.

To start with, we substitute (24) into (16) and obtain the following expression of the EIV solution under the LPM time-varying channel model:\begin{equation*} {\boldsymbol {w}}_{k}[n]={\boldsymbol {h}}_{k}[n]+{\boldsymbol {B}}[n]{\boldsymbol {h}}_{k}^{(1)}[n]+{\boldsymbol {C}}[n]({{\eta}}_{k}[n]+{{\varsigma}}_{k}[n])\tag{25}\end{equation*} where {\boldsymbol {B}}[n]=({\boldsymbol {C}}_{\Xi G}^{\mathsf {H}}[n]{\boldsymbol {C}}_{\Xi G}[n])^{-1}{\boldsymbol {R}}_\tau [n] , {\boldsymbol {R}}_\tau [n]={\boldsymbol {G}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {D}}_\tau [n]{\boldsymbol {G}}[n] and {\boldsymbol {C}}[n]=({\boldsymbol {C}}_{\Xi G}^{\mathsf {H}}[n]{\boldsymbol {C}}_{\Xi G}[n])^{-1}{\boldsymbol {G}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n] . Moreover, we have used the fact that ({\boldsymbol {C}}_{\Xi G}^{\mathsf {H}}[n]{\boldsymbol {C}}_{\Xi G}[n])^{-1}{\boldsymbol {G}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {G}}[n] is equal to the identity matrix.

2) VFF Recursive EIV Algorithm for Estimating A and C

The mean square deviation (MSD) of the estimator {\boldsymbol {W}}[n] is equal to the sum of the MSEs of all its components {\boldsymbol {w}}_{k}[n] . In the Appendix, the latter is evaluated under the local polynomial time-varying model, and hence the total MSD is given by (A.13) as follows \begin{equation*} J_{MSD}[n]\approx \frac {1-\lambda [n]}{1+\lambda [n]}\sigma _{\Sigma }^{2}\text {Tr}(\Omega)+\frac {\lambda ^{2}[n]\sigma _{h}^{2}[n]}{(1-\lambda [n])^{2}},\tag{26}\end{equation*} where {{\Omega}}=({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-1}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi \xi }{\boldsymbol {C}}_{\xi g}({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-H} , and \sigma _\Sigma ^{2} and \sigma _{h}^{2} denote the total error variance and the variance of the true parameter, respectively.

By minimizing J_{MSD}[n] with respect to \lambda [n] , one can equate the partial derivative of (26) with respect to \lambda [n] to zero. This yields \begin{equation*} \frac {\partial J_{MSD}[n]}{\partial \lambda [n]}=-\frac {2\sigma _\Sigma ^{2}\text {Tr}(\Omega)}{(1+\lambda [n])^{2}}+\frac {2\lambda [n]\sigma _{h}^{2}[n]}{(1-\lambda [n])^{3}}=0,\tag{27}\end{equation*} which can be solved for the desired variable FF (VFF). To this end, we let \mu =(1+\lambda [n])/(1-\lambda [n]) so that (27) can be reduced to \mu ^{2}(\mu -1)\approx 2\sigma _\Sigma ^{2}[n]\text {Tr}(\Omega)/\sigma _{h}^{2}[n]\approx \mu ^{3} . Consequently, the VFF can be determined as follows \begin{align*} \lambda [n]=&\frac {\mu -1}{\mu +1}, \quad \text {if} \quad \lambda [n]>0, \\ \mu=&(2\sigma _\Sigma ^{2}[n]\text {Tr}(\Omega)/\sigma _{h}^{2}[n])^{1/3}.\tag{28}\end{align*} Note that \sigma _{h}^{2}[n]=K\mathbb{E}\left [{\Vert \delta {\boldsymbol {h}}_{k}^{(1)}[n]\Vert _{2}^{2}}\right]+\sum \nolimits _{k=1}^{K}\Vert \bar {{\boldsymbol {h}}}_{k}^{(1)}[n]\Vert _{2}^{2}=\mathbb{E} \left [{\sum \nolimits _{k=1}^{K}{\boldsymbol {h}}_{k}^{(1)H}[n]{\boldsymbol {h}}_{k}^{(1)}[n]}\right] , and it can be recursively estimated as \begin{equation*}\hat {\sigma }_{h}^{2}[n+1] =\lambda [n]\hat {\sigma }_{h}^{2}[n]+(1-\lambda [n])\sum \nolimits _{k=1}^{K}\hat {{\boldsymbol {h}}}_{k}^{(1)H}[n]\hat {{\boldsymbol {h}}}_{k}^{(1)}[n],\tag{29}\end{equation*} where \hat {{\boldsymbol {h}}}_{k}^{(1)}[n] is an estimate of {\boldsymbol {h}}_{k}^{(1)}[n] . For instance, \hat {{\boldsymbol {h}}}_{k}^{(1)}[n] can be estimated from the difference of the weight vectors obtained at time instants n and n-1 as \hat {{\boldsymbol {h}}}_{k}^{(1)}[n]={\boldsymbol {w}}_{k}[n]-{\boldsymbol {w}}_{k}[n-1] . Hence, the system variance \sigma _{h}^{2} can be updated as \begin{equation*} \hat {\sigma }_{h}^{2}[n+1]=\lambda [n]\hat {\sigma }_{h}^{2}[n]+(1-\lambda [n])\Vert \Delta {\boldsymbol {W}}[n]\Vert _{F}^{2},\tag{30}\end{equation*} where \Delta {\boldsymbol {W}}[n]={\boldsymbol {W}}[n]-{\boldsymbol {W}}[n-1] . For the noise variance \sigma _\Sigma ^{2} , such information can be computed from the error term {\boldsymbol {e}}[n] by using a relatively large FF \lambda _{L} as follows \begin{equation*} \hat {\sigma }_\Sigma ^{2}[n+1]=\lambda _{L}\hat {\sigma }_\Sigma ^{2}[n]+(1-\lambda _{L})\Vert {\boldsymbol {e}}[n]\Vert ^{2}.\tag{31}\end{equation*} Due to page limitations, the estimation of \Omega and other details are described in the supplementary material.
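A compact sketch of the VFF update implied by (28)-(31) is given below; the clipping bounds lam_min and lam_max and the small floor on the variance estimate are safeguards of our own and are not specified in the text, and tr_Omega is assumed to be supplied by the estimator of Tr(\Omega) described in the supplementary material.

```python
import numpy as np

def update_vff(sigma_S2, sigma_h2, e, dW, tr_Omega, lam_prev,
               lam_L=0.99, lam_min=0.9, lam_max=0.9999):
    """Locally optimal FF of (28) using the recursive variance estimates (30)-(31)."""
    # (31): long-term error variance with a large fixed FF lam_L
    sigma_S2 = lam_L * sigma_S2 + (1.0 - lam_L) * np.linalg.norm(e) ** 2
    # (30): parameter-variation (system) variance from the subspace increment dW = W[n]-W[n-1]
    sigma_h2 = lam_prev * sigma_h2 + (1.0 - lam_prev) * np.linalg.norm(dW, 'fro') ** 2
    # (28): mu = (2 sigma_S2 Tr(Omega) / sigma_h2)^(1/3), lambda = (mu - 1)/(mu + 1)
    mu = (2.0 * sigma_S2 * tr_Omega / max(sigma_h2, 1e-12)) ** (1.0 / 3.0)
    lam = float(np.clip((mu - 1.0) / (mu + 1.0), lam_min, lam_max))
    return lam, sigma_S2, sigma_h2
```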

Using the VFF, one can vary the FF according to (28) in the recursive EIV algorithms in Tables 1 and 2 for tracking the subspace used to compute \boldsymbol {A} and \boldsymbol {C} . It should be noted that both the SREIV-PAST algorithm in Table 2 and the EIV square-root propagator method (EIV sqrtPM) proposed in [41] are derived from the classical work of Porat and Friedlander [38]. The EIV sqrtPM algorithm extends the algorithm in [38] to the multivariable case using a fixed FF, whereas the proposed SREIV-PAST algorithm focuses on the complex-valued case and on the adaptation and realization of the variable FF and regularization parameter. The complex-valued case is motivated by the fact that observations in some communication systems are complex-valued. As mentioned in [42], the propagator method and the SREIV-PAST algorithm differ in the exact subspace vectors they track. However, the subspaces estimated are fundamentally identical. Therefore, their performances are very close to each other. The SREIV-PAST with a constant FF will exhibit a performance very similar to that of the complex extension of the EIV sqrtPM in [41]. Therefore, we only compare our proposed algorithm with the SREIV-PAST algorithm, which is almost identical to the complex EIV sqrtPM algorithm.

The convergence of the RSI algorithm has been studied in [42]. Since we are using a VFF SREIV-PAST algorithm instead of the conventional PAST algorithm, we study the convergence of the variable forgetting factor (VFF) EIV method in the supplementary material using the ordinary differential equation (ODE) method [40]. It is found that, under reasonable assumptions and conditions (C1) and (C2), the algorithm converges to the true subspace in a stationary environment if \lambda [n]\in (0,1] is monotonically increasing and approaches a value of one. Like the VFF-PAST algorithm in [32], the asymptotic MSE is minimized via the forgetting factor in (28). As the recursive EIV algorithm is convergent if {\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g} is nonsingular, \sigma _{h}^{2} will decrease. Hence \mu and \lambda [n] will increase continuously. This satisfies the monotonicity condition above if \lambda [n] is allowed to change smoothly, so that the recursive EIV algorithm can converge gradually to a lower MSE. If \sigma _{h}^{2} is equal to zero, we immediately obtain \lambda [n]=1 , which is the desired result. However, in practice, \sigma _{h}^{2} is estimated online and is unlikely to be zero, so \lambda [n] will reach a value close to one at convergence. This is observed in the simulation results to be presented in Figs. 4 and 5, where the forgetting factor increases monotonically and reaches a value close to 0.99 at time instant 400. As suggested in [32], if a high accuracy is required in a stationary environment, one can detect the convergence of the algorithm and push the forgetting factor further.

FIGURE 2. Estimated denominator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms under 17 dB SNR. ‘ECF’ and ‘ELF’ denote the SREIV-PAST algorithm with constant FF and local optimal FF, respectively. ‘ICF’ and ‘ILF’ represent the recursive algorithm to estimate the input-related matrices {B} and {D} with constant FF and local optimal FF. ‘SC’ is the abbreviation for SCAD regularization.

FIGURE 3. Estimated numerator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI, SC-ELF-ILF-RSI, PI-ECF-ICF-RSI and PI-SC-ELF-ILF-RSI algorithms under 17 dB SNR.

FIGURE 4. Change of the optimal forgetting factor in the ELF-ILF-RSI algorithm.

FIGURE 5. Change of the optimal forgetting factor in the ELF-ILF-RSI algorithm.

For time-varying subspaces, the MSE mainly depends on the subspace variation and the estimate may not converge to a fixed value. In the supplementary material, the stability of the proposed VFF algorithm is analyzed. As expected, if the system is persistently excited, i.e. {\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g} is positive definite, and the noise and channel variances are finite, then the MSD of the proposed VFF algorithm as n\to \infty is bounded if 0< \lambda [n]< 1 .

Next, we consider the estimation of \boldsymbol {B} and \boldsymbol {D} using the VFF multivariate RLS algorithm.

B. The VFF RLS Algorithm for Estimating B and D

Since \boldsymbol {B} and \boldsymbol {D} can be estimated using the linear regression model in (19), they can be recursively estimated using the complex-valued LOFF RLS algorithm in [32]. Moreover, it can be implemented using the QRD, as shown in step (vi) (normal update) of Table 3.

Moreover, the MSD under the LPM modeling of {{\theta}}[n] can be similarly obtained as follows \begin{equation*} \tilde {J}_{MSD}[n]\approx \frac {1-\tilde {\lambda }[n]}{1+\tilde {\lambda }[n]}\tilde {\sigma }_{\Sigma }^{2}\text {Tr}({\boldsymbol {R}}_{\Phi \Phi }^{-1})+\frac {\tilde {\lambda }^{2}[n]\sigma _\theta ^{2}[n]}{(1-\tilde {\lambda }[n])^{2}},\tag{32}\end{equation*} with the following LOFF \begin{align*} \tilde {\lambda }_{opt}[n]=&\frac {\tilde {\mu }-1}{\tilde {\mu }+1}, \quad \text {if} \quad \tilde {\lambda }_{opt}[n]>0, \\ \tilde {\mu }=&(2\tilde {\sigma }_\Sigma ^{2}[n]\text {Tr}({\boldsymbol {R}}_{\Phi \Phi }^{-1})/\sigma _\theta ^{2}[n])^{1/3},\tag{33}\end{align*} where \tilde {\sigma }_\Sigma ^{2}[n] and \sigma _\theta ^{2}[n] can be estimated as \tilde {\sigma }_\Sigma ^{2}[n+1]=\lambda _{L}\tilde {\sigma }_\Sigma ^{2}[n]+(1-\lambda _{L})\Vert \tilde {{\boldsymbol {e}}}[n]\Vert ^{2} and \hat {\sigma }_\theta ^{2}[n+1]=\lambda [n]\hat {\sigma }_\theta ^{2}[n]+(1-\lambda [n])\Delta {{\theta}}^{\mathsf {H}}[n]\Delta {{\theta}}[n] , where \tilde {{\boldsymbol {e}}}[n]={\boldsymbol {y}}[n]-\Phi ^{\mathsf {H}}[n]{{\theta}}[n-1] and \Delta {{\theta}}[n]={{\theta}}[n]-{{\theta}}[n-1] .

C. Model Order Estimation for the Proposed LOFF-EIV Recursive Subspace Identification (RSI) Algorithms

1) The Proposed Model Order Estimation for A

With the LOFF recursive EIV algorithm, we can obtain the subspace estimate {\boldsymbol {W}}[n] , and hence the matrices \boldsymbol {A} and \boldsymbol {C} . On the other hand, the matrices \boldsymbol {B} and \boldsymbol {D} can be estimated using the LOFF complex RLS algorithm.

In some real applications, one may wish to estimate the order of the matrix \boldsymbol {A} and hence \boldsymbol {C} . This can be gauged from the rank of \Gamma _\alpha , which in turn is related to the number of dominant eigenvalues of {\boldsymbol {C}}_{z\xi } . Since we have already estimated the K -dimensional subspace {\boldsymbol {W}}[n] , one can further estimate recursively the eigenvalues associated with {\boldsymbol {W}}[n] . Consequently, we can retain the dominant eigenvalues and estimate the model order of \boldsymbol {A} . More precisely, the singular values of \boldsymbol {A} can be computed recursively via the bi-iteration SVD in (18) with {\boldsymbol {W}}_{1:p(\alpha -1)} replaced by \boldsymbol {A} , instead of the batch SVD algorithm, which requires \mathcal {O}(N^{3}) arithmetic complexity. The corresponding matrices {\boldsymbol {Q}}_{A}^{(l)} and {\boldsymbol {Q}}_{B}^{(l)} will then converge toward the N dominant left and right singular vectors of \boldsymbol {A} , respectively. Furthermore, both triangular matrices {\boldsymbol {R}}_{A}^{(l)} and {\boldsymbol {R}}_{B}^{(l)} will approach the N\times N diagonal matrix composed of the dominant singular values of \boldsymbol {A} . Usually, the diagonal values of {\boldsymbol {R}}_{A}^{(l)} are chosen as the dominant singular values. Instead of using the conventional identity matrix as the initial value {\boldsymbol {Q}}_{A}^{(0)} , the converged {\boldsymbol {Q}}_{A} at time instant n can be used for {\boldsymbol {Q}}_{A}^{(0)} at the next time instant n+1 . Simulation results in Section IV show that the number of iterations can be significantly reduced from five to one compared with the conventional bi-iteration SVD algorithm.

Classical model order estimation criteria such as the Akaike information criterion (AIC) and the minimum description length (MDL) usually assume that the minor singular values are equal. However, this assumption may not always be met in practice. Here, we shall investigate another hypothesis proposed in [43], which aims to find the model order such that the proportion of variance explained by the minor components is less than a certain value.

Let l_{1},l_{2},\ldots,l_{P} be the eigenvalues of \boldsymbol {A} , and consider the following estimated proportion of variance in the first K principal signal subspace \begin{equation*} \hat {\Psi }_{K}=\left({\sum \nolimits _{k=1}^{K}\hat {l}_{k}}\right)\Big /\left({\sum \nolimits _{p=1}^{P}\hat {l}_{p}}\right)\tag{34}\end{equation*} where \hat {l}_{k} is the k -th estimated eigenvalue of \boldsymbol {A} in descending order of magnitude. It can be shown that \hat {\Psi }_{K} is asymptotically normal with mean \Psi _{K}=\left({\sum \nolimits _{k=1}^{K}l_{k}}\right)/\left({\sum \nolimits _{p=1}^{P}l_{p}}\right) and variance \begin{equation*} \tau _{K}^{2}=\frac {2\text {Tr}({\boldsymbol {A}}^{2})}{(N-1)(\text {Tr}({\boldsymbol {A}}))^{2}}[\Psi _{K}^{2}-2\alpha \Psi _{K}+\alpha]\tag{35}\end{equation*} where \alpha =\left({\sum \nolimits _{k=1}^{K}l_{k}^{2}}\right)/\left({\sum \nolimits _{p=1}^{P}l_{p}^{2}}\right) and N denotes the effective sample number, say N=1/(1-\lambda [n]) . Hence, (35) may be used to derive approximate confidence intervals for \Psi _{K} . Consequently, we can determine the minimum K=1,\ldots,P such that the lower limit of the confidence interval \begin{equation*} \left [{\hat {\Psi }_{K}-\chi \sqrt {\tau _{K}^{2}/N},\hat {\Psi }_{K}+\chi \sqrt {\tau _{K}^{2}/N}}\right]\tag{36}\end{equation*} exceeds a specified proportion of explained variance. For instance, the constant \chi can be selected as 1.96 for a 95% confidence level.
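The criterion of (34)-(36) amounts to scanning K and stopping when the lower confidence limit of the explained-variance ratio reaches a prescribed level; a sketch is shown below, where the target level (e.g. 0.95) and the use of the recursively estimated dominant singular values of A as the l_{k} are assumptions about how the criterion would be wired into the recursive algorithm.

```python
import numpy as np

def order_by_explained_variance(l, lam, chi=1.96, target=0.95):
    """Smallest K whose explained-variance lower confidence limit (34)-(36)
       reaches `target`; l holds the eigenvalue estimates of A in descending
       order and N_eff = 1/(1-lambda) is the effective sample size."""
    l = np.asarray(l, dtype=float)
    N_eff = 1.0 / (1.0 - lam)
    total, total_sq = l.sum(), (l ** 2).sum()
    for K in range(1, len(l) + 1):
        psi = l[:K].sum() / total                               # (34)
        alpha = (l[:K] ** 2).sum() / total_sq
        tau2 = 2.0 * total_sq / ((N_eff - 1.0) * total ** 2) \
               * (psi ** 2 - 2.0 * alpha * psi + alpha)         # (35)
        lower = psi - chi * np.sqrt(max(tau2, 0.0) / N_eff)     # lower limit of (36)
        if lower >= target:
            return K
    return len(l)
```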

2) Automatic Model Selection of B and D Using SCAD Regularization

For the matrices \boldsymbol {B} and \boldsymbol {D} , it is more convenient to employ robust regularization to select the model order automatically online. Regularization is also valuable for mitigating possible ill-conditioning when the input is not persistently exciting, for example when the input signal level is very low. In that case, the covariance matrix {\boldsymbol {R}}_{\Phi \Phi } may be ill-conditioned, which will result in a large estimation error variance or even divergence. To address this problem, we propose to utilize the SCAD regularization technique to avoid possible ill-conditioning and perform automatic model selection by shrinking small coefficients in \boldsymbol {B} and \boldsymbol {D} to zero. This will also reduce the variance of the estimates. Unlike L1 regularization, which is biased, a robust regularization technique like SCAD is asymptotically unbiased [27], [44]. The SCAD regularization can be imposed by adding the term \sum \nolimits _{i=1}^{L}f_\mu (\theta _{i}) to the LS objective function of the RLS algorithm \begin{equation*} E_{SC}(\theta)=\sum \nolimits _{i=1}^{n}\lambda _{n,i}(y[i]-\Phi ^{\mathsf {H}}[i]\theta)^{2}+\sum \nolimits _{i=1}^{L}f_\mu (\theta _{i}),\tag{37}\end{equation*} where \lambda _{n,i}=\lambda [n]\lambda _{n-1,i} , \lambda _{n,n}=1 , L=M(N+p) , \begin{align*} f_\mu (\theta) = \begin{cases} \displaystyle {\mu \vert \theta \vert,} & {\text {for}\quad \vert \theta \vert \leq \mu } \\ \displaystyle {-\frac {(\vert \theta \vert -\tilde {a}\mu)^{2}}{2(\tilde {a}-1)}+\frac {(\tilde {a}+1)\mu ^{2}}{2},} & {\text {for}\quad \mu < \vert \theta \vert \leq \tilde {a}\mu } \\ \displaystyle {(\tilde {a}+1)\mu ^{2}/2,} & {\text {for}\quad \vert \theta \vert >\tilde {a}\mu } \end{cases}\tag{38}\end{align*} where \mu is a non-negative regularization parameter and \tilde {a}>2 (usually chosen as 3.7). It was suggested in [44] that the variable regularization parameter can be chosen as \begin{equation*} \mu [n]=(1-\tilde {\lambda }[n])L\bar {\sigma }_\phi ^{2}[n]\sqrt {\tilde {\sigma }_\Sigma ^{2}[n]/(\text {Tr}({\boldsymbol {R}}_{\Phi \Phi })+\epsilon)M_{1}},\tag{39}\end{equation*} where \bar {\sigma }_\phi ^{2}[n] is a long-term estimate of the input signal power, which can be estimated as \bar {\sigma }_\phi ^{2}[n]=(1/n)\sum \nolimits _{i=1}^{n}\text {Tr}({\boldsymbol {R}}_{\Phi \Phi }[i]) , \epsilon is a small positive constant included to avoid ill-conditioning or division by zero when \text {Tr}({\boldsymbol {R}}_{\Phi \Phi }[n]) is small, and M_{1} is the number of nonzero coefficients.

By setting the derivative of (37) with respect to \theta to zero, we obtain the equation for the regularized solution \begin{align*} (\tilde{\lambda}[n]{\boldsymbol{R}}_{\Phi\Phi}[n-1]+\phi[n]\phi^{\mathsf{H}}[n]+{\boldsymbol{K}}[n]){\theta}_{opt}[n]={\boldsymbol{r}}_{\Phi Y}[n],\tag{40}\end{align*} where \begin{align*} {\boldsymbol{K}}[n]=&\,\text{diag}(\kappa_{1}[n],\ldots,\kappa_{L}[n]), \tag{41a}\\ \kappa_{i}[n]=&\,f_\mu'(\theta_{i}[n-1])/\theta_{i}[n-1], \tag{41b}\\ f_\mu'(\theta)=&\begin{cases} \text{sgn}(\theta)\mu, & \text{for}\quad \vert\theta\vert\leq\mu \\ \dfrac{-\theta+\text{sgn}(\theta)\tilde{a}\mu}{\tilde{a}-1}, & \text{for}\quad \mu<\vert\theta\vert\leq\tilde{a}\mu \\ 0, & \text{for}\quad \vert\theta\vert>\tilde{a}\mu, \end{cases}\tag{41c}\end{align*} and \text{sgn}(\theta) denotes the sign function.
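The derivative (41c) and the diagonal weighting matrix {\boldsymbol{K}}[n] in (41a)–(41b) can be sketched as follows; the small constant eps guarding the division by \theta_{i}[n-1] is an implementation assumption, not from the paper.

```python
# Sketch of the SCAD derivative (41c) and the diagonal matrix K[n] of (41a)-(41b).
import numpy as np

def scad_derivative(theta, mu, a=3.7):
    """Elementwise f'_mu(theta) from (41c)."""
    t = np.abs(theta)
    small = np.sign(theta) * mu                                # |theta| <= mu
    mid = (-theta + np.sign(theta) * a * mu) / (a - 1)         # mu < |theta| <= a*mu
    return np.where(t <= mu, small, np.where(t <= a*mu, mid, 0.0))

def scad_K(theta_prev, mu, a=3.7, eps=1e-12):
    """Diagonal K[n] with kappa_i = f'_mu(theta_i[n-1]) / theta_i[n-1]."""
    denom = np.where(np.abs(theta_prev) < eps, eps, theta_prev)   # guard against division by zero
    return np.diag(scad_derivative(theta_prev, mu, a) / denom)
```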

We note that the desired solution \theta in (40) can be efficiently estimated using the QRD structure. The update of {\boldsymbol{R}}_{\Phi\Phi}[n] involves two terms, \phi[n]\phi^{\mathsf{H}}[n] and {\boldsymbol{K}}[n]. The first term is a rank-one update, which can be handled using a conventional QRD. The second term is of full rank, and hence its exact realization would require \mathcal{O}(L^{3}) complexity. To reduce the complexity, we only update one row vector L{\boldsymbol{K}}_{m}[n] at each time instant, either randomly or sequentially, where {\boldsymbol{K}}_{m}[n] is the m-th row of {\boldsymbol{K}}[n] and n mod L = m. Here, mod denotes the modulo operation. Consequently, the QRD operation is executed once for the data vector [\phi^{\mathsf{H}}[n],{\boldsymbol{y}}[n]] and once for the regularization term [\sqrt{L{\boldsymbol{K}}_{m}[n]},\boldsymbol{0}] at each time instant. This yields the proposed SCAD regularized algorithm, as summarized in Table 3; a simplified sketch of the per-instant update is given below.
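The sketch below illustrates the sequential single-row regularization rule (n mod L = m) in explicit normal-equation form rather than the numerically preferable square-root QRD form used in the paper. All names are illustrative, and the single-output case is assumed.

```python
# Hedged normal-equation sketch of one SCAD-regularized update of (40); the paper's
# implementation uses QRD updates of the triangular factor instead.
import numpy as np

def scad_rls_step(R, r, theta, phi, y, lam, mu, n, a=3.7, eps=1e-12):
    L = len(theta)
    m = n % L                                            # row of K[n] refreshed at this instant
    t = theta[m]
    # f'_mu(theta_m) from (41c), then kappa_m = f'_mu(theta_m)/theta_m as in (41b)
    if abs(t) <= mu:
        dp = np.sign(t) * mu
    elif abs(t) <= a * mu:
        dp = (-t + np.sign(t) * a * mu) / (a - 1)
    else:
        dp = 0.0
    kappa_m = dp / (t if abs(t) > eps else eps)
    R = lam * R + np.outer(phi, phi.conj())              # rank-one data update
    R[m, m] += L * kappa_m                               # single regularization entry, scaled by L
    r = lam * r + phi * np.conj(y)
    theta = np.linalg.solve(R, r)                        # corresponds to solving (40)
    return R, r, theta
```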

3) Arithmetic Complexity Comparison

The arithmetic complexities of various algorithms are compared in Table 4. Here, for notational convenience, we shall use 'ECF' and 'ELF' to denote the SREIV-PAST algorithm with a constant and with the locally optimal FF, respectively. 'ICF' and 'ILF' denote, respectively, the recursive algorithm for estimating the input-related matrices \boldsymbol{B} and \boldsymbol{D} with a constant FF and with the locally optimal FF. Finally, 'PI' and 'SC' denote, respectively, the past-input instrumental variable and the SCAD regularization schemes.

TABLE 4 Complexity Comparison

The complexities of the SREIV-PAST and multivariate RLS algorithms are respectively \mathcal{O}(r_{1}^{3})+10r_{1}^{2}+(4L_{1}+8N_{1}+9)r_{1} and \mathcal{O}(r_{2}^{2})+N_{2}r_{2}, where N_{1}=p\alpha, r_{1}=K, N_{2}=(p+N)M, r_{2}=p and L_{1} denotes the dimension of the instrumental variable. The complexity of the PI scheme [34] is \mathcal{O}(M^{2})+4N_{1}M. The proposed LOFF for the SREIV-PAST and multivariate RLS algorithms requires an additional (4N_{1}+6L_{1})r_{1}^{2}+(2N_{1}-2)r_{1} and 3N_{2}r_{2}+3N_{2}+7r_{2} operations, respectively. The SCAD regularization needs N_{2}r_{2}+N_{2} additional operations per time instant. Hence, the total arithmetic complexity of the proposed SC-ELF-ILF-RSI algorithm is \mathcal{O}(r_{1}^{3})+(4N_{1}+6L_{1}+10)r_{1}^{2}+(4L_{1}+10N_{1}+7)r_{1}+\mathcal{O}(r_{2}^{2})+5N_{2}r_{2}+4N_{2}+7r_{2}. It can be seen that the proposed algorithm has an arithmetic complexity comparable to that of the algorithms using a constant FF.
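For a concrete comparison of these counts at given problem sizes, the closed-form expressions above can simply be evaluated; the helper below takes the \mathcal{O}(r_{1}^{3}) and \mathcal{O}(r_{2}^{2}) terms literally as their leading counts, which is an illustrative simplification.

```python
# Illustrative helper that evaluates the operation counts quoted above for given sizes.
def complexity_counts(r1, r2, N1, N2, L1):
    sreiv_past = r1**3 + 10*r1**2 + (4*L1 + 8*N1 + 9)*r1     # SREIV-PAST (leading terms)
    mrls       = r2**2 + N2*r2                                # multivariate RLS
    loff_extra = (4*N1 + 6*L1)*r1**2 + (2*N1 - 2)*r1 + 3*N2*r2 + 3*N2 + 7*r2
    scad_extra = N2*r2 + N2
    return sreiv_past + mrls + loff_extra + scad_extra
```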

SECTION IV.

Numerical Example

We now evaluate the performance of the proposed algorithms by comparing them with other conventional approaches using both simulated and real measurement data. Both stationary and nonstationary environments are investigated. The SISO and MIMO systems tested are widely used in the literature for comparison [33], [34], [42]. For simplicity, we follow the notation for the proposed and compared algorithms introduced in Section III-3 and Table 4.

A. A SISO Time-Invariant System

We first present simulation results for the following SISO time-invariant system studied in [33]: \begin{equation*} y[n]=\frac{q^{-1}-0.5q^{-2}}{1-1.5q^{-1}+0.7q^{-2}}u[n]+\frac{1}{1-0.8q^{-1}}e[n],\tag{42}\end{equation*} where the numerator and denominator coefficient vectors are respectively b=\begin{bmatrix} 1 & -0.5 \end{bmatrix} and a=\begin{bmatrix} 1 & -1.5 & 0.7 \end{bmatrix}, and e[n] is an additive white Gaussian noise. The corresponding state-space model can be expressed as \begin{align*} {\boldsymbol{x}}[n+1]=&\begin{bmatrix} 1.5 & -0.7 \\ 1 & 0 \end{bmatrix}{\boldsymbol{x}}[n]+\begin{bmatrix} 1 \\ 0 \end{bmatrix}u[n], \tag{43a}\\ y[n]=&\begin{bmatrix} 1 & -0.5 \end{bmatrix}{\boldsymbol{x}}[n]+\begin{bmatrix} 0 \end{bmatrix}u[n]+v[n].\tag{43b}\end{align*} The input u[n] is taken as a Gaussian sequence with zero mean and unit variance. The proposed methods are compared with the RSI [33] and PI-SC-ELF-ILF-RSI algorithms at 17 dB SNR. The constant forgetting factor for the ECF-based algorithms is set to 0.99. The minimum and maximum values of the forgetting factor for the proposed ELF-ICF-RSI and ECF-ILF-RSI algorithms are \lambda_{min}=0.97 and \lambda_{max}=0.999, respectively, while the short-term and long-term forgetting factors are set to \lambda_{S}=0.9 and \lambda_{L}=0.99, respectively. The order of the state-space model is chosen as n=2. All results are obtained by averaging 100 independent trial runs.
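A sketch of how the identification data for this benchmark can be generated from the state-space form (43a)–(43b) with colored measurement noise is given below; the noise-scaling step for the 17 dB SNR and the sample count are illustrative assumptions.

```python
# Hedged sketch of simulating the SISO benchmark (42)/(43) with colored measurement noise.
import numpy as np

def simulate_siso(n_samples=1000, snr_db=17.0, rng=np.random.default_rng(0)):
    A = np.array([[1.5, -0.7], [1.0, 0.0]])
    B = np.array([1.0, 0.0])
    C = np.array([1.0, -0.5])
    u = rng.standard_normal(n_samples)                    # zero-mean, unit-variance input
    x = np.zeros(2)
    y_clean = np.zeros(n_samples)
    for n in range(n_samples):
        y_clean[n] = C @ x                                # D = 0
        x = A @ x + B * u[n]
    e = rng.standard_normal(n_samples)
    v = np.zeros(n_samples)                               # v[n] = 0.8 v[n-1] + e[n]
    v[0] = e[0]
    for n in range(1, n_samples):
        v[n] = 0.8 * v[n - 1] + e[n]
    v *= np.sqrt(np.var(y_clean) / (np.var(v) * 10**(snr_db / 10)))  # scale to target SNR
    return u, y_clean + v
```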

Since the model order is unknown, we use the proposed approach to estimate it and compare the result with well-known criteria such as AIC and MDL. Different model orders of 2, 3 and 4 are used, and the results are shown in Table 5. It can be seen that the proposed approach gives a more consistent and accurate estimate than the AIC and MDL criteria for the tested systems.

TABLE 5 Model Order Estimation

The comparison of the proposed algorithms with other methods is shown in Fig. 2 and Fig. 3. It can be seen that the proposed ELF-based algorithms converge considerably faster than the RSI algorithm with a fixed FF in estimating the denominator coefficients. The use of SCAD regularization also slightly improves the steady-state accuracy. The evolution of the forgetting factor in the ELF-based algorithms is illustrated in Fig. 4. It can be seen that the value of the FF gradually increases to a large value, which reduces the steady-state MSE. For estimating the numerator coefficients, the ILF-based algorithms converge much faster than the corresponding constant-FF-based algorithms. It should be noted that the proposed variable FF can also speed up the convergence of the PI-based algorithms. The evolution of the forgetting factor in the ILF-based algorithms is shown in Fig. 5. The small FF used in the initial stage achieves a faster convergence speed, while the large FF used at steady state yields a smaller MSE.

Next, we compare the steady-state performance of the ECF-ICF-RSI, ELF-ICF-RSI, ECF-ILF-RSI and SC-ELF-ILF-RSI algorithms in a stationary environment. The SNR is kept at 17 dB. The Bode diagrams of the different algorithms at the 1000-th time point are shown in Fig. 6. It can be seen that the SC-ELF-ILF-RSI algorithm gives the smallest deviation from the ground truth. The ELF-based algorithms are more accurate due to the more accurate estimation of the extended observability matrix \Gamma.

FIGURE 6. Bode diagrams of different algorithms at the 1000-th time point.

B. SISO Systems With Time-Varying A

Consider the following time-varying system studied in [15]: \begin{align*} {\boldsymbol{A}}[n]=\begin{bmatrix} 1.5+0.05\dfrac{e^{-n/1000}-1}{e^{-1}-1} & -0.7 \\ 1 & -0.03\dfrac{e^{-n/1000}-1}{e^{-1}-1} \end{bmatrix}.\tag{44}\end{align*} All other settings are identical to the last example. The following colored measurement noise is added to the noise-free output of the system: \begin{equation*} v[n]=\frac{1}{1-0.8q^{-1}}e[n],\tag{45}\end{equation*} where e[n] is a zero-mean white Gaussian noise. The output SNR is set to 25 dB.
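A short sketch of the time-varying state matrix (44) and the colored measurement noise (45) used in this experiment is given below; the helper names are illustrative only.

```python
# Sketch of the time-varying A[n] in (44) and the colored measurement noise (45).
import numpy as np

def A_tv(n):
    f = (np.exp(-n / 1000.0) - 1.0) / (np.exp(-1.0) - 1.0)
    return np.array([[1.5 + 0.05 * f, -0.7],
                     [1.0,            -0.03 * f]])

def colored_noise(n_samples, rng=np.random.default_rng(1)):
    e = rng.standard_normal(n_samples)
    v = np.zeros(n_samples)
    v[0] = e[0]
    for n in range(1, n_samples):
        v[n] = 0.8 * v[n - 1] + e[n]       # v[n] = e[n] / (1 - 0.8 q^{-1})
    return v
```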

The criterion chosen to measure and compare the performance is the subspace angle [45]–[47], in degrees, between \Gamma and \hat{\Gamma}: \begin{align*} \cos(\theta_{n})=&\max\limits_{{\boldsymbol{u}}\in\Gamma}\max\limits_{{\boldsymbol{v}}\in\hat{\Gamma}}{\boldsymbol{u}}^{\mathsf{T}}{\boldsymbol{v}}={\boldsymbol{u}}_{n}^{\mathsf{T}}{\boldsymbol{v}}_{n}, \\ &\text{s.t.}~\Vert{\boldsymbol{u}}\Vert_{2}^{2}=\Vert{\boldsymbol{v}}\Vert_{2}^{2}=1,\quad {\boldsymbol{u}}^{\mathsf{T}}{\boldsymbol{u}}_{i}=0, \\ &\hphantom{\text{s.t.}~}{\boldsymbol{v}}^{\mathsf{T}}{\boldsymbol{v}}_{i}=0,\quad i=1,2,\ldots,n-1.\tag{46}\end{align*} The motivation for this criterion is that the largest subspace angle is related to the distance between equidimensional subspaces. Fig. 7 compares the subspace angle of different algorithms in a nonstationary environment. It can be seen that the SC-ELF-ILF-RSI algorithm achieves a faster initial convergence speed and a similar steady-state error to the ECF-ICF-RSI algorithm. The ELF-ICF-RSI algorithm performs similarly to the SC-ELF-ILF-RSI algorithm as a result of the variable forgetting factor employed to calculate the extended observability matrix \Gamma. The ECF-ILF-RSI algorithm converges a little more slowly than the SC-ELF-ILF-RSI algorithm, but still faster than the ECF-ICF-RSI algorithm, due to the variable forgetting factor used to compute the impulse response matrix \Phi from input to output.
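The principal angles in (46) can be computed numerically via orthonormal bases and an SVD; the sketch below uses this standard construction, which is assumed (rather than taken from [45]–[47]) to match the criterion used here.

```python
# Sketch of computing the principal (subspace) angles of (46) between Gamma and its estimate.
import numpy as np

def subspace_angles_deg(Gamma, Gamma_hat):
    """Principal angles (degrees) between range(Gamma) and range(Gamma_hat)."""
    Q1, _ = np.linalg.qr(Gamma)                 # orthonormal basis of the true subspace
    Q2, _ = np.linalg.qr(Gamma_hat)             # orthonormal basis of the estimated subspace
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    s = np.clip(s, -1.0, 1.0)
    return np.degrees(np.arccos(s))             # the largest angle measures the subspace distance
```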

FIGURE 7. Estimated subspace angle using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment.

To better illustrate the advantage of the proposed VFF algorithms, we also consider a system with poles closer to the origin, similar to the last system: \begin{equation*} {\boldsymbol{A}}[n]=\begin{bmatrix} 1+0.1\dfrac{e^{-n/1000}-1}{e^{-1}-1} & -0.7 \\ 1 & -0.05\dfrac{e^{-n/1000}-1}{e^{-1}-1} \end{bmatrix}.\tag{47}\end{equation*} Other settings are identical to the previous simulation. The subspace angles of the different algorithms are shown in Fig. 8. It can be seen that the SC-ELF-ILF-RSI and ELF-ICF-RSI algorithms perform significantly better than the ECF-ILF-RSI and ECF-ICF-RSI algorithms. Next, we compare the tracking capability of the proposed algorithms with the other methods in Fig. 9. It can be seen that the SC-ELF-ILF-RSI algorithm tracks the varying coefficients with the fastest speed. The ELF-ICF-RSI algorithm performs comparably with the ELF-ILF-RSI algorithm, while both converge faster than the ECF-ILF-RSI and ECF-ICF-RSI methods.

FIGURE 8. Estimated subspace angle using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment with poles closer to the origin.

FIGURE 9. Estimated denominator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment.

C. A SISO System With Time-Varying B

Next, we consider a system similar to the last time-varying system with \begin{equation*} {\boldsymbol{B}}[n]=\begin{bmatrix} 1+0.6\dfrac{e^{-n/1000}-1}{e^{-1}-1} \\ -0.2\dfrac{e^{-n/1000}-1}{e^{-1}-1} \end{bmatrix}.\tag{48}\end{equation*} All other settings are identical to the last example, while \boldsymbol{A} is kept time-invariant as in the first example. The tracking performance for the numerator coefficients of the various algorithms is shown in Fig. 10. Both the ELF-ICF-RSI and ECF-ICF-RSI algorithms are capable of tracking the fast-changing parameters, but with a slower tracking speed than the ILF-based algorithms, in which the locally optimal FF is employed to estimate \boldsymbol{B} and \boldsymbol{D}.

FIGURE 10. Estimated numerator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment.

D. A MIMO Time-Invariant System

Here, we consider an N=4 MIMO time-invariant system studied in [48]: \begin{align*} {\boldsymbol{A}}=&\begin{bmatrix} 0.7 & 0.7 & 0 & 0 \\ -0.7 & 0.7 & 0 & 0 \\ 0 & 0 & -0.7 & -0.7 \\ 0 & 0 & 0.7 & -0.7 \end{bmatrix},\quad {\boldsymbol{B}}=\begin{bmatrix} 1.1650 & -0.6965 \\ 0.6268 & 1.6961 \\ 0.0751 & 0.0591 \\ 0.3561 & 1.7971 \end{bmatrix},\tag{49a}\\ {\boldsymbol{C}}=&\begin{bmatrix} 0.2641 & -1.4462 & 1.2460 & 0.5774 \\ 0.8717 & -0.7012 & -0.6390 & -0.3600 \end{bmatrix},\tag{49b}\\ {\boldsymbol{D}}=&\begin{bmatrix} -0.1356 & -1.2704 \\ -1.3493 & 0.9846 \end{bmatrix}.\tag{49c}\end{align*} The process and measurement noises are respectively {\boldsymbol{K}}{\boldsymbol{v}}[n] and {\boldsymbol{v}}[n], where \begin{align*} {\boldsymbol{K}}=&\begin{bmatrix} 0.4968 & -0.3580 \\ -0.3312 & -0.0512 \\ 0.1560 & -0.3872 \\ -0.0900 & 0.5836 \end{bmatrix},\quad {\boldsymbol{R}}_{v}=\begin{bmatrix} 0.0176 & -0.0267 \\ -0.0267 & 0.0497 \end{bmatrix},\tag{50}\end{align*} and {\boldsymbol{v}}[n] is a zero-mean white Gaussian noise with covariance {\boldsymbol{R}}_{v}. Fig. 11 and Fig. 12 show the trajectories of the real and imaginary parts of the estimated poles using the various algorithms. It can be seen from Fig. 11 that the ELF-ILF-RSI-based algorithms converge much faster than the ECF-ICF-RSI algorithm in estimating the real parts of the poles. The estimated imaginary parts of the poles show the same behavior in Fig. 12. The ELF-ILF-RSI algorithm performs comparably with the SC-ELF-ILF-RSI algorithm in terms of convergence speed.
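For reference, the true poles of this benchmark are the eigenvalues of \boldsymbol{A} in (49a), namely 0.7 \pm 0.7j and -0.7 \pm 0.7j; the pole trajectories in Figs. 11–15 should approach these values. A quick check:

```python
# The true poles of the MIMO benchmark are the eigenvalues of A in (49a).
import numpy as np

A = np.array([[ 0.7,  0.7,  0.0,  0.0],
              [-0.7,  0.7,  0.0,  0.0],
              [ 0.0,  0.0, -0.7, -0.7],
              [ 0.0,  0.0,  0.7, -0.7]])
print(np.linalg.eigvals(A))    # 0.7 +/- 0.7j and -0.7 +/- 0.7j
```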

FIGURE 11. Estimated real part of the poles using ECF-ICF-RSI in [22], and the proposed ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms in a MIMO system.

FIGURE 12. Estimated imaginary part of the poles using ECF-ICF-RSI in [22], and the proposed ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms in a MIMO system.

Next, we compare the accuracy of the estimated poles obtained by the ECF-ICF-RSI, ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms. The eigenvalues of the estimated \boldsymbol{A} matrices over 100 Monte Carlo simulations are plotted and compared with the true ones; 300 samples are used for each realization. The estimated poles obtained by the ECF-ICF-RSI, ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms are presented in Fig. 13, Fig. 14 and Fig. 15, respectively. It can be noted that the means of the eigenvalue estimates are equally accurate for these algorithms. However, the variance of the eigenvalue estimates of the ECF-ICF-RSI algorithm is slightly larger than that of the ELF-ILF-RSI algorithm. The eigenvalue estimates obtained by the SC-ELF-ILF-RSI algorithm are closer to the true ones due to the effect of the SCAD regularization.

FIGURE 13. Estimated poles using the ECF-ICF-RSI algorithm in [22] for a Monte Carlo simulation of size 100.

FIGURE 14. Estimated poles using the ELF-ILF-RSI algorithm for a Monte Carlo simulation of size 100.

FIGURE 15. Estimated poles using the SC-ELF-ILF-RSI algorithm for a Monte Carlo simulation of size 100.

E. Application to Wing Flutter Data

In this section, the proposed algorithms are tested using the wing flutter data [49] from DaISy (Database for the Identification of Systems). The data consist of 1024 time points with a single input and a single output. The input signal is highly colored, and the response is acquired in real time via an accelerometer. The mathematical model of the mechanical system is unknown; using the proposed approach, the estimated model order is 5. We tested different numbers of block rows, \alpha=7,8,9,10, and the results are summarized in Table 6. The following commonly used FIT criterion [50], [51] is used to compare the performance of the various algorithms: \begin{equation*} \text{FIT}=100\times\left({1-\frac{\Vert y-\hat{y}\Vert}{\Vert y-\bar{y}\Vert}}\right),\tag{51}\end{equation*} where y is the measured system output, and \hat{y} and \bar{y} denote the estimated output of the system and the arithmetic mean of y, respectively. Note that \Vert y-\hat{y}\Vert=\sqrt{\text{SSE}} and \Vert y-\bar{y}\Vert=\sqrt{\text{SST}}, where \begin{equation*} \text{SSE}=\sum\nolimits_{i=1}^{N}(y_{i}-\hat{y}_{i})^{2}\tag{52}\end{equation*} is the sum of squared errors and \begin{equation*} \text{SST}=\sum\nolimits_{i=1}^{N}(y_{i}-\bar{y})^{2}\tag{53}\end{equation*} is the total sum of squares, or total variation.
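The FIT score of (51)–(53) is straightforward to compute from the measured and predicted outputs, as the short sketch below shows.

```python
# Direct implementation of the FIT criterion (51)-(53).
import numpy as np

def fit_percent(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    sse = np.sum((y - y_hat)**2)                 # (52): sum of squared errors
    sst = np.sum((y - np.mean(y))**2)            # (53): total sum of squares
    return 100.0 * (1.0 - np.sqrt(sse / sst))    # (51)
```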

TABLE 6 Numerical Comparison for the Flutter System

It can be seen that \alpha=8 gives the best results for all algorithms. The proposed ELF-ILF-RSI and SC-ELF-ILF-RSI algorithms achieve high FIT values and outperform the other compared methods. The accuracy of the ELF-ICF-RSI algorithm is slightly better than that of the ECF-ILF-RSI algorithm. The prediction accuracy obtained with the ECF-ICF-RSI algorithm is much lower due to its constant forgetting factor.

SECTION V.

Conclusion

A new SCAD-regularized LOFF-based algorithm for recursive subspace model identification has been presented. The VFF parameter is obtained by minimizing the MSE of the SREIV-PAST algorithm when the observability matrix is modeled by local polynomial modeling. A SCAD regularization term is also incorporated to reduce the estimation variance of the subspace during signal fading. Experiments under stationary and nonstationary environments show that the proposed algorithms offer considerably improved performance over conventional and state-of-the-art algorithms.

Appendix

Mean Square Deviation (MSD) of the Estimator

As mentioned, our aim is to find the FF that minimizes the MSD of the estimator. Therefore, we need to evaluate the MSD of the estimator, which can be written as \begin{equation*} J_{MSD}({\boldsymbol{w}}_{k})=\mathbb{E}\left[{\Vert{\boldsymbol{w}}_{k}[n]-{\boldsymbol{h}}_{k}[n]\Vert_{2}^{2}}\right].\tag{A.1}\end{equation*} To this end, we first rewrite the estimation error as \begin{equation*} {\boldsymbol{w}}_{k}[n]-{\boldsymbol{h}}_{k}[n]={\boldsymbol{b}}_{k}[n]+{\boldsymbol{v}}_{k}[n],\tag{A.2}\end{equation*} where {\boldsymbol{b}}_{k}[n]=\mathbb{E}\left[{{\boldsymbol{w}}_{k}[n]}\right]-{\boldsymbol{h}}_{k}[n] is the bias and {\boldsymbol{v}}_{k}[n]={\boldsymbol{w}}_{k}[n]-\mathbb{E}\left[{{\boldsymbol{w}}_{k}[n]}\right] is associated with the variance. Consequently, \begin{equation*} J_{MSD}({\boldsymbol{w}}_{k})=\mathbb{E}\left[{\Vert{\boldsymbol{b}}_{k}[n]\Vert_{2}^{2}}\right]+\mathbb{E}\left[{\Vert{\boldsymbol{v}}_{k}[n]\Vert_{2}^{2}}\right].\tag{A.3}\end{equation*} We now evaluate \mathbb{E}\left[{{\boldsymbol{w}}_{k}[n]}\right], and hence {\boldsymbol{b}}_{k}[n] and {\boldsymbol{v}}_{k}[n], from which the right-hand side of (A.3) can be evaluated. From (25), we get the expected value of the estimate as \begin{equation*} \mathbb{E}\left[{{\boldsymbol{w}}_{k}[n]}\right]={\boldsymbol{h}}_{k}[n]+{\boldsymbol{B}}[n]\bar{{\boldsymbol{h}}}_{k}^{(1)}[n],\tag{A.4}\end{equation*} where \bar{{\boldsymbol{h}}}_{k}^{(1)}[n]=\mathbb{E}\left[{{\boldsymbol{h}}_{k}^{(1)}[n]}\right]. Moreover, following the technique in [52], we have ({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-1}\approx(1-\lambda[n])^{2}({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-1} as n\to\infty, where {\boldsymbol{C}}_{\Xi G}[n]\approx(1-\lambda[n])^{-1}{\boldsymbol{C}}_{\xi g} and {\boldsymbol{C}}_{\xi g}=\mathbb{E}\left[{{\xi}[n]{\boldsymbol{g}}^{\mathsf{H}}[n]}\right]. For simplicity, let us define {\boldsymbol{R}}_\tau[n]={\boldsymbol{G}}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\Xi}[n]{\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\boldsymbol{D}}_\tau[n]{\boldsymbol{G}}[n].
As n\to\infty, \begin{align*} {\boldsymbol{R}}_\tau[n]=&\,{\boldsymbol{G}}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\Xi}[n]{\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\boldsymbol{D}}_\tau[n]{\boldsymbol{G}}[n] \\ =&-(1-\lambda[n])^{-1}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}\sum\nolimits_{i=1}^{n}\lambda^{n-i}[n]{\xi}[i]{\boldsymbol{g}}^{\mathsf{H}}[i](n-i) \\ \approx&-(1-\lambda[n])^{-1}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}\sum\nolimits_{i=0}^{n-1}i\lambda^{i}[n]\,{\boldsymbol{C}}_{\xi g} \\ =&-(1-\lambda[n])^{-1}\lambda[n]\left({\frac{1-\lambda^{n-1}[n]}{(1-\lambda[n])^{2}}-\frac{(n-1)\lambda^{n-1}[n]}{1-\lambda[n]}}\right){\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g} \\ \approx&-\frac{\lambda[n]}{(1-\lambda[n])^{3}}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g}.\tag{A.5}\end{align*}

Using (A.5) and the above result for ({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-1}, we have {\boldsymbol{B}}[n]=({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-1}{\boldsymbol{R}}_\tau[n]\approx-\frac{\lambda[n]}{1-\lambda[n]}{\boldsymbol{I}} as n\to\infty, and hence \begin{equation*} {\boldsymbol{b}}_{k}[n]={\boldsymbol{B}}[n]\bar{{\boldsymbol{h}}}_{k}^{(1)}[n]\approx-\frac{\lambda[n]}{1-\lambda[n]}\bar{{\boldsymbol{h}}}_{k}^{(1)}[n].\tag{A.6}\end{equation*} Note that the FF \lambda[n] is slowly varying over time. However, it can be assumed to be constant inside the neighborhood B(t_{n}), which is taken to be the interval under consideration, i=1,\ldots,n. From (A.6), we finally get \begin{equation*} \mathbb{E}\left[{\Vert{\boldsymbol{b}}_{k}[n]\Vert_{2}^{2}}\right]=\frac{\lambda^{2}[n]}{(1-\lambda[n])^{2}}\zeta_{k},\tag{A.7}\end{equation*} where \zeta_{k}=\bar{{\boldsymbol{h}}}_{k}^{(1)H}[n]\bar{{\boldsymbol{h}}}_{k}^{(1)}[n]. Similarly, we have \begin{equation*} {\boldsymbol{v}}_{k}[n]={\boldsymbol{C}}[n]({\eta}_{k}[n]+{\varsigma}_{k}[n])+{\boldsymbol{B}}[n]\delta{\boldsymbol{h}}_{k}^{(1)}[n],\tag{A.8}\end{equation*} where {\boldsymbol{h}}_{k}^{(1)}[n]=\bar{{\boldsymbol{h}}}_{k}^{(1)}[n]+\delta{\boldsymbol{h}}_{k}^{(1)}[n], and \begin{align*} \mathbb{E}\left[{\Vert{\boldsymbol{v}}_{k}[n]\Vert_{2}^{2}}\right]=&\,\mathbb{E}\left[{\text{Tr}({\boldsymbol{C}}^{\mathsf{H}}[n]{\boldsymbol{C}}[n]{\eta}_{k}[n]{\eta}_{k}^{\mathsf{H}}[n])}\right] \\ &+\mathbb{E}\left[{\text{Tr}({\boldsymbol{C}}^{\mathsf{H}}[n]{\boldsymbol{C}}[n]{\varsigma}_{k}[n]{\varsigma}_{k}^{\mathsf{H}}[n])}\right] \\ &+\mathbb{E}\left[{\text{Tr}({\boldsymbol{B}}[n]\delta{\boldsymbol{h}}_{k}^{(1)}[n]\delta{\boldsymbol{h}}_{k}^{(1)H}[n]{\boldsymbol{B}}^{\mathsf{H}}[n])}\right].\tag{A.9}\end{align*} Using assumptions (A1) to (A3), (A.9) can be simplified to \begin{align*} \mathbb{E}\left[{\Vert{\boldsymbol{v}}_{k}[n]\Vert_{2}^{2}}\right]=&\,\frac{K\sigma_{\delta h_{k}}^{2}[n]\lambda^{2}[n]}{(1-\lambda[n])^{2}} \\ &+\frac{1-\lambda[n]}{1+\lambda[n]}(\sigma_{\eta_{k}}^{2}+\sigma_{r_{k}}^{2}\text{Tr}({\boldsymbol{C}}_{\xi\xi}))\text{Tr}(\Omega),\tag{A.10}\end{align*} where \sigma_{\eta_{k}}^{2}=\frac{1}{K}\mathbb{E}\left[{\Vert{\eta}_{k}[n]\Vert_{2}^{2}}\right], \sigma_{r_{k}}^{2}=\frac{1}{K}\mathbb{E}\left[{\Vert{\boldsymbol{r}}_{k}[n]\Vert_{2}^{2}}\right], \sigma_{\delta h_{k}}^{2}=\frac{1}{K}\mathbb{E}\left[{\Vert\delta{\boldsymbol{h}}_{k}[n]\Vert_{2}^{2}}\right] and {\Omega}=({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-1}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi\xi}{\boldsymbol{C}}_{\xi g}({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-H}.
Here, we have also used \begin{align*} \mathbb{E}\left[{\text{Tr}({\boldsymbol{C}}^{\mathsf{H}}[n]{\boldsymbol{C}}[n])}\right]=&\,\text{Tr}\big(({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-H}({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-1} \\ &\cdot{\boldsymbol{G}}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\Xi}[n]{\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda^{2}[n]{\Xi}[n]{\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda[n]{\boldsymbol{G}}[n]\big) \\ \approx&\,(1-\lambda[n])^{2}\text{Tr}\big(({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-1}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda^{2}[n]{\Xi}[n]{\boldsymbol{C}}_{\xi g}({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-H}\big) \\ \approx&\,\frac{1-\lambda[n]}{1+\lambda[n]}\text{Tr}\big(({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-1}{\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi\xi}{\boldsymbol{C}}_{\xi g}({\boldsymbol{C}}_{\xi g}^{\mathsf{H}}{\boldsymbol{C}}_{\xi g})^{-H}\big)\tag{A.11}\end{align*} as n\to\infty, together with the facts that {\Xi}^{\mathsf{H}}[n]{\boldsymbol{D}}_\lambda^{2}[n]{\Xi}[n]\approx(1-\lambda^{2}[n])^{-1}{\boldsymbol{C}}_{\xi\xi} and \mathbb{E}\left[{\text{Tr}({\boldsymbol{B}}[n]{\boldsymbol{B}}^{\mathsf{H}}[n])}\right]=\text{Tr}\big(({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-1}{\boldsymbol{R}}_\tau[n]{\boldsymbol{R}}_\tau^{\mathsf{H}}[n]({\boldsymbol{C}}_{\Xi G}^{\mathsf{H}}[n]{\boldsymbol{C}}_{\Xi G}[n])^{-H}\big)\approx\frac{K\lambda^{2}[n]}{(1-\lambda[n])^{2}}.

Consequently, \begin{equation*} J_{MSD}({\boldsymbol{w}}_{k}[n])\approx\frac{1-\lambda[n]}{1+\lambda[n]}\sigma_{\Sigma_{k}}^{2}\text{Tr}(\Omega)+\frac{\lambda^{2}[n]\sigma_{h_{k}}^{2}[n]}{(1-\lambda[n])^{2}},\tag{A.12}\end{equation*} where \sigma_{\Sigma_{k}}^{2}=\sigma_{\eta_{k}}^{2}+\sigma_{r_{k}}^{2}\text{Tr}({\boldsymbol{C}}_{\xi\xi}) and \sigma_{h_{k}}^{2}[n]=K\sigma_{\delta h_{k}}^{2}[n]+\zeta_{k}. Therefore, the total MSD is given by \begin{equation*} J_{MSD}[n]\approx\frac{1-\lambda[n]}{1+\lambda[n]}\sigma_{\Sigma}^{2}\text{Tr}(\Omega)+\frac{\lambda^{2}[n]\sigma_{h}^{2}[n]}{(1-\lambda[n])^{2}},\tag{A.13}\end{equation*} where \sigma_\Sigma^{2}=\sum\nolimits_{k=1}^{K}\sigma_{\Sigma_{k}}^{2} and \sigma_{h}^{2}[n]=\sum\nolimits_{k=1}^{K}\sigma_{h_{k}}^{2}[n].
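A simple numerical sketch of how the locally optimal FF follows from (A.13) is given below: with a and b standing for the estimated constants \sigma_\Sigma^{2}\text{Tr}(\Omega) and \sigma_{h}^{2}[n], the MSD is minimized over a grid of admissible \lambda values. The grid search and the bounds are illustrative; the paper derives the minimizer analytically.

```python
# Sketch: numerically minimizing the MSD expression (A.13) over lambda to obtain the LOFF.
import numpy as np

def locally_optimal_ff(a, b, lam_min=0.97, lam_max=0.999, grid=1000):
    lam = np.linspace(lam_min, lam_max, grid)
    J = (1 - lam) / (1 + lam) * a + lam**2 * b / (1 - lam)**2   # (A.13)
    return lam[np.argmin(J)]
```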
