Introduction
Subspace-based approaches play an important role in system identification [1]–[4] due to their theoretical and practical convenience for building state-space models directly from the input-output data of systems. Offline subspace identification techniques are characterized by the use of robust numerical tools such as the RQ factorization and the singular value decomposition (SVD) of the data matrix [5]. However, batch subspace model identification (SMI) algorithms are computationally intensive for online implementation due to the high computational complexity of the SVD. Consequently, much research on recursive subspace model identification (RSI) [6]–[9] has been devoted to updating the model parameters over time with a reduced computational cost. Most RSI algorithms estimate the unknown state matrix by recursively tracking the subspace spanned by the extended observability matrix.
In the context of recursive subspace tracking, the projection approximation subspace tracking (PAST) algorithm [14] has been shown to yield good performance and efficient implementation for tracking the subspace spanned by the observability matrix. The PAST algorithm is based on the recursive least squares (RLS) algorithm and the projection approximation. It assumes that the subspace is slowly time-varying, and a fixed forgetting factor (FF) is usually used. Therefore, its tracking performance may be degraded in time-varying systems or after sudden system changes. Furthermore, a drawback of classical recursive SMI algorithms [15]–[17] is that the disturbances acting on the system output are required to be spatially and temporally white; biased estimates will result if this assumption is violated. The instrumental variable (IV) method [18]–[21] is an effective approach to tackle these types of noise problems by exploiting the cross-correlation between the instrumental variables and the input variables, rather than the covariance of the inputs as in the conventional least squares method. The extension of IV to the PAST algorithm, namely the extended IV PAST (EIV-PAST) method, was proposed in [22] to address the problem of biased estimates in RSI algorithms when the output disturbances are not spatially and temporally white.
It is known that the EIV-PAST algorithm is able to achieve a faster convergence speed and a smaller mean square error (MSE) in slowly time-varying environments if a large FF is utilized. On the other hand, a relatively small FF should be used to facilitate the tracking of fast time-varying parameters. In the context of RLS algorithms, a number of variable FF (VFF) algorithms have thus been proposed [23]–[26] to achieve satisfactory performance in stationary, slowly, or fast time-varying environments. In particular, a local polynomial modeling (LPM)-based VFF-RLS algorithm was recently developed in [26] by minimizing the MSE, and its performance compares favorably with other conventional VFF-RLS algorithms tested. Note that these algorithms are applicable to the case of a single output and rely on the Wiener solution involving the covariance of the inputs. In contrast, the EIV-PAST algorithm involves multiple outputs and a system equation involving the cross-correlation between the input signal and the instrumental variables. Therefore, it is highly desirable to develop new VFF algorithms for better adaptation of the EIV-PAST algorithm in time-varying and noisy environments. Furthermore, it is important to be able to incorporate variable regularization in the EIV-PAST algorithm to improve its numerical behavior at low input signal levels.
In this paper, we propose a new RSI algorithm in which the subspace of the extended observability matrix (and hence the state matrix and output matrix) is tracked by a new VFF square-root EIV-PAST (VFF-SREIV-PAST) algorithm, while the input-related matrices are estimated by a new VFF multivariate RLS algorithm with smoothly clipped absolute deviation (SCAD) regularization.
The proposed VFF-SREIV-PAST algorithm utilizes the LPM channel model [26] of time-varying channels to derive the MSE of the EIV linear model as a function of the FF. The resultant MSE contains a bias term which increases with the FF and a variance term which decreases with the FF. Therefore, the asymptotic MSE can be minimized at each time instant to obtain the proposed locally optimized FF, improving both the convergence speed and the steady-state error. Techniques for estimating the various quantities required will also be introduced. The resultant SREIV-PAST algorithm offers improved resilience to additive noise over the PAST algorithm, and it is implemented in the more stable square-root form.
The proposed VFF multivariate RLS algorithm is an extension of the single-output VFF-RLS algorithm in [26] to multiple outputs with SCAD regularization. The VFF speeds up the convergence of the multivariate RLS algorithm, whilst the SCAD regularization helps to achieve automatic model selection and to reduce the estimation variance arising from a possibly ill-conditioned covariance matrix, especially at low input signal levels. This is very useful during signal fading, where the covariance matrix may be significantly ill-conditioned. Efficient techniques for incorporating SCAD in the numerically more stable QR decomposition (QRD) and for estimating the other required quantities are also developed.
Unlike the LPM-based VFF-RLS algorithm in [26], which deals with a single output and the autocovariance matrix in the Wiener filter, the proposed algorithms are applicable to complex inputs, single and multiple outputs, and EIV systems involving a cross-correlation matrix. The proposed algorithms are evaluated using computer simulations under stationary and nonstationary environments and a real dataset of wing flutter measurements. Experimental results show that the proposed algorithm provides faster convergence and more accurate estimates than the conventional algorithms tested. Moreover, it also provides the model order for the state transition matrix and automatic model order selection for the input and feedthrough matrices. The convergence of the VFF-SREIV-PAST and RLS algorithms is also analyzed.
In summary, the novelties of this paper are: i) the development of a new RSI algorithm with automatic order estimation; ii) efficient LPM-based VFF EIV-PAST and multivariate RLS algorithms for improving the tracking speed and steady-state error, which can be realized in efficient square-root form based on hyperbolic rotations and conventional QR decomposition, respectively, with improved numerical stability; iii) a recursive bi-iteration singular value decomposition (SVD) algorithm for extracting the singular values of the state transition matrix for automatic order selection; iv) a new criterion based on the percentage of explained variance to facilitate online estimation of the model order of A; and v) the incorporation of SCAD regularization, which is asymptotically unbiased, for automatic model selection of the input matrices B and D. The regularized EIV algorithm also helps to stabilize the subspace, especially during signal fading.
The rest of the paper is organized as follows. Section II briefly reviews the system model, problem formulation, and related works. In Section III, the proposed algorithm is introduced. Numerical results and comparisons with conventional algorithms are presented in Section IV; they suggest that the proposed algorithm outperforms the conventional and state-of-the-art algorithms tested. Finally, Section V concludes the paper.
Problem Formulation
A. System Model
Consider the following discrete-time state-space model:\begin{align*} {\boldsymbol {x}}[n+1]=&{\boldsymbol {A}}{\boldsymbol {x}}[n]+{\boldsymbol {B}}{\boldsymbol {u}}[n],\tag{1a}\\ {\boldsymbol {y}}[n]=&{\boldsymbol {C}}{\boldsymbol {x}}[n]+{\boldsymbol {D}}{\boldsymbol {u}}[n]+{\boldsymbol {v}}[n]\tag{1b}\end{align*}
where ${\boldsymbol {x}}[n]$ is the state vector, ${\boldsymbol {u}}[n]$ the input vector, ${\boldsymbol {y}}[n]$ the output vector, and ${\boldsymbol {v}}[n]$ the output disturbance.
For notational convenience, let us define the following stacked vectors:\begin{align*} {\boldsymbol {u}}_\alpha [n]=&\left [{{\boldsymbol {u}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {u}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2a}\\ {\boldsymbol {y}}_\alpha [n]=&\left [{{\boldsymbol {y}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {y}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2b}\\ {\boldsymbol {v}}_\alpha [n]=&\left [{{\boldsymbol {v}}^{\mathsf {T}}[n],\ldots,{\boldsymbol {v}}^{\mathsf {T}}[n+\alpha -1]}\right]^{\mathsf {T}},\tag{2c}\end{align*}
where $\alpha$ is the number of block rows.
Then, (1) can be rewritten in the stacked form\begin{equation*} {\boldsymbol {y}}_\alpha [n]=\Gamma _\alpha {\boldsymbol {x}}[n]+\Phi _\alpha {\boldsymbol {u}}_\alpha [n]+{\boldsymbol {v}}_\alpha [n],\tag{3}\end{equation*}
where the extended observability matrix is\begin{equation*} \Gamma _\alpha =\left [{\Gamma _{0}^{\mathsf {T}}, \Gamma _{1}^{\mathsf {T}}, \ldots, \Gamma _{\alpha -1}^{\mathsf {T}}}\right]^{\mathsf {T}},\tag{4}\end{equation*}
with $\Gamma _{i}={\boldsymbol {C}}{\boldsymbol {A}}^{i}$,
and the block lower-triangular Toeplitz matrix of Markov parameters is\begin{equation*} \Phi _\alpha = \left [{ \begin{matrix} {\boldsymbol {D}}& {\boldsymbol {0}}& \cdots & {\boldsymbol {0}}\\ {\boldsymbol {C}}{\boldsymbol {B}}& {\boldsymbol {D}}& \cdots & {\boldsymbol {0}}\\ \vdots & \ddots & \ddots & \vdots \\ {\boldsymbol {C}}{\boldsymbol {A}}^{\alpha -2}{\boldsymbol {B}}& \cdots & {\boldsymbol {C}}{\boldsymbol {B}}& {\boldsymbol {D}}\end{matrix} }\right].\tag{5}\end{equation*}
B. Recursive Estimation of A and C
One can notice from (4) that the matrix $\Gamma_\alpha$ possesses a shift-invariance structure: deleting the last block row (giving $\Gamma_a$) and the first block row (giving $\Gamma_b$) yields $\Gamma_a{\boldsymbol {A}}=\Gamma_b$, so that\begin{equation*} \hat {{\boldsymbol {A}}}[n]=\Gamma _{a}^\dagger \Gamma _{b}\tag{6}\end{equation*}
where $(\cdot)^\dagger$ denotes the Moore-Penrose pseudoinverse.
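As a concrete illustration, a minimal numpy sketch of the shift-invariance estimate (6) might read as follows; the function name and the convention that $\Gamma_a$ and $\Gamma_b$ are obtained by deleting the last and first block rows are our own.

```python
import numpy as np

def estimate_A_from_observability(Gamma, p):
    """Estimate A via the shift-invariance of the extended
    observability matrix, cf. (6).

    Gamma : (alpha*p, N) extended observability matrix estimate,
            with p outputs stacked in alpha block rows.
    p     : number of outputs.
    """
    Gamma_a = Gamma[:-p, :]   # drop the last block row
    Gamma_b = Gamma[p:, :]    # drop the first block row (one-block shift)
    # A-hat = pinv(Gamma_a) @ Gamma_b, cf. (6)
    return np.linalg.pinv(Gamma_a) @ Gamma_b
```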
Since the state-space model is identifiable only up to a similarity transformation, it is sufficient to obtain a subspace that spans the column space of $\Gamma_\alpha$. To this end, consider a signal vector of the form\begin{equation*} {\boldsymbol {z}}[n]=\Gamma [\theta]{\boldsymbol {x}}[n]+{\boldsymbol {e}}[n],\tag{7}\end{equation*}
which, in the present RSI context, takes the form\begin{equation*} {\boldsymbol {z}}[n]=\Gamma _\alpha {\boldsymbol {x}}[n]+{\boldsymbol {v}}_\alpha [n],\tag{8}\end{equation*}
The vector ${\boldsymbol {z}}[n]$ can be extracted recursively from the input-output data by updating an RQ factorization, where ${\boldsymbol {G}}[n]$ is an orthogonal transformation (e.g., a sequence of Givens rotations) that restores the triangular structure:\begin{align*}&\left [{ \begin{matrix} {\boldsymbol {R}}_{11}[n+1] & {\boldsymbol {0}}& {\boldsymbol {0}}\\ {\boldsymbol {R}}_{21}[n+1] & {\boldsymbol {R}}_{22}[n+1] & {\boldsymbol {z}}[n] \\ \end{matrix} }\right] ={\boldsymbol {G}}[n] \left [{ \begin{matrix} {\boldsymbol {R}}_{11}[n] & {\boldsymbol {0}}& {\boldsymbol {u}}_\alpha [n] \\ {\boldsymbol {R}}_{21}[n] & {\boldsymbol {R}}_{22}[n] & {\boldsymbol {y}}_\alpha [n] \\ \end{matrix} }\right],\tag{9}\end{align*}
In array signal processing and related applications, very efficient subspace tracking algorithms [14], [22], [30]–[32] have been developed to recursively update the column subspace of a low-rank signal model of the form\begin{equation*} {\boldsymbol {z}}[n]={\boldsymbol {W}}[n]{\boldsymbol s}[n]+{{\eta}}[n],\tag{10}\end{equation*}
where ${\boldsymbol {W}}[n]$ spans the signal subspace, ${\boldsymbol s}[n]$ is the signal vector, and ${{\eta}}[n]$ is the additive noise.
The PAST algorithm estimates ${\boldsymbol {W}}[n]$ by minimizing the exponentially weighted least squares (LS) criterion\begin{equation*} \mathbb {E}_{LS}({\boldsymbol {W}}[n])=\sum \nolimits _{m=1}^{n} \lambda ^{n-m} \Vert {\boldsymbol {z}}[m]-{\boldsymbol {W}}[n]{\boldsymbol {g}}[m]\Vert _{2}^{2}\tag{11}\end{equation*}
where $\lambda$ is the FF and, under the projection approximation, ${\boldsymbol {g}}[m]={\boldsymbol {W}}^{\mathsf {H}}[m-1]{\boldsymbol {z}}[m]$.
The use of the LS criterion in (11) implicitly assumes that the additive noise ${{\eta}}[n]$ is spatially and temporally white; otherwise, the subspace estimate will be biased.
Assume that there exists an IV vector ${{\xi}}[n]$ satisfying the following conditions:
(C1) $\mathbb {E}[{\boldsymbol {v}}_\alpha [n]{{\xi}}^{\mathsf {H}}[n]]={\boldsymbol {0}}$;
(C2) $\text {Rank}({\boldsymbol {C}}_{x\xi })=\text {Rank}\{\mathbb {E}[{\boldsymbol {x}}[n]{{\xi}}^{\mathsf {H}}[n]]\}=N$, where $N$ is the system order.
It can be seen that assumption (C1) ensures that the IV vector is uncorrelated with the noise, so that\begin{align*} {\boldsymbol {C}}_{z\xi }=&\mathbb {E}[{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n]] \\=&{\boldsymbol {W}}\cdot \mathbb {E}[{\boldsymbol {g}}[n]{{\xi}}^{\mathsf {H}}[n]]+\mathbb {E}[{{\eta}}[n]{{\xi}}^{\mathsf {H}}[n]] \\=&{\boldsymbol {W}}\cdot {\boldsymbol {C}}_{g\xi },\tag{13}\end{align*}
while (C2) ensures that the cross-correlation preserves the full column rank of the subspace.
Post-multiplying both sides by ${\boldsymbol {C}}_{g\xi }^{\mathsf {H}}$ gives\begin{equation*} ({\boldsymbol {C}}_{z\xi }{\boldsymbol {C}}_{g\xi }^{\mathsf {H}})={\boldsymbol {W}}\cdot ({\boldsymbol {C}}_{g\xi }{\boldsymbol {C}}_{g\xi }^{\mathsf {H}}).\tag{14}\end{equation*}
In practice, the cross-correlation matrices are estimated recursively with a forgetting factor $\lambda$ as\begin{align*} {\boldsymbol {C}}_{Z\Xi }[n]=&{\boldsymbol {Z}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n] \\=&\sum \nolimits _{i=1}^{n}\lambda ^{n-i}{\boldsymbol {z}}[i]{{\xi}}^{\mathsf {H}}[i] \\=&\lambda {\boldsymbol {C}}_{Z\Xi }[n-1]+{\boldsymbol {z}}[n]{{\xi}}^{\mathsf {H}}[n], \tag{15a}\\ {\boldsymbol {C}}_{\Xi G}[n]=&{\boldsymbol {C}}_{G\Xi }^{\mathsf {H}}[n] ={{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {G}}[n] \\=&\sum \nolimits _{i=1}^{n}\lambda ^{n-i}{{\xi}}[i]{\boldsymbol {g}}^{\mathsf {H}}[i] \\=&\lambda {\boldsymbol {C}}_{\Xi G}[n-1]+{{\xi}}[n]{\boldsymbol {g}}^{\mathsf {H}}[n],\tag{15b}\end{align*}
where ${\boldsymbol {D}}_\lambda [n]=\text {diag}(\lambda ^{n-1},\ldots,\lambda,1)$.
For the simple IV method, the inverse of ${\boldsymbol {C}}_{\Xi G}[n]$ exists when the dimension of the IV vector equals that of ${\boldsymbol {g}}[n]$, and the subspace estimate is given by\begin{equation*} {\boldsymbol {W}}[n]={\boldsymbol {C}}_{Z\Xi }[n]{\boldsymbol {C}}_{\Xi G}^{-\mathsf {H}}[n].\tag{16}\end{equation*}
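For illustration, a direct (non-square-root) numpy sketch of one EIV update based on (15) and (16) might read as follows. This is only a conceptual sketch assuming the simple IV case with a square, invertible ${\boldsymbol {C}}_{\Xi G}[n]$; the actual SREIV-PAST algorithm avoids the explicit inverse through RLS-type square-root updates.

```python
import numpy as np

def eiv_step(C_zxi, C_xig, z, xi, g, lam):
    """One conceptual EIV update, cf. (15a), (15b) and (16)."""
    C_zxi = lam * C_zxi + np.outer(z, xi.conj())   # (15a)
    C_xig = lam * C_xig + np.outer(xi, g.conj())   # (15b)
    # W[n] = C_{Z Xi} * C_{Xi G}^{-H}, cf. (16)
    W = C_zxi @ np.linalg.inv(C_xig.conj().T)
    return C_zxi, C_xig, W
```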
Similarly to (6), the state matrices can then be estimated from the shift-invariance of ${\boldsymbol {W}}[n]$:\begin{align*} \hat {{\boldsymbol {A}}}[n]=&{\boldsymbol {W}}_{1:p(\alpha -1)}^\dagger [n]{\boldsymbol {W}}_{(p+1):p\alpha }[n], \tag{17a}\\ \hat {{\boldsymbol {C}}}[n]=&{\boldsymbol {W}}_{1:p}[n], \tag{17b}\end{align*}
where ${\boldsymbol {W}}_{a:b}$ denotes the submatrix formed by rows $a$ to $b$ of ${\boldsymbol {W}}$ and $p$ is the number of outputs.
For efficient recursive tracking of the quantities needed in (17a), the following bi-iteration can be employed at each time instant, where $l$ denotes the iteration index:\begin{align*} {\boldsymbol {X}}_{B}^{(l)}=&{\boldsymbol {W}}_{1:p(\alpha -1)}{\boldsymbol {Q}}_{A}^{(l)} \\ {\boldsymbol {X}}_{B}^{(l)}=&{\boldsymbol {Q}}_{B}^{(l)}{\boldsymbol {R}}_{B}^{(l)} \qquad (\text {QR}\quad \text {factorization}) \\ {\boldsymbol {X}}_{A}^{(l)}=&{\boldsymbol {W}}_{1:p(\alpha -1)}^{\mathsf {H}}{\boldsymbol {Q}}_{B}^{(l)} \\ {\boldsymbol {X}}_{A}^{(l)}=&{\boldsymbol {Q}}_{A}^{(l+1)}{\boldsymbol {R}}_{A}^{(l)} \quad (\text {QR}\quad \text {factorization})\tag{18}\end{align*}
(17a) can then be recursively solved with the help of (18) using the multivariate complex RLS algorithm [32]. In Section III-C, we shall discuss how the model order of ${\boldsymbol {A}}$ can be estimated online.
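A minimal numpy sketch of the bi-iteration (18), assuming one (or a few) refinement(s) per time step and an orthonormal basis `Q_A` carried over from the previous instant (the names are ours):

```python
import numpy as np

def bi_iteration(W, p, Q_A, n_iter=1):
    """Bi-iteration (18) for tracking the bases needed in (17a).

    W      : current subspace estimate (p*alpha x r)
    p      : number of outputs
    Q_A    : (r x r) orthonormal starting basis from the previous step
    n_iter : number of refinements per time step
    """
    W_top = W[:-p, :]                        # W_{1:p(alpha-1)}
    for _ in range(n_iter):
        X_B = W_top @ Q_A                    # X_B^(l)
        Q_B, R_B = np.linalg.qr(X_B)         # QR factorization
        X_A = W_top.conj().T @ Q_B           # X_A^(l)
        Q_A, R_A = np.linalg.qr(X_A)         # QR factorization
    return Q_A, Q_B, R_A, R_B
```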
C. Recursive Estimation of B and D
Given the estimates of ${\boldsymbol {A}}$ and ${\boldsymbol {C}}$, the output can be written as a linear regression in the entries of ${\boldsymbol {B}}$ and ${\boldsymbol {D}}$:\begin{equation*} {\boldsymbol {y}}[n]=\Phi ^{\mathsf {H}}[n]{{\theta}}[n]+{\boldsymbol {v}}[n],\tag{19}\end{equation*}
where\begin{align*} {{\theta}}[n]=&\left [{\text {vec}^{\mathsf {T}}(\boldsymbol{D}) \quad \text {vec}^{\mathsf {T}}(\boldsymbol{B})}\right]^{\mathsf {T}}, \tag{20a}\\ \Phi ^{\mathsf {H}}[n]=&\left[{({\boldsymbol {u}}^{\mathsf {H}}[n]\otimes {\boldsymbol {I}}_{p}) \quad \left({\sum \nolimits _{s=1}^{n-1}{\boldsymbol {u}}^{\mathsf {H}}[s]\otimes \hat {{\boldsymbol {C}}}\hat {{\boldsymbol {A}}}^{n-s-1}}\right)}\right].\tag{20b}\end{align*}
The exponentially weighted LS estimate of ${{\theta}}[n]$ satisfies the normal equations\begin{equation*} {\boldsymbol {R}}_{\Phi \Phi }[n]{{\theta}}_{opt}[n]={\boldsymbol {r}}_{\Phi Y}[n],\tag{21}\end{equation*}
where ${\boldsymbol {R}}_{\Phi \Phi }[n]$ and ${\boldsymbol {r}}_{\Phi Y}[n]$ are respectively the exponentially weighted covariance of the regressor and its cross-correlation with the output.
In this work, we shall utilize the complex VFF-RLS algorithm with multiple outputs in [32] to estimate the parameter vector ${{\theta}}[n]$, and hence the matrices ${\boldsymbol {B}}$ and ${\boldsymbol {D}}$.
The Proposed SC-VFF-SREIV-PAST and VFF-RLS RSI Algorithms
A. The Proposed VFF-SREIV-PAST Algorithm
In the proposed VFF-SREIV-PAST algorithm for tracking the subspace of the extended observability matrix, the FF is adapted at each time instant by minimizing the estimated mean square deviation (MSD) derived below under a local polynomial model of the time-varying subspace.
1) Local Polynomial Modeling
From (11), we can observe that the subspace coefficient vectors are slowly time-varying. Under the LPM, the $k$-th coefficient vector ${\boldsymbol {h}}_{k}(t_{m})$ is modeled around time $t_{n}$ by the first-order Taylor expansion\begin{equation*} {\boldsymbol {h}}_{k}(t_{m})={\boldsymbol {h}}_{k}(t_{n})+\frac {1}{1!}{\boldsymbol {h}}_{k}^{(1)}(t_{n})(t_{m}-t_{n})+{\boldsymbol {r}}_{k}(t_{m}-t_{n})\tag{22}\end{equation*}
where ${\boldsymbol {h}}_{k}^{(1)}$ is the first derivative and ${\boldsymbol {r}}_{k}$ is the remainder term.
Substituting (22) into (8), the $k$-th component of the observation at time $m$ can be written as\begin{align*} z_{k}^{*}[m]=&{\boldsymbol {g}}^{\mathsf {H}}[m]{\boldsymbol {h}}_{k}[m]+\eta _{k}[m] \\=&{\boldsymbol {g}}^{\mathsf {H}}[m]({\boldsymbol {h}}_{k}[n]+{\boldsymbol {h}}_{k}^{(1)}[n](m-n))+\eta _{k}[m]+\varsigma _{k}[m]\tag{23}\end{align*}
where $\varsigma _{k}[m]$ collects the contribution of the remainder term.
Stacking the observed signal samples up to time $n$ gives\begin{align*} {\boldsymbol {z}}_{k}^{*}[n]={\boldsymbol {G}}[n]{\boldsymbol {h}}_{k}[n]+{\boldsymbol {D}}_\tau [n]{\boldsymbol {G}}[n]{\boldsymbol {h}}_{k}^{(1)}[n]+{{\eta}}_{k}[n]+{{\varsigma}}_{k}[n],\tag{24}\end{align*}
where ${\boldsymbol {D}}_\tau [n]$ is the diagonal matrix of time offsets $m-n$.
For simplicity of analysis, we assume that:
1) ${\boldsymbol {G}}[n]$, ${{\eta}}_{k}[n]$, and ${\boldsymbol {r}}_{k}[m-n]$ for $k=1,\ldots,K$ are independent.
2) ${{\eta}}_{k}[n]$ and ${\boldsymbol {r}}_{k}[m-n]$ for $k=1,\ldots,K$ are zero mean with covariances respectively given by $\mathbb{E}\left [{{{\eta}}_{k}[n]{{\eta}}_{k}^{\mathsf {H}}[m]}\right]=\sigma _{\eta _{k}}^{2}\delta [n-m]{\boldsymbol {I}}$ and $\mathbb{E}\left [{{\boldsymbol {r}}_{k}[n]{\boldsymbol {r}}_{k}^{\mathsf {H}}[m]}\right]=\sigma _{r_{k}}^{2}\delta [n-m]{\boldsymbol {I}}$, where $\delta [n]$ is the unit impulse function and $\boldsymbol {I}$ is the identity matrix of appropriate dimension.
3) ${\boldsymbol {h}}_{k}^{(1)}[n]$ has mean $\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]=\mathbb{E}\left [{{\boldsymbol {h}}_{k}^{(1)}[n]}\right]$ and covariance $\mathbb{E}\left [{\delta {\boldsymbol {h}}_{k}^{(1)}[n]\delta {\boldsymbol {h}}_{k}^{(1)\mathsf {H}}[n]}\right]=\sigma _{h_{k}}^{2}[n]{\boldsymbol {I}}$, where ${\boldsymbol {h}}_{k}^{(1)}[n]=\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]+\delta {\boldsymbol {h}}_{k}^{(1)}[n]$.
For remarks on the motivation behind these assumptions, interested readers are referred to [32] and the supplementary materials for more details. From the simulation results to be presented in Section IV, we found that the proposed VFF-SREIV-PAST algorithm, which is derived from the above assumptions, performs satisfactorily on both synthetic and real data. This suggests that the above assumptions are reasonable.
To start with, we substitute (24) into (16) and obtain the following expression of the EIV solution under the LPM time-varying channel model:\begin{equation*} {\boldsymbol {w}}_{k}[n]={\boldsymbol {h}}_{k}[n]+{\boldsymbol {B}}[n]{\boldsymbol {h}}_{k}^{(1)}[n]+{\boldsymbol {C}}[n]({{\eta}}_{k}[n]+{{\varsigma}}_{k}[n])\tag{25}\end{equation*}
where ${\boldsymbol {B}}[n]$ and ${\boldsymbol {C}}[n]$ are data-dependent matrices characterizing the bias and noise contributions, respectively (their expressions are given in the Appendix).
2) VFF Recursive EIV Algorithm for Estimating A and C
The mean square deviation (MSD) of the estimator is derived in the Appendix and can be approximated as\begin{equation*} J_{MSD}[n]\approx \frac {1-\lambda [n]}{1+\lambda [n]}\sigma _{\Sigma }^{2}\text {Tr}(\Omega)+\frac {\lambda ^{2}[n]\sigma _{h}^{2}[n]}{(1-\lambda [n])^{2}},\tag{26}\end{equation*}
where $\sigma _{\Sigma }^{2}$ is the equivalent noise power, $\sigma _{h}^{2}[n]$ measures the variation of the subspace coefficients, and $\Omega$ is the data-dependent matrix defined in (A.11).
To minimize $J_{MSD}[n]$ with respect to the FF, we set the following derivative to zero:\begin{equation*} \frac {\partial J_{MSD}[n]}{\partial \lambda [n]}=-\frac {2\sigma _\Sigma ^{2}\text {Tr}(\Omega)}{(1+\lambda [n])^{2}}+\frac {2\lambda [n]\sigma _{h}^{2}[n]}{(1-\lambda [n])^{3}}.\tag{27}\end{equation*}
Solving for the locally optimal FF yields\begin{align*} \lambda [n]=&\frac {\mu -1}{\mu +1}, \quad \text {if} \quad \lambda [n]>0 \\ \mu=&(2\sigma _\Sigma ^{2}[n]\text {Tr}(\Omega)/\sigma _{h}^{2}[n])^{1/3}.\tag{28}\end{align*}
The subspace-variation power $\sigma _{h}^{2}[n]$ can be estimated recursively as\begin{align*}\hat {\sigma }_{h}^{2}[n+1] =\lambda [n]\hat {\sigma }_{h}^{2}[n]+(1-\lambda [n])\sum \nolimits _{k=1}^{K}(\hat {{\boldsymbol {h}}}_{k}^{(1)\mathsf{H}}[n]\hat {{\boldsymbol {h}}}_{k}^{(1)}[n]),\tag{29}\end{align*}
where $\hat {{\boldsymbol {h}}}_{k}^{(1)}[n]$ is an estimate of the derivative of the $k$-th coefficient vector,
or, more conveniently, in terms of the change in the subspace estimate as\begin{equation*} \hat {\sigma }_{h}^{2}[n+1]=\lambda [n]\hat {\sigma }_{h}^{2}[n]+(1-\lambda [n])\Vert \Delta {\boldsymbol {W}}[n]\Vert _{F}^{2},\tag{30}\end{equation*}
where $\Delta {\boldsymbol {W}}[n]={\boldsymbol {W}}[n]-{\boldsymbol {W}}[n-1]$.
The noise power is estimated with a long-term FF $\lambda _{L}$ as\begin{equation*} \hat {\sigma }_\Sigma ^{2}[n+1]=\lambda _{L}\hat {\sigma }_\Sigma ^{2}[n]+(1-\lambda _{L})\Vert {\boldsymbol {e}}[n]\Vert ^{2},\tag{31}\end{equation*}
where ${\boldsymbol {e}}[n]$ is the a priori estimation error.
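To make the update concrete, a numpy sketch of (28)-(31) is given below. The clipping bounds and the small constant guarding the division are implementation choices of ours and are not part of the derivation; (28) itself only requires $\lambda[n]>0$.

```python
import numpy as np

def local_opt_ff(sigma_Sigma2, sigma_h2, trace_Omega,
                 lam_min=0.5, lam_max=0.999, eps=1e-12):
    """Locally optimal forgetting factor, cf. (28)."""
    mu = (2.0 * sigma_Sigma2 * trace_Omega / (sigma_h2 + eps)) ** (1.0 / 3.0)
    lam = (mu - 1.0) / (mu + 1.0)
    return float(np.clip(lam, lam_min, lam_max))

def update_powers(sigma_h2, sigma_Sigma2, lam, lam_L, dW, e):
    """Recursive power estimates, cf. (30) and (31).
    dW : change in the subspace estimate W[n] - W[n-1]
    e  : a priori estimation error vector"""
    sigma_h2 = lam * sigma_h2 + (1.0 - lam) * np.linalg.norm(dW, 'fro') ** 2
    sigma_Sigma2 = lam_L * sigma_Sigma2 + (1.0 - lam_L) * np.linalg.norm(e) ** 2
    return sigma_h2, sigma_Sigma2
```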
Using the VFF, one can vary the FF according to (28) in the recursive EIV algorithm in Tables 1 and 2 for tracking the subspace and hence computing $\hat {{\boldsymbol {A}}}[n]$ and $\hat {{\boldsymbol {C}}}[n]$.
The convergence of the recursive RSI algorithm has been studied in [42]. Since we are using a VFF SREIV-PAST algorithm instead of the conventional PAST algorithm, we study the convergence of the VFF EIV method in the supplementary material using the ordinary differential equation (ODE) method [40]. It is found that, under reasonable assumptions and conditions (C1) and (C2), the algorithm converges to the true subspace in a stationary environment if the input is persistently exciting.
[Figure: Estimated denominator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI, and SC-ELF-ILF-RSI algorithms under 17 dB SNR. 'ECF' and 'ELF' denote the SREIV-PAST algorithm with constant FF and local optimal FF, respectively; 'ICF' and 'ILF' denote the recursive algorithm for estimating the input-related matrices with constant FF and local optimal FF, respectively.]
[Figure: Estimated numerator coefficients using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, ELF-ILF-RSI, SC-ELF-ILF-RSI, PI-ECF-ICF-RSI, and PI-SC-ELF-ILF-RSI algorithms under 17 dB SNR.]
For time-varying subspaces, the MSE depends mainly on the subspace variation, and the estimate may not converge to a fixed value. In the supplementary material, the stability of the proposed VFF algorithm is analyzed. As expected, the algorithm remains stable if the system is persistently excited, i.e., if the correlation matrices involved remain of full rank.
Next, we consider the estimation of ${\boldsymbol {B}}$ and ${\boldsymbol {D}}$.
B. The VFF RLS Algorithm for Estimating B and D
Since the estimation of ${{\theta}}[n]$ in (21) is a multivariate RLS problem, the LPM-based VFF framework above can be applied analogously. The MSD under the LPM modeling of ${{\theta}}[n]$ can be approximated as\begin{equation*} \tilde {J}_{MSD}(n)\approx \frac {1-\tilde {\lambda }[n]}{1+\tilde {\lambda }[n]}\tilde {\sigma }_{\Sigma }^{2}\text {Tr}({\boldsymbol {R}}_{\Phi \Phi }^{-1})+\frac {\tilde {\lambda }^{2}[n]\sigma _\theta ^{2}[n]}{(1-\tilde {\lambda }[n])^{2}},\tag{32}\end{equation*}
where $\tilde {\sigma }_{\Sigma }^{2}$ and $\sigma _\theta ^{2}[n]$ are the counterparts of $\sigma _{\Sigma }^{2}$ and $\sigma _{h}^{2}[n]$ for the parameter vector ${{\theta}}[n]$.
Minimizing (32) with respect to $\tilde {\lambda }[n]$ gives the locally optimal FF\begin{align*} \tilde {\lambda }_{opt}[n]=&\frac {\tilde {\mu }-1}{\tilde {\mu }+1}, \quad \text {if} \quad \tilde {\lambda }_{opt}[n]>0 \\ \tilde {\mu }=&(2\tilde {\sigma }_\Sigma ^{2}[n]\text {Tr}({\boldsymbol {R}}_{\Phi \Phi }^{-1})/\sigma _\theta ^{2}[n])^{1/3}.\tag{33}\end{align*}
C. Model Order Estimation for the Proposed LOFF-EIV RSI Algorithms
1) The Proposed Model Order Estimation for A
With the LOFF-recursive EIV algorithm, we can obtain the subspace estimate ${\boldsymbol {W}}[n]$ recursively at each time instant.
In some real applications, one may wish to estimate the order of the matrix ${\boldsymbol {A}}$ online, since it is not known a priori.
Classical model order estimation algorithms such as the Akaike information criterion (AIC) and minimum description length (MDL) usually assume that the minor singular values are equal. However, this assumption may not always be met in practice. Here, we shall investigate another hypothesis proposed in [43], which aims to find the model order such that the proportion of variance explained by the minor components is less than a certain value.
Let $\hat {l}_{1}\geq \hat {l}_{2}\geq \cdots \geq \hat {l}_{P}$ denote the estimated singular values. The proportion of variance explained by the first $K$ components is\begin{equation*} \hat {\Psi }_{K}=\left({\sum \nolimits _{k=1}^{K}\hat {l}_{k}}\right)/\left({\sum \nolimits _{p=1}^{P}\hat {l}_{p}}\right)\tag{34}\end{equation*}
The asymptotic variance of $\hat {\Psi }_{K}$ is given by [43]\begin{equation*} \tau _{K}^{2}=\frac {2\text {Tr}({\boldsymbol {A}}^{2})}{(N-1)(\text {Tr}(\boldsymbol{D}))^{2}}[\Psi _{K}^{2}-2\alpha \Psi _{K}+\alpha]\tag{35}\end{equation*}
so that an approximate confidence interval for $\Psi _{K}$ is\begin{equation*} \left [{\hat {\Psi }_{K}-\chi \sqrt {\tau _{K}^{2}/N},\hat {\Psi }_{K}+\chi \sqrt {\tau _{K}^{2}/N}}\right]\tag{36}\end{equation*}
where $\chi$ is the appropriate quantile of the standard normal distribution.
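A minimal numpy sketch of this explained-variance rule is given below, assuming the singular values $\hat l_k$ from the bi-iteration and an illustrative threshold of 95% (our choice); the confidence interval (36) can then be used to assess the reliability of the selected order.

```python
import numpy as np

def order_by_explained_variance(singular_values, threshold=0.95):
    """Smallest order K whose leading components explain at least
    `threshold` of the total variance, cf. (34)."""
    l = np.asarray(singular_values, dtype=float)
    psi = np.cumsum(l) / np.sum(l)          # \hat{Psi}_K for K = 1..P
    return int(np.searchsorted(psi, threshold) + 1)
```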
2) Automatic Model Estimation of B and D Using SCAD Regularization
For the matrices ${\boldsymbol {B}}$ and ${\boldsymbol {D}}$, we consider the following SCAD-regularized LS criterion:\begin{equation*} E_{SC}(\theta)=\sum \nolimits _{i=1}^{n}\lambda _{n,i}(y[i]-\Phi ^{\mathsf {H}}[i]\theta)^{2}+\sum \nolimits _{i=1}^{L}f_\mu (\theta _{i}),\tag{37}\end{equation*}
where $L$ is the number of parameters and $\lambda _{n,i}$ are the exponential weights.
The SCAD penalty function is\begin{align*} f_\mu (\theta) = \begin{cases} \displaystyle {\mu \vert \theta \vert,} & {\text {for}\quad \vert \theta \vert \leq \mu } \\ \displaystyle {-\frac {(\vert \theta \vert -\tilde {a}\mu)^{2}}{2(\tilde {a}-1)}+\frac {(\tilde {a}+1)\mu ^{2}}{2},} & {\text {for}\quad \mu < \vert \theta \vert \leq \tilde {a}\mu } \\ \displaystyle {(\tilde {a}+1)\mu ^{2}/2,} & {\text {for}\quad \vert \theta \vert >\tilde {a}\mu } \end{cases}\tag{38}\end{align*}
where $\tilde {a}>2$ is a prescribed constant and $\mu$ is the regularization parameter.
The regularization parameter is chosen adaptively as\begin{equation*} \mu [n]=(1-\tilde {\lambda }[n])L\bar {\sigma }_\phi ^{2}[n]\sqrt {\tilde {\sigma }_\Sigma ^{2}[n]/(\text {Tr}({\boldsymbol {R}}_{\Phi \Phi })+\epsilon)M_{1}},\tag{39}\end{equation*}
where $\epsilon$ is a small positive constant that prevents division by zero.
By setting the derivative of (37) with respect to $\theta$ to zero, one obtains\begin{align*} (\tilde {\lambda }[n]{\boldsymbol {R}}_{\Phi \Phi }[n-1]+\phi [n]\phi ^{\mathsf {H}}[n]+{\boldsymbol {K}}[n]){{\theta}}_{opt}[n]={\boldsymbol {r}}_{\Phi Y}[n]\tag{40}\end{align*}
where\begin{align*} {\boldsymbol {K}}[n]=&\text {diag}(\kappa _{1}[n],\ldots,\kappa _{L}[n]), \tag{41a}\\ \kappa _{i}[n]=&f_\mu '(\theta _{i}[n-1])/\theta _{i}[n-1], \tag{41b}\\ f_\mu '(\theta)=&\begin{cases} \displaystyle {\text {sgn}(\theta)\mu,} & {\text {for}\quad \vert \theta \vert \leq \mu } \\ \displaystyle {\frac {-\theta +\text {sgn}(\theta)\tilde {a}\mu }{\tilde {a}-1},} & {\text {for}\quad \mu < \vert \theta \vert \leq \tilde {a}\mu } \\ \displaystyle {0,} & {\text {for}\quad \vert \theta \vert >\tilde {a}\mu.} \end{cases}\tag{41c}\end{align*}
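For concreteness, the SCAD penalty (38) and its derivative (41c) can be coded as follows for real-valued parameters; the choice $\tilde{a}=3.7$ follows the value commonly suggested in the SCAD literature and is used here only for illustration.

```python
import numpy as np

def scad_penalty(theta, mu, a=3.7):
    """SCAD penalty f_mu(theta), cf. (38), for real-valued theta."""
    t = np.abs(theta)
    return np.where(
        t <= mu, mu * t,
        np.where(t <= a * mu,
                 -(t - a * mu) ** 2 / (2.0 * (a - 1.0))
                 + (a + 1.0) * mu ** 2 / 2.0,
                 (a + 1.0) * mu ** 2 / 2.0))

def scad_derivative(theta, mu, a=3.7):
    """Derivative f'_mu(theta), cf. (41c); used to build K[n] in (41a)-(41b)."""
    t = np.abs(theta)
    s = np.sign(theta)
    return np.where(
        t <= mu, s * mu,
        np.where(t <= a * mu, (-theta + s * a * mu) / (a - 1.0), 0.0))
```

Note that the penalty is constant for $\vert\theta\vert>\tilde{a}\mu$, so large coefficients incur no shrinkage, which is why SCAD is asymptotically unbiased.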
We note that the desired solution ${{\theta}}_{opt}[n]$ of (40) can still be computed recursively, since ${\boldsymbol {K}}[n]$ is diagonal and varies slowly over time; in our implementation it is incorporated into the QRD-based RLS update.
3) Arithmetic Complexity Comparison
The arithmetic complexities of various algorithms are compared in Table 4. Here, for notational convenience, we shall use 'ECF' and 'ELF' to denote the SREIV-PAST algorithm with constant and local optimal FF, respectively. 'ICF' and 'ILF' are used to denote respectively the recursive algorithm for estimating the input-related matrices ${\boldsymbol {B}}$ and ${\boldsymbol {D}}$ with constant and local optimal FF.
The per-iteration complexities of the SREIV-PAST and multivariate RLS algorithms are given in Table 4.
Numerical Examples
We now evaluate the performance of the proposed algorithms by comparing them with other conventional approaches using both simulated and real measurement data. Both stationary and dynamic environments are investigated. The SISO and MIMO systems tested are widely used in the literature for comparison [33], [34], [42]. For simplicity, we follow the notations for the proposed algorithms and the compared approaches in Section III-C.3 and Table 4.
A. A SISO Time-Invariant System
We first present simulation results on the following SISO time-invariant system studied in [33]:\begin{equation*} y[n]=\frac {q^{-1}-0.5q^{-2}}{1-1.5q^{-1}+0.7q^{-2}}u[n]+\frac {1}{1-0.8q^{-1}}e[n],\tag{42}\end{equation*}
where $q^{-1}$ denotes the unit delay operator and $e[n]$ is a white noise process.
A state-space realization of (42) is\begin{align*} {\boldsymbol {x}}[n+1]= \begin{bmatrix} 1.5 & -0.7 \\ 1 & 0 \\ \end{bmatrix}{\boldsymbol {x}}[n]+ \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} u[n], \tag{43a}\\ y[n]= \begin{bmatrix} 1 & -0.5 \\ \end{bmatrix}{\boldsymbol {x}}[n]+ \begin{bmatrix} 0 \\ \end{bmatrix}u[n]+v[n]\tag{43b}\end{align*}
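For reference, a short numpy sketch that generates test data from the realization (43); the simulation helper, zero initial state, and input handling are our own choices.

```python
import numpy as np

def simulate_ss(A, B, C, D, u, v=None, x0=None):
    """Simulate x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n] + v[n],
    cf. (1) and (43). u has shape (n_samples, n_inputs)."""
    n_samples = u.shape[0]
    x = np.zeros(A.shape[0]) if x0 is None else x0.copy()
    y = np.zeros((n_samples, C.shape[0]))
    for n in range(n_samples):
        y[n] = C @ x + D @ u[n] + (0.0 if v is None else v[n])
        x = A @ x + B @ u[n]
    return y

# The realization (43) of the SISO system (42):
A = np.array([[1.5, -0.7], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, -0.5]])
D = np.array([[0.0]])
```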
Since the model order is unknown, we use our proposed approach to estimate the model order and compare it with well-known criteria such as AIC and MDL. Different model orders of 2, 3, and 4 are used, and the results are shown in Table 5. It can be seen that our proposed approach gives more consistent and accurate estimates than the AIC and MDL criteria for the tested systems.
The comparison of the proposed algorithms with other methods is shown in Fig. 2 and Fig. 3. It can be seen that the proposed ELF-based algorithms converge considerably faster than the RSI algorithm with a fixed FF for denominator coefficient estimation. The use of SCAD regularization also slightly improves the steady-state accuracy. The change of the forgetting factor in the ELF-based algorithms is illustrated in Fig. 4. It can be seen that the value of the FF gradually increases to a large value, which reduces the steady-state MSE. For estimating the numerator coefficients, the ILF-based algorithms converge much faster than the corresponding constant FF-based algorithms. It should be noted that the proposed variable FF can also speed up the convergence of PI-based algorithms. The change of the forgetting factor in the ILF-based algorithms is shown in Fig. 5. The small FF utilized in the initial stage achieves a faster convergence speed, while the large FF used at the steady state yields a smaller MSE.
Next, we compare the steady-state performance of the ECF-ICF-RSI, ELF-ICF-RSI, ECF-ILF-RSI, and SC-ELF-ILF-RSI algorithms in a stationary environment. The SNR is kept at 17 dB. The Bode diagrams of the different algorithms at the 1000-th time point are shown in Fig. 6. It can be seen that the SC-ELF-ILF-RSI algorithm gives the smallest deviation from the ground truth. The ELF-based algorithms are more accurate due to the more accurate estimation of the extended observability matrix $\Gamma _\alpha$.
B. SISO Systems With Time-Varying A
Consider the following time-varying system studied in [15], whose state transition matrix is\begin{align*} {\boldsymbol {A}}[n]= \begin{bmatrix} 1.5+0.05\dfrac {e^{-n/1000}-1}{e^{-1}-1} & -0.7 \\ 1 & -0.03\dfrac {e^{-n/1000}-1}{e^{-1}-1} \\ \end{bmatrix}\tag{44}\end{align*}
The output disturbance is generated as\begin{equation*} v[n]=\frac {1}{1-0.8q^{-1}}e[n],\tag{45}\end{equation*}
where $e[n]$ is a white noise process.
The criterion chosen to measure and compare the performance is the subspace angle [45]–[47], in degrees, between the true and estimated extended observability matrices $\Gamma$ and $\hat {\Gamma }$. The principal angles $\theta _{n}$ are defined recursively by\begin{align*} \cos (\theta _{n})=&\max \limits _{{\boldsymbol {u}}\in \Gamma }\max \limits _{{\boldsymbol {v}}\in \hat {\Gamma }}{\boldsymbol {u}}^{\mathsf {T}}{\boldsymbol {v}}={\boldsymbol {u}}_{n}^{\mathsf {T}}{\boldsymbol {v}}_{n}, \\&\text {s.t.} ~\Vert {\boldsymbol {u}}\Vert _{2}^{2}=\Vert {\boldsymbol {v}}\Vert _{2}^{2}=1,\quad {\boldsymbol {u}}^{\mathsf {T}}{\boldsymbol {u}}_{i}=0, \\&\hphantom {\text {s.t.} ~}{\boldsymbol {v}}^{\mathsf {T}}{\boldsymbol {v}}_{i}=0,\quad i=1,2,\ldots,n-1\tag{46}\end{align*}
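Numerically, the subspace angle can be computed from orthonormal bases of the two matrices via an SVD. The sketch below returns the largest principal angle (the convention used by MATLAB's `subspace` function); (46) defines the full sequence of angles.

```python
import numpy as np

def largest_principal_angle_deg(Gamma, Gamma_hat):
    """Largest principal angle (degrees) between the column spaces
    of Gamma and Gamma_hat."""
    Q1, _ = np.linalg.qr(Gamma)
    Q2, _ = np.linalg.qr(Gamma_hat)
    # cosines of the principal angles = singular values of Q1^H Q2
    s = np.linalg.svd(Q1.conj().T @ Q2, compute_uv=False)
    s = np.clip(s, 0.0, 1.0)
    return float(np.degrees(np.arccos(s.min())))
```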
[Figure: Estimated subspace angle using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment.]
To better illustrate the advantage of the proposed VFF algorithms, we also consider a system with poles closer to the origin, similar to the last system:\begin{equation*} {\boldsymbol {A}}[n]= \begin{bmatrix} 1+0.1\dfrac {e^{-n/1000}-1}{e^{-1}-1} & -0.7 \\ 1 & -0.05\dfrac {e^{-n/1000}-1}{e^{-1}-1} \\ \end{bmatrix}\tag{47}\end{equation*}
[Figure: Estimated subspace angle using ECF-ICF-RSI in [22], and the proposed ECF-ILF-RSI, ELF-ICF-RSI, and SC-ELF-ILF-RSI algorithms under 25 dB SNR in a nonstationary environment with poles closer to the origin.]
C. A SISO System With Time-Varying B
Next, we consider a system similar to the last time-varying system but with a time-varying input matrix\begin{equation*} {\boldsymbol {B}}[n]= \begin{bmatrix} 1+0.6\dfrac {e^{-n/1000}-1}{e^{-1}-1} & -0.2\dfrac {e^{-n/1000}-1}{e^{-1}-1} \\ \end{bmatrix}^{\mathsf {T}}.\tag{48}\end{equation*}
D. A MIMO Time-Invariant System
Here, we consider a two-input, two-output MIMO system with the following state-space matrices:\begin{align*} {\boldsymbol{A}}=&\begin{bmatrix} 0.7 &\, 0.7 &\, 0 &\, 0 \\ -0.7 &\, 0.7 &\, 0 &\, 0 \\ 0 &\, 0 &\, -0.7 &\, -0.7 \\ 0 &\, 0 &\, 0.7 &\, -0.7 \\ \end{bmatrix},\quad {\boldsymbol {B}}\!=\!\begin{bmatrix} 1.1650 &\, -0.6965 \\ 0.6268 &\, 1.6961 \\ 0.0751 &\, 0.0591 \\ 0.3561 &\, 1.7971 \\ \end{bmatrix} \\ \tag{49a}\\ {\boldsymbol{C}}=&\begin{bmatrix} 0.2641 & -1.4462 & 1.2460 & 0.5774 \\ 0.8717 & -0.7012 & -0.6390 & -0.3600 \\ \end{bmatrix}, \tag{49b}\\ {\boldsymbol{D}}=&\begin{bmatrix} -0.1356 & -1.2704 \\ -1.3493 & 0.9846 \\ \end{bmatrix}\tag{49c}\end{align*}
The disturbances are generated through an innovations model with gain matrix ${\boldsymbol {K}}$ and noise covariance ${\boldsymbol {R}}_{v}$:\begin{align*} {\boldsymbol{K}}=&\begin{bmatrix} 0.4968 & -0.3580 \\ -0.3312 & -0.0512 \\ 0.1560 & -0.3872 \\ -0.0900 & 0.5836 \\ \end{bmatrix}, \\ {\boldsymbol {R}}_{v}=&\begin{bmatrix} 0.0176 & -0.0267 \\ -0.0267 & 0.0497 \\ \end{bmatrix}.\tag{50}\end{align*}
[Figure: Estimated real part of the poles using ECF-ICF-RSI in [22], and the proposed ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms in a MIMO system.]
[Figure: Estimated imaginary part of the poles using ECF-ICF-RSI in [22], and the proposed ELF-ICF-RSI and SC-ELF-ILF-RSI algorithms in a MIMO system.]
Next, we compare the accuracy of the estimated poles obtained by the ECF-ICF-RSI, ELF-ILF-RSI, and SC-ELF-ILF-RSI algorithms. The eigenvalues of the estimated $\hat {{\boldsymbol {A}}}$ are taken as the pole estimates; their scatter over 100 Monte Carlo runs is shown in the figures below.
[Figure: Estimated poles using the ECF-ICF-RSI algorithm in [22] for a Monte Carlo simulation of size 100.]
[Figure: Estimated poles using the ELF-ILF-RSI algorithm for a Monte Carlo simulation of size 100.]
[Figure: Estimated poles using the SC-ELF-ILF-RSI algorithm for a Monte Carlo simulation of size 100.]
E. Application to Wing Flutter Data
In this section, the proposed algorithms are tested using the wing flutter data [49] from DaISy (Database for the Identification of Systems). The data consist of 1024 time points with a single input and a single output. The input signal is highly colored, and the response is acquired via an accelerometer sensor in real time. The mathematical model of the mechanical system is unknown; using our proposed approach, the estimated model order is 5. We tested different numbers of block rows, and the goodness of fit is measured by\begin{equation*} \text {FIT}=100\times \left ({1-\frac {\Vert y-\hat {y}\Vert }{\Vert y-\bar {y}\Vert }}\right),\tag{51}\end{equation*}
where $\hat {y}$ is the predicted output and $\bar {y}$ is the sample mean of $y$.
The sum of squared errors (SSE) is\begin{equation*} \text {SSE}=\sum \nolimits _{i=1}^{N}(y_{i}-\hat {y}_{i})^{2},\tag{52}\end{equation*}
and the total sum of squares (SST) is\begin{equation*} \text {SST}=\sum \nolimits _{i=1}^{N}(y_{i}-\bar {y}_{i})^{2}.\tag{53}\end{equation*}
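These criteria are straightforward to compute; a small numpy sketch (ours) is given below, with FIT expressed through SSE and SST since $\text{FIT}=100(1-\sqrt{\text{SSE}/\text{SST}})$.

```python
import numpy as np

def fit_percent(y, y_hat):
    """FIT criterion (51); 100 corresponds to a perfect fit."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    sse = np.sum((y - y_hat) ** 2)          # cf. (52)
    sst = np.sum((y - np.mean(y)) ** 2)     # cf. (53)
    return 100.0 * (1.0 - np.sqrt(sse / sst))
```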
It can be seen that the proposed algorithms achieve a higher FIT value (equivalently, a smaller SSE relative to SST) than the conventional algorithms tested.
Conclusion
A new SCAD-regularized LOFF-based algorithm for recursive subspace model identification has been presented. The VFF parameter is obtained by minimizing the MSE of the SREIV-PAST algorithm when the observability matrix is modeled by local polynomial modeling. A SCAD regularization term is also incorporated to reduce the estimation variance of the subspace during signal fading. Experiments under stationary and non-stationary environments show that the proposed algorithms offer considerably improved performance over the conventional and state-of-the-art algorithms tested.
Appendix: Mean Square Deviation (MSD) of the Estimator
As mentioned, our aim is to find the FF that minimizes the MSD of the estimator, which can be written as\begin{equation*} J_{MSD}({\boldsymbol {w}}_{k})=\mathbb{E}\left [{\Vert {\boldsymbol {w}}_{k}[n]-{\boldsymbol {h}}_{k}[n]\Vert _{2}^{2}}\right]. \tag{A.1}\end{equation*}
The estimation error can be decomposed into a bias term ${\boldsymbol {b}}_{k}[n]$ and a zero-mean fluctuation term ${\boldsymbol {v}}_{k}[n]$:\begin{equation*} {\boldsymbol {w}}_{k}[n]-{\boldsymbol {h}}_{k}[n]={\boldsymbol {b}}_{k}[n]+{\boldsymbol {v}}_{k}[n] \tag{A.2}\end{equation*}
Since the two terms are uncorrelated, the MSD splits as\begin{equation*} J_{MSD}({\boldsymbol {w}}_{k})=\mathbb{E}\left [{\Vert {\boldsymbol {b}}_{k}[n]\Vert _{2}^{2}}\right]+\mathbb{E}\left [{\Vert {\boldsymbol {v}}_{k}[n]\Vert _{2}^{2}}\right]. \tag{A.3}\end{equation*}
Taking the expectation of (25) gives\begin{equation*} \mathbb{E}\left [{{\boldsymbol {w}}_{k}[n]}\right]={\boldsymbol {h}}_{k}[n]+{\boldsymbol {B}}[n]\bar {{\boldsymbol {h}}}_{k}^{(1)}[n], \tag{A.4}\end{equation*}
so that ${\boldsymbol {b}}_{k}[n]={\boldsymbol {B}}[n]\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]$.
Moreover, the matrix ${\boldsymbol {R}}_\tau [n]$ associated with ${\boldsymbol {B}}[n]$ can be evaluated asymptotically as\begin{align*} {\boldsymbol {R}}_\tau [n]\mathop = \limits ^{n \to \infty }&{\boldsymbol {G}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {D}}_\tau [n]{\boldsymbol {G}}[n] \\\mathop =\limits ^{n \to \infty }&-(1-\lambda [n])^{-1}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}\sum \nolimits _{i=1}^{n}\lambda ^{n-i}[n]{{\xi}}[i]{\boldsymbol {g}}^{\mathsf {H}}[i](n-i) \\\mathop \approx \limits ^{n \to \infty }&-(1-\lambda [n])^{-1}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}\sum \nolimits _{i=0}^{n-1} i\lambda ^{i}[n]{\boldsymbol {C}}_{\xi g} \\\mathop = \limits ^{n \to \infty }&-(1-\lambda [n])^{-1}\lambda [n] \\&\times \,\left ({\frac {1 -\lambda ^{n-1}[n]}{(1- \lambda [n])^{2}} - \frac {(n - 1){\lambda ^{n-1}}[n]}{1-\lambda [n]}}\right){\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g} \\\mathop \approx \limits ^{n \to \infty }&- \frac {\lambda [n]}{(1-\lambda [n])^{3}}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g}. \tag{A.5}\end{align*}
Using (A.5) and the above results for ${\boldsymbol {B}}[n]$, the bias term can be evaluated as\begin{equation*} {\boldsymbol {b}}_{k}[n]={\boldsymbol {B}}[n]\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]\mathop = \limits ^{n \to \infty } -\frac {\lambda [n]}{1-\lambda [n]}\bar {{\boldsymbol {h}}}_{k}^{(1)}[n]. \tag{A.6}\end{equation*}
Hence,\begin{equation*} \mathbb{E}\left [{\Vert {\boldsymbol {b}}_{k}[n]\Vert _{2}^{2}}\right]=\frac {\lambda ^{2}[n]}{(1-\lambda [n])^{2}}\zeta _{k}, \tag{A.7}\end{equation*}
where $\zeta _{k}=\Vert \bar {{\boldsymbol {h}}}_{k}^{(1)}[n]\Vert _{2}^{2}$.
The fluctuation term is\begin{equation*} {\boldsymbol {v}}_{k}[n]={\boldsymbol {C}}[n]({{\eta}}_{k}[n]+{{\varsigma}}_{k}[n])+{\boldsymbol {B}}[n]\delta {\boldsymbol {h}}_{k}^{(1)}[n], \tag{A.8}\end{equation*}
so that\begin{align*} \mathbb{E}\left [{\Vert {\boldsymbol {v}}_{k}[n]\Vert _{2}^{2}}\right]=&\mathbb{E}\left [{\text {Tr}({\boldsymbol {C}}^{\mathsf {H}}[n]{\boldsymbol {C}}[n]{{\eta}}_{k}[n]{{\eta}}_{k}^{\mathsf {H}}[n])}\right] \\&+\,\mathbb{E}\left [{\text {Tr}({\boldsymbol {C}}^{\mathsf {H}}[n]{\boldsymbol {C}}[n]{{\varsigma}}_{k}[n]{{\varsigma}}_{k}^{\mathsf {H}}[n])}\right] \\&+\,\mathbb{E}\left [{\text {Tr}({\boldsymbol {B}}^{\mathsf {H}}[n]{\boldsymbol {B}}[n]\delta {\boldsymbol {h}}_{k}^{(1)}[n]\delta {\boldsymbol {h}}_{k}^{(1)\mathsf {H}}[n])}\right].\tag{A.9}\end{align*}
Under the assumptions in Section III-A, this evaluates to\begin{align*}\mathbb{E}\left [{\Vert {\boldsymbol {v}}_{k}[n]\Vert _{2}^{2}}\right] =\frac {K\sigma _{\delta h_{k}}^{2}[n]\lambda ^{2}[n]}{(1-\lambda [n])^{2}}+\frac {1-\lambda [n]}{1+\lambda [n]}(\sigma _{\eta _{k}}^{2}+\sigma _{r_{k}}^{2}\text {Tr}({\boldsymbol {C}}_{\xi \xi }))\text {Tr}(\Omega), \tag{A.10}\end{align*}
where\begin{align*}&\mathbb{E}\left [{\text {Tr}({\boldsymbol {C}}^{\mathsf {H}}[n]{\boldsymbol {C}}[n])}\right] \\=&\text {Tr}(({\boldsymbol {C}}_{\Xi G}^{\mathsf {H}}[n]{\boldsymbol {C}}_{\Xi G}[n])^{-\mathsf {H}}({\boldsymbol {C}}_{\Xi G}^{\mathsf {H}}[n]{\boldsymbol {C}}_{\Xi G}[n])^{-1} \\&\cdot \,{\boldsymbol {G}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda ^{2}[n]{{\Xi}}[n]{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda [n]{\boldsymbol {G}}[n]) \\\mathop = \limits ^{n \to \infty }&(1-\lambda [n])^{2}\text {Tr}(({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-1}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{{\Xi}}^{\mathsf {H}}[n]{\boldsymbol {D}}_\lambda ^{2}[n]{{\Xi}}[n] \\&\cdot \,{\boldsymbol {C}}_{\xi g}({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-\mathsf {H}}) \\\mathop = \limits ^{n \to \infty }&\frac {1-\lambda [n]}{1+\lambda [n]}\text {Tr}(({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-1}{\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi \xi }{\boldsymbol {C}}_{\xi g}({\boldsymbol {C}}_{\xi g}^{\mathsf {H}}{\boldsymbol {C}}_{\xi g})^{-\mathsf {H}})=\frac {1-\lambda [n]}{1+\lambda [n]}\text {Tr}(\Omega),\tag{A.11}\end{align*}
which defines the matrix $\Omega$.
Consequently,\begin{equation*} J_{MSD}({\boldsymbol {w}}_{k}[n])\approx \frac {1-\lambda [n]}{1+\lambda [n]}\sigma _{\Sigma _{k}}^{2}\text {Tr}(\Omega)+\frac {\lambda ^{2}[n]\sigma _{h_{k}}^{2}[n]}{(1-\lambda [n])^{2}},\tag{A.12}\end{equation*}
where $\sigma _{\Sigma _{k}}^{2}=\sigma _{\eta _{k}}^{2}+\sigma _{r_{k}}^{2}\text {Tr}({\boldsymbol {C}}_{\xi \xi })$.
Averaging over the $K$ coefficient vectors and absorbing the per-vector quantities into $\sigma _{\Sigma }^{2}$ and $\sigma _{h}^{2}[n]$ yields (26):\begin{equation*} J_{MSD}[n]\approx \frac {1-\lambda [n]}{1+\lambda [n]}\sigma _{\Sigma }^{2}\text {Tr}(\Omega)+\frac {\lambda ^{2}[n]\sigma _{h}^{2}[n]}{(1-\lambda [n])^{2}}. \tag{A.13}\end{equation*}