Introduction
Natural gas network state estimation is an integral part of natural gas system modeling. As a typical clean energy source, natural gas has been adopted in a growing range of industrial fields, so maintaining the stable operation of the natural gas system is of great significance. With the continuous expansion of its application scope, the network structure and operation modes of the natural gas system have become diversified and complex. Accordingly, accurate and effective natural gas system modeling is indispensable. More precisely, state estimation of the natural gas network model is the basis of natural gas system modeling [1].
For the state estimation of natural gas networks, existing research focuses on the physical model of the network and converts the state estimation problem into an optimization problem. In addition, numerical methods such as the finite difference method and the finite element method are employed to linearize the dynamic equations of gas flow. In [2], [3], and [4], the least squares (LS) method is used to solve the state estimation optimization problem. In [5], [6], and [7], optimization-based state estimation methods were presented for the pressure and flow changes at the inlet and outlet of the pipeline caused by large and increasing natural gas consumption. Apart from optimization-based state estimation, some studies focus on linearizing the natural gas flow equations using the finite difference method and the finite element method [8], [9], [10]. Among them, a cascade control algorithm that monitors natural gas pipeline pressure on the basis of a state space model was established in [8]. In [9], a general modeling method for gas transmission pipeline networks based on transfer functions was developed. All of the above state estimation methods concentrate on pipeline pressure and flow. Moreover, some studies took into account the impact of temperature change on natural gas flow in the pipeline. During natural gas transmission, the main factors giving rise to temperature changes are changes in potential and kinetic energy, as well as friction between the gas and the internal pipe wall. In [11] and [12], the natural gas temperature was taken as a measurement to estimate the influence of temperature change on the overall flow of natural gas in the pipeline. Furthermore, a dissipative finite volume discretization scheme was presented to handle the isothermal flow equation of natural gas [13].
These temperature-oriented methods mainly concern the impact of temperature on natural gas flow; compared with pipeline pressure and flow, however, the influence of temperature on the natural gas network is relatively minor.
All the methods mentioned above were based on the transient natural gas flow equation for state estimation. Nonetheless, because of the computational complexity of partial differential equations (PDEs), approximate calculation is normally adopted, and the estimation result is inaccurate to a certain extent. Accordingly, these methods are applicable only to small-scale, simple natural gas systems, and they neglect measurement errors. As large-scale and complex natural gas networks emerged, a series of data-based methods were gradually developed to address the state estimation problem of natural gas systems.
In comparison with methods based on the physical model of natural gas, data-based methods are widely applicable: they are computationally efficient, do not depend entirely on the physical model, do not even need to strictly infer the model structure, and can still estimate the natural gas system state accurately. For example, deep learning was used to monitor the dynamic state of a pipe network in [14]. In [15], an operator splitting method for simulating isothermal compressible natural gas flow in gas transmission pipelines was proposed. Although [14] and [15] overcame the complex modeling problem of the physical model, both ignored measurement errors, so accurate data were required to construct effective models. In practice, measurement errors resulting from technology, equipment, and other factors are inevitable during data acquisition. Data-based state estimation methods rely heavily on the characteristics of the data itself, so the accuracy of the model is determined jointly by the input and output measurements. The above methods considered only the output errors in the modeling process, so the remaining errors may affect the whole model and further reduce its accuracy.
To tackle the problem of measurement errors, several Kalman filter based methods have been employed in state estimation. These methods use finite element numerical approximation to deal with the PDEs while accounting for measurement errors in the modeling; they include the heuristic distributed filtering method for high-pressure, long-distance natural gas networks [16], the low-rank Kalman filter based on projection reduction [17], the robust Kalman filter [18], and the discrete Kalman filter [19]. These Kalman filter based methods separate the noise from the measurement data by filtering the measurement noise. Although the random error is partly reduced, the different modes and intensities of noise processing mean that the data obtained after processing may suffer information loss or redundancy, which may change the characteristics of the real data. Such methods are therefore vulnerable to bad data and may not be robust to it, which leads to error superposition and propagation. As far as natural gas network state estimation is concerned, few methods other than the Kalman filter concentrate on measurement errors.
Aiming at effective estimation of natural gas networks with measurement errors in both input and output, this paper proposes a data-driven framework that combines maximum likelihood estimation under input measurement noise with weighted low-rank approximation to estimate nodal pressure and pipeline flow in natural gas networks. The gas flow characteristic equation of the natural gas system is transformed into a transfer matrix with input and output ports, the original problem is formulated as a weighted low-rank approximation problem, and Naive Riemannian Stochastic Descent [20] is employed to solve this optimization problem. To investigate the validity and feasibility of the proposed model, experiments are conducted on a 10-node natural gas network. To observe the estimation results under different noise levels, noise of different levels is added to the measurement values, and the convergence over different numbers of iterations is verified at each noise level. A comparison of the estimation results for different initial values confirms that the presented method is largely unaffected by the initial values of the iteration. The method is then compared with Newton’s method [21] under different numbers of training samples to illustrate its performance from different aspects. In contrast to existing data-based methods, the method in this paper makes the following contributions:
A method combining maximum likelihood estimation and weighted low-rank approximation is presented to estimate the state of a natural gas network with measurement errors in both input and output measurements. No filtering, denoising, or other noise processing is needed, which preserves the complete information of the real data, unlike other noise processing methods.
The original state estimation problem is transformed into an optimization problem that searches along the tangent direction map in the orthogonal complement space. This not only shrinks the search space but also determines a search direction that differs from a random descent direction, so that the initial value has little influence on the result of the iteration and the estimated parameters can be obtained without complex calculation.
The state estimation parameters can be updated in time as the network structure changes.
The remainder of the paper is organized as follows: Section II provides the problem statement for a natural gas network. Section III presents the state estimation model of natural gas network with measurement errors and the related optimization problem solution algorithm. Section IV validates the effectiveness and accuracy of the proposed method via different experiments. Section V concludes the paper.
Problem Statements
The natural gas network is mainly composed of pipelines, nodes, and compressors. Compressors typically appear only in large-scale natural gas networks and are commonly not taken into account in conventional natural gas networks with relatively stable pressure and gas mass flow. A schematic diagram of a natural gas pipeline network is shown in Figure 1. Nodes are connected to the demand side and the supply side, respectively. In addition, nodes are connected to one another through natural gas pipelines, and the natural gas flowing in the pipelines is supplied to the demand side through the nodes.
Normally, it is assumed that the temperature of natural gas flow in the pipeline is equal to the ambient temperature. In practice, because the natural gas load and supply change constantly, steady operation is hard to maintain; consequently, the simplified mass conservation and momentum conservation equations are employed to describe the dynamic characteristics of natural gas systems [22], [23]. These equations express the transient behavior of natural gas flow in the pipeline, and the dynamic natural gas system can be described as: \begin{align*} \frac {\partial \left ({\pi }\right)}{\partial t}+\frac {ZRT}{S}\cdot \frac {\partial \left ({{\dot {G}} }\right)}{\partial L}&=0 \tag{1a}\\ \frac {\partial \left ({\pi }\right)}{\partial L}+\frac {f\cdot ZRT\cdot \dot {G}\left |{ {\dot {G}} }\right |}{2dS^{2}\pi }&=0 \tag{1b}\end{align*}
Since the above physical model is nonlinear, it is difficult to solve the PDEs directly. The PDEs therefore need to be discretized; here, the finite difference method is used to convert them into algebraic difference equations. We suppose that the mass flow direction does not change in the natural gas pipeline. With time step $\Delta t$ and spatial step $\Delta L$, the discretized equations are: \begin{align*} \frac {\Delta \pi _{i+1}^{t} -\Delta \pi _{i+1}^{t-1}}{\Delta t}+\frac {ZRT}{S}\cdot \frac {\Delta \dot {G}_{i+1}^{t} -\Delta \dot {G}_{i}^{t}}{\Delta L}&=0 \tag{2a}\\ \frac {\Delta \pi _{i+1}^{t} -\Delta \pi _{i}^{t}}{\Delta L}+\frac {f\cdot ZRT\cdot \dot {G}_{st}}{2dS^{2}\pi _{st}}\cdot \frac {\Delta \dot {G}_{i+1}^{t} +\Delta \dot {G}_{i}^{t}}{2}&=0 \tag{2b}\end{align*}
\begin{align*} \left [{ {\begin{array}{l} \Delta \pi _{i+1}^{t} \\ \Delta \dot {G}_{i+1}^{t} \\ \end{array}} }\right]=A\left [{ {\begin{array}{l} \frac {\Delta \pi _{i+1}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \dot {G}_{i}^{t}}{S\cdot \Delta L} \\ \frac {\Delta \pi _{i}^{t}}{\Delta L}-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \dot {G}_{i}^{t}}{4dS^{2}\pi _{st}} \\ \end{array}} }\right] \tag{3}\end{align*}
\begin{equation*} \sum \limits _{i\to } {\dot {G}_{i,j}} -\sum \limits _{\to i} {\dot {G}_{k,i} } +\dot {G}_{i}^{load} -\dot {G}_{i}^{inject} =0 \tag{4}\end{equation*}
For a natural gas network with $N_{n}$ nodes, the pipeline connecting nodes $i$ and $j$ satisfies: \begin{align*} \left [{{\begin{array}{l} \Delta \pi _{j,L}^{t} \\ \Delta \dot {G}_{j,L}^{t} \\ \end{array}} }\right]&=A\left [{ {\begin{array}{l} \frac {\Delta \pi _{j,L}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \dot {G}_{i,0}^{t}}{S\cdot L_{i,j}} \\ \frac {\Delta \pi _{i,0}^{t}}{L_{i,j}}-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \dot {G}_{i,0}^{t}}{4dS^{2}\pi _{st}} \\ \end{array}} }\right], \\ i,j&\in \left \{{{1,\ldots,N_{n}} }\right \} \tag{5}\end{align*}
\begin{align*} A&=\left [{\begin{array}{cc} A_{11} & A_{12} \\ A_{21} & A_{22} \end{array}}\right] \tag{6a}\\ &\begin{cases} A_{11} =\dfrac {f\cdot \dot {G}_{st} \cdot \Delta t\cdot L_{i,j}^{2}}{A_{0}} \\ A_{12} =\dfrac {-4d\cdot S\cdot \pi _{st} \cdot \Delta t\cdot L_{i,j}}{A_{0}} \\ A_{21} =\dfrac {-4d\cdot S^{2}\cdot \pi _{st} \cdot \Delta t\cdot L_{i,j}}{ZRT\cdot A_{0}} \\ A_{22} =\dfrac {4d\cdot S^{2}\cdot \pi _{st} \cdot L_{i,j}^{2}}{ZRT\cdot A_{0}} \end{cases}\tag{6b}\\ A_{0} &=f\cdot \dot {G}_{st} \cdot L_{i,j}^{2} -4d\cdot S\cdot \pi _{st} \cdot \Delta t \tag{6c}\end{align*}
\begin{align*} Y&=AX \tag{7}\\[-2pt] Y&=\left [{ {\Delta \pi _{j,L}^{t},\Delta \dot {G}_{j,L}^{t}} }\right]^{\mathrm{ T}} \tag{8}\end{align*}
\begin{align*} X&=\left [{ {b_{i,j},c_{i,j}} }\right]^{\mathrm{ T}} \tag{9a}\\[-2pt] b_{i,j} &=\frac {\Delta \pi _{j,L}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \dot {G}_{i,0}^{t}}{S\cdot L_{i,j}} \tag{9b}\\[-2pt] c_{i,j} &=\frac {\Delta \pi _{i,0}^{t}}{L_{i,j}}-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \dot {G}_{i,0}^{t}}{4d\cdot S^{2}\cdot \pi _{st}} \tag{9c}\end{align*}
\begin{align*} &\hspace {-0.1pc}X= \\[-2pt] &\left [{ {\frac {\Delta \pi _{j,L}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \dot {G}_{i,0}^{t}}{S\cdot L_{i,j}},\frac {\Delta \pi _{i,0}^{t}}{L_{i,j} }-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \dot {G}_{i,0}^{t} }{4dS^{2}\pi _{st}}} }\right]^{\mathrm{ T}} \\[-2pt] \tag{10}\end{align*}
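As a consistency check on Eq. (6), the following Python sketch builds $A$ from its closed-form entries and compares it with the inverse of the coefficient matrix that multiplies $[\Delta \pi_{j,L}^{t}, \Delta \dot{G}_{j,L}^{t}]^{\mathrm{T}}$ when Eqs. (2a) and (2b) are written with $\Delta L = L_{i,j}$. The function names, the grouping of $Z$, $R$, and $T$ into the single argument `ZRT`, and all numerical values are illustrative assumptions, not data from the 10-node test system.

```python
import numpy as np


def transfer_matrix(f, G_st, pi_st, d, S, ZRT, dt, L):
    """Return the 2x2 matrix A of Eq. (6) for one pipeline of length L."""
    A0 = f * G_st * L**2 - 4.0 * d * S * pi_st * dt  # Eq. (6c)
    if A0 == 0.0:
        raise ValueError("A0 is zero, so the coefficient matrix is singular")
    A11 = f * G_st * dt * L**2 / A0
    A12 = -4.0 * d * S * pi_st * dt * L / A0
    A21 = -4.0 * d * S**2 * pi_st * dt * L / (ZRT * A0)
    A22 = 4.0 * d * S**2 * pi_st * L**2 / (ZRT * A0)
    return np.array([[A11, A12], [A21, A22]])


def coefficient_matrix(f, G_st, pi_st, d, S, ZRT, dt, L):
    """Coefficients of the unknowns in Eqs. (2a)-(2b), taking Delta L = L."""
    k_half = f * ZRT * G_st / (4.0 * d * S**2 * pi_st)
    return np.array([[1.0 / dt, ZRT / (S * L)],
                     [1.0 / L, k_half]])


if __name__ == "__main__":
    # Hypothetical pipeline parameters in SI units (illustration only).
    params = dict(f=0.01, G_st=10.0, pi_st=5.0e6, d=0.5,
                  S=0.19635, ZRT=1.3e5, dt=60.0, L=1.0e4)
    A = transfer_matrix(**params)
    M = coefficient_matrix(**params)
    # A should act as the inverse of M, so that Y = A X solves M Y = X.
    print(np.allclose(A @ M, np.eye(2), atol=1e-9))
```

Writing the check this way makes the role of $A_{0}$ visible: up to a positive factor it is the determinant of the coefficient matrix, so the transfer matrix is undefined when $A_{0}=0$.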
The discretization of the PDEs and their transformation into the form of transfer matrix clarify the parameters and physical quantities that need to be estimated in natural gas networks, and the data-driven state estimation problem is summarized as:
Input: variable sets $\left \{{X^{t}}\right \}$ and $\left \{{Y^{t}}\right \}$ constructed from historical measurement data.
Output: the accurate parameters $\hat {A}$ of the state matrix and the real values covered by the measurement errors in the measurement data.
Methodology
In the natural gas network, historical data such as nodal pressure and gas mass flow are obtained through direct measurement. In practice, however, measurement errors inevitably occur during the measurement process, and measured values containing errors mask the characteristics of the real data. To reduce the impact of measurement errors on state estimation, a data-driven model that accounts for measurement errors is needed to estimate the state of natural gas systems accurately.
A. Model with Measurement Noise
In the state estimation, the nodal pressure and the gas mass flow are taken as direct measurements, each consisting of a real value and a measurement error. We assume that the measurement errors are independent and Gaussian distributed, so the measurements can be written as: \begin{align*} \Delta \pi _{j,L}^{t} &=\Delta \tilde {\pi }_{j,L}^{t} +\varepsilon _{\Delta \pi _{j,L}^{t}} \tag{11a}\\ \Delta \dot {G}_{j,L}^{t} &=\Delta \tilde {{\dot {G}}}_{j,L}^{t} +\varepsilon _{\Delta \dot {G}_{j,L}^{t}} \tag{11b}\\ \Delta \pi _{i,0}^{t}& =\Delta \tilde {\pi }_{i,0}^{t} +\varepsilon _{\Delta \pi _{i,0}^{t}} \tag{11c}\\ \Delta \dot {G}_{i,0}^{t} &=\Delta \tilde {{\dot {G}}}_{i,0}^{t} +\varepsilon _{\Delta \dot {G}_{i,0}^{t}} \tag{11d}\end{align*}
\begin{align*} Y&=\tilde {Y}+\varepsilon _{Y} \tag{12a}\\ \varepsilon _{Y} &\sim N\left ({{0,\Sigma _{Y}} }\right) \tag{12b}\end{align*}
\begin{align*} X&=\tilde {X}+\varepsilon _{X} \tag{13a}\\ \varepsilon _{X} &\sim N\left ({{0,\Sigma _{X}} }\right) \tag{13b}\end{align*}
B. Maximum Likelihood Estimation
To seek the optimum model parameters, the estimation problem with input and output noise is formulated as a maximum likelihood estimation problem, and the relationship among the indirect measurements \begin{equation*} \hat {A}=\mathop {\arg {\max }}\limits _{A} \sum _{t=1}^{N} {\log P\left ({{X^{t},Y^{t}\left |{ A }\right.} }\right)} \tag{14}\end{equation*}
\begin{align*} L&=\log P\left ({\left \{{{X^{t}} }\right \},\left \{{{Y^{t}} }\right \}\left |{ {\left \{{{\tilde {X}^{t}} }\right \},\left \{{{\tilde {Y}^{t}} }\right \}} }\right. }\right) \\ &=\sum \limits _{t=1}^{N} {\log P\left ({{X^{t},Y^{t}\left |{ {\tilde {X}^{t},\tilde {Y}^{t}} }\right.} }\right)} \tag{15a}\\ &\text {subject to}: \\ &\hspace {-0.1pc} \log P\left ({{X^{t},Y^{t}\left |{ {\tilde {X}^{t},\tilde {Y}^{t}} }\right.} }\right) \\ &=\log P\left ({{X^{t}\left |{ {\tilde {X}^{t}} }\right.} }\right)+\log P\left ({{Y^{t}\left |{ {\tilde {Y}^{t}} }\right.} }\right) \\ &=-\frac {1}{2}\left ({{X^{t}-\tilde {X}^{t}} }\right)^{\mathrm{ T}}\Sigma _{X}^{-1} \left ({{X^{t}-\tilde {X}^{t}} }\right) \\ &-\frac {1}{2}\left ({{Y^{t}-\tilde {Y}^{t}} }\right)^{\mathrm{ T}}\Sigma _{Y}^{-1} \left ({{Y^{t}-\tilde {Y}^{t}} }\right) \\ &+\log \det \left ({{2\pi \cdot \Sigma _{X}} }\right)^{-\frac {1}{2}}+\log \det \left ({{2\pi \cdot \Sigma _{Y}} }\right)^{-\frac {1}{2}} \tag{15b}\\ & X^{t}=\tilde {X}^{t}+\varepsilon _{X^{t}} \tag{15c}\\ &Y^{t}=\tilde {Y}^{t}+\varepsilon _{Y^{t}} \tag{15d}\\ & \tilde {Y}^{t}=A\cdot \tilde {X}^{t} \tag{15e}\end{align*}
\begin{align*} \left ({{\tilde {X}^{t},\tilde {Y}^{t}} }\right)&=\mathop {\arg {\max }}\limits _{\hat {X},\hat {Y}} \log P\left ({{X^{t},Y^{t}\left |{ {\hat {X}^{t},\hat {Y}^{t}} }\right.} }\right)\tag{16a}\\ \text {subject to:}\quad \hat {Y}^{t}&=A\cdot \hat {X}^{t}\tag{16b}\end{align*}
\begin{equation*} P\left ({{X^{t},Y^{t}\left |{ A }\right.} }\right)=P\left ({{X^{t},Y^{t}\left |{ {\tilde {X}^{t},\tilde {Y}^{t}} }\right.} }\right) \tag{17}\end{equation*}
\begin{equation*} \max \limits _{\hat {X}^{t},\hat {Y}^{t},A} \sum _{t=1}^{N} {\log P\left ({{X^{t},Y^{t}\left |{ {\hat {X}^{t},\hat {Y}^{t}} }\right.} }\right)} \tag{18}\end{equation*}
C. Conversion of Maximum Likelihood Estimation into Weighted Low-Rank Approximation
From subsection III-B, it can be deduced that the optimal estimation parameters can be obtained by searching for the best estimates of the real values. Nevertheless, with the available information, the log probability density in (16) is hard to calculate directly. Consequently, the estimation problem needs to be transformed into a solvable weighted low-rank approximation problem. This transformation compensates for the fact that the estimation results cannot be calculated directly in practice. With the input state variables defined explicitly in (9b) and (9c), the measurement errors of the corresponding input state variables are stated as: \begin{align*} \varepsilon _{b_{i,j}} &=b_{i,j} -\tilde {b}_{i,j} \\ & =\frac {\Delta \pi _{j,L}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \dot {G}_{i,0}^{t}}{S\cdot L_{i,j}}-\left ({{\frac {\Delta \tilde {\pi }_{j,L}^{t-1}}{\Delta t}+\frac {ZRT\cdot \Delta \tilde {{\dot {G}}}_{i,0}^{t}}{S\cdot L_{i,j}}} }\right) \\ &=f_{1} \left ({{\varepsilon _{\Delta \pi _{j,L}^{t-1}},\varepsilon _{\Delta \dot {G}_{i,0}^{t}};\Delta \pi _{j,L}^{t-1},\Delta \dot {G}_{i,0}^{t}} }\right) \tag{19}\\ \varepsilon _{c_{i,j}} &=c_{i,j} -\tilde {c}_{i,j} \\ &=\frac {\Delta \pi _{i,0}^{t}}{L_{i,j}}-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \dot {G}_{i,0}^{t}}{4d\cdot S^{2}\cdot \pi _{st}} \\ &-\left ({{\frac {\Delta \tilde {\pi }_{i,0}^{t}}{L_{i,j}}-\frac {f\cdot ZRT\cdot \dot {G}_{st} \cdot \Delta \tilde {{\dot {G}}}_{i,0}^{t}}{4d\cdot S^{2}\cdot \pi _{st}}} }\right) \\ & =f_{2} \left ({{\varepsilon _{\Delta \pi _{i,0}^{t}},\varepsilon _{\Delta \dot {G}_{i,0}^{t}};\Delta \pi _{i,0}^{t},\Delta \dot {G}_{i,0}^{t}} }\right) \tag{20}\end{align*}
\begin{align*} &\min \limits _{\hat {X}^{t},\hat {Y}^{t},A} \sum _{t=1}^{N} {\left \|{ {\left [{ {X^{t},Y^{t}} }\right]-\left [{ {\hat {X}^{t},\hat {Y}^{t}} }\right]} }\right \|}_{\Sigma ^{-1}}^{2} \tag{21a}\\ &\text {subject to:}\quad \hat {Y}^{t}=A\cdot \hat {X}^{t}, \tag{21b}\\ &\left \|{ {\left [{ {X^{t},Y^{t}} }\right]-\left [{ {\hat {X}^{t},\hat {Y}^{t}} }\right]} }\right \|_{\Sigma ^{-1}}^{2} =\left ({{X^{t}-\hat {X}^{t}} }\right)^{\mathrm{ T}}\Sigma _{X}^{-1} \left ({{X^{t}-\hat {X}^{t}} }\right) \\ &+\left ({{Y^{t}-\hat {Y}^{t}} }\right)^{\mathrm{ T}}\Sigma _{Y}^{-1} \left ({{Y^{t}-\hat {Y}^{t}} }\right) \tag{21c}\\ &\Sigma =\left [{ {{\begin{array}{cccccccccccccccccccc} {\sigma _{\varepsilon _{X^{1}}}^{2}} & & & & \\ & {\sigma _{\varepsilon _{Y^{1}}}^{2}} & & & \\ & & \ddots & & \\ & & & {\sigma _{\varepsilon _{X^{N}}}^{2}} & \\ & & & & {\sigma _{\varepsilon _{Y^{N}}}^{2}} \\ \end{array}}} }\right] \tag{21d}\end{align*}
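The link between the likelihood in (15b) and the weighted objective in (21c) can be illustrated numerically. The sketch below evaluates both for a single sample and shows that the log-likelihood equals $-\frac{1}{2}$ times the weighted squared distance plus a term that does not depend on the candidate values, so maximizing one is the same as minimizing the other. The function names and the use of NumPy are assumptions made for illustration; the covariance matrices are passed in explicitly and are taken to be symmetric positive definite.

```python
import numpy as np


def weighted_sq_distance(X, Y, X_hat, Y_hat, Sigma_X, Sigma_Y):
    """Eq. (21c) for one sample: covariance-weighted squared residuals."""
    rx = X - X_hat
    ry = Y - Y_hat
    # solve(Sigma, r) computes Sigma^{-1} r without forming the inverse.
    return float(rx @ np.linalg.solve(Sigma_X, rx)
                 + ry @ np.linalg.solve(Sigma_Y, ry))


def log_likelihood(X, Y, X_hat, Y_hat, Sigma_X, Sigma_Y):
    """Eq. (15b) for one sample with independent Gaussian errors."""
    # log det(2*pi*Sigma)^(-1/2) = -0.5 * log det(2*pi*Sigma)
    const = -0.5 * (np.linalg.slogdet(2.0 * np.pi * Sigma_X)[1]
                    + np.linalg.slogdet(2.0 * np.pi * Sigma_Y)[1])
    dist = weighted_sq_distance(X, Y, X_hat, Y_hat, Sigma_X, Sigma_Y)
    return -0.5 * dist + const
```

Because the constant term depends only on $\Sigma_{X}$ and $\Sigma_{Y}$, two candidate pairs $(\hat{X}^{t}, \hat{Y}^{t})$ are ranked identically by the two functions, with the order reversed.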
D. Solution of the Weighted Low-Rank Approximation Problem
For the weighted low-rank approximation problem, most existing methods are based on eigenvalue extraction and alternating projection. These methods have high computational complexity and are apt to fall into local optima. Hence, the Naive Riemannian Stochastic Descent method [20] is used in this paper to approximate the noisy low-rank matrix. It is an iterative optimization method based on a retraction onto the low-rank matrix manifold, which maps from the vector space to the manifold. The mapping process both reduces the computational complexity and greatly shrinks the search space. The global optimal solution lies on a submanifold of the Riemannian manifold. Each iteration requires the gradient of the objective function on the manifold. The search then proceeds by repeatedly updating the gradient step size and projecting the gradient back onto the manifold of low-rank matrices, and the retracted gradient flow operator is employed to seek the optimum solution. In other words, the search direction is taken as a tangent vector of the manifold, and the next iteration is carried out according to the tangent mapping.
Substituting the extended matrix composed of \begin{equation*} \left [{ {X^{t},Y^{t}} }\right]=Z^{t} \tag{22}\end{equation*}
\begin{align*} &\min \limits _{rank(Z)=r} V\left ({Z }\right) \tag{23a}\\ &V\left ({Z }\right)=\left \|{ {\hat {Z}-Z} }\right \|_{\Sigma ^{-1}}^{2} \tag{23b}\end{align*}
After the weighted low-rank approximation term \begin{align*} \nabla &=\frac {\partial V\left ({Z }\right)}{\partial Z}=\frac {\partial \text {tr}\left [{ {\left ({{\hat {Z}-Z} }\right)^{\mathrm{ T}}\Sigma ^{-1}\left ({{\hat {Z}-Z} }\right)} }\right]}{\partial Z} \\ &=-2\Sigma ^{-1}\left ({{\hat {Z}-Z} }\right) \tag{24}\end{align*}
\begin{equation*} G=-\alpha \cdot \nabla, \alpha >0 \tag{25}\end{equation*}
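The gradient in (24) and the descent direction in (25) can be checked numerically. The sketch below is a minimal illustration in which `W` stands in for $\Sigma^{-1}$ and is assumed to be symmetric with a row dimension matching that of $Z$; the function names and this matrix layout are assumptions, since the construction of the extended matrix is not fully specified here.

```python
import numpy as np


def objective(Z_hat, Z, W):
    """V = tr[(Z_hat - Z)^T W (Z_hat - Z)], with W in the role of Sigma^{-1}."""
    E = Z_hat - Z
    return float(np.trace(E.T @ W @ E))


def gradient_wrt_Z(Z_hat, Z, W):
    """Eq. (24): dV/dZ = -2 W (Z_hat - Z), valid when W is symmetric."""
    return -2.0 * W @ (Z_hat - Z)


def descent_direction(Z_hat, Z, W, alpha):
    """Eq. (25): G = -alpha * gradient, with step size alpha > 0."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return -alpha * gradient_wrt_Z(Z_hat, Z, W)
```

Moving $Z$ by a sufficiently small multiple of $G$ reduces $V$, which is the property the iteration relies on before the step is mapped back to the rank-$r$ manifold.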
Afterward, we take the product of two non-unique full rank matrices \begin{equation*} \hat {Z}^{\left [{ 0 }\right]}=Z_{1} Z_{2}^{\mathrm T}, Z_{1} \in { \mathbb {R}}^{m\times r}, Z_{2} \in { \mathbb {R}}^{n\times r} \tag{26}\end{equation*}
Although the choice of the initial matrix is not unique, the retracting gradient flow operator can approach the target value along the correct direction whenever the selected matrix meets the rank constraint, guided by iteration rules such as the orthogonal complement space of the matrix and the mapping of the matrix along the tangent direction. Hence, there is no specific requirement on the choice of initial values. Then, the pseudo-inverse matrices $Z_{1}^{+}$ and $Z_{2}^{+}$ are computed as: \begin{align*} Z_{1}^{+} &=\left ({{Z_{1}^{\mathrm T} Z_{1}} }\right)^{-1}Z_{1}^{\mathrm T}, Z_{1}^{+} \in { \mathbb {R}}^{r\times m} \tag{27}\\ Z_{2}^{+} &=\left ({{Z_{2}^{\mathrm T} Z_{2}} }\right)^{-1}Z_{2}^{\mathrm T}, Z_{2}^{+} \in { \mathbb {R}}^{r\times n} \tag{28}\end{align*}
The orthogonal complements of \begin{align*} B&=Z_{1}^{+} GZ_{2}^{+{\mathrm T}}, B\in { \mathbb {R}}^{r\times r} \tag{29}\\ B_{1} &=Z_{2,\bot }^{\mathrm T} G^{\mathrm T}Z_{1}^{+{\mathrm T}}, B_{1} \in { \mathbb {R}}^{(n-r)\times r} \tag{30}\\ B_{2} &=Z_{1,\bot }^{\mathrm T} GZ_{2}^{+{\mathrm T}}, B_{2} \in { \mathbb {R}}^{(m-r)\times r} \tag{31}\end{align*}
According to the orthogonal space matrix, the mappings \begin{align*} C_{1} &=Z_{1} \left ({{I_{r} +\frac {1}{2}B-\frac {1}{8}B^{2}} }\right)+Z_{1,\bot } B_{2} \left ({{I_{r} -\frac {1}{2}B} }\right), \\ C_{1} &\in { \mathbb {R}}^{m\times r} \tag{32}\\ C_{2} &=Z_{2} \left ({{I_{r} +\frac {1}{2}B^{\mathrm T}-\frac {1}{8}\left ({{B^{\mathrm T}} }\right)^{2}} }\right)+Z_{2,\bot } B_{1} \left ({{I_{r} -\frac {1}{2}B^{\mathrm T}} }\right), \\ C_{2} &\in { \mathbb {R}}^{n\times r} \tag{33}\end{align*}
\begin{equation*} \hat {Z}^{\left [{ k }\right]}=C_{1} C_{2}^{\mathrm T} \tag{34}\end{equation*}
\begin{equation*} \frac {\left \|{ {\hat {Z}^{\left [{ {k+1} }\right]}-\hat {Z}^{\left [{ k }\right]}} }\right \|_{F}}{\left \|{ {\hat {Z}^{\left [{ k }\right]}} }\right \|_{F} }< \varepsilon, \varepsilon >0 \tag{35}\end{equation*}
By means of iteration calculation, the optimum estimation of the real value
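The update in (27)-(35) can be sketched in a few lines of NumPy. This is an illustrative reading of the equations rather than the authors' implementation: the direction `G` is supplied as an $m\times n$ matrix, and the orthogonal complements $Z_{1,\bot}$ and $Z_{2,\bot}$ are obtained from a complete QR factorization, which is an assumption because the text does not state how they are computed. All function names are hypothetical.

```python
import numpy as np


def orth_complement(Zf):
    """Orthonormal basis of the complement of span(Zf); Zf has full column rank."""
    r = Zf.shape[1]
    Q, _ = np.linalg.qr(Zf, mode="complete")
    return Q[:, r:]  # the first r columns of Q span the column space of Zf


def retraction_step(Z1, Z2, G):
    """One factor update following Eqs. (27)-(33); returns (C1, C2)."""
    r = Z1.shape[1]
    I = np.eye(r)
    Z1p = np.linalg.solve(Z1.T @ Z1, Z1.T)  # Eq. (27)
    Z2p = np.linalg.solve(Z2.T @ Z2, Z2.T)  # Eq. (28)
    Z1o = orth_complement(Z1)
    Z2o = orth_complement(Z2)
    B = Z1p @ G @ Z2p.T                     # Eq. (29)
    B1 = Z2o.T @ G.T @ Z1p.T                # Eq. (30)
    B2 = Z1o.T @ G @ Z2p.T                  # Eq. (31)
    Bt = B.T
    C1 = Z1 @ (I + 0.5 * B - 0.125 * B @ B) + Z1o @ B2 @ (I - 0.5 * B)       # Eq. (32)
    C2 = Z2 @ (I + 0.5 * Bt - 0.125 * Bt @ Bt) + Z2o @ B1 @ (I - 0.5 * Bt)   # Eq. (33)
    return C1, C2


def has_converged(Z_new, Z_old, eps):
    """Relative-change stopping rule of Eq. (35)."""
    change = np.linalg.norm(Z_new - Z_old, "fro")
    return bool(change / np.linalg.norm(Z_old, "fro") < eps)
```

The new iterate is `C1 @ C2.T` as in (34), so its rank never exceeds $r$ by construction. For a small step, expanding the product shows that it agrees to first order with the previous iterate plus the projection of `G` onto the tangent space, which is consistent with the description of searching along the tangent direction.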
Experimental Results
In this section, the 10-node natural gas network is selected to verify the validity and feasibility of our method. The estimation results under different noise levels are observed by adding Gaussian noise of different levels to the measurements. Moreover, the convergence process over different numbers of iterations is provided. Thereafter, the data-driven method is compared with Newton’s method and with simulation results from the PDEs.
The structure of the 10-node natural gas network is shown in Figure 2. Two nodes in the natural gas network are connected to the power system. Node 1 is the source node; both its natural gas injection and its pressure are constant. Nodes 5, 6, 9, and 10 are sink nodes: nodes 5 and 10 are connected to fixed natural gas loads, and nodes 6 and 9 are connected to Gas-fired Generator 2 and Gas-fired Generator 1, respectively. The electric energy generated by the gas-fired generators is fed into the power system on the right.
The parameters of the pipelines, the nodes, and the natural gas network are shown in Tables 1, 2, and 3, respectively. Among them, Table 1 provides the values of standard operating parameters
A. Parameter Estimation with Different Noise Levels
To verify the effectiveness of the new model under different noise levels, measurement noise is generated according to a specified signal-to-noise ratio (SNR), and the noisy measurement values are used in the experiments. We assume that the measurement noise is Gaussian distributed with zero mean, i.e., \begin{equation*} MSE=\frac {1}{mn}\left \|{ {A-\hat {A}} }\right \|_{F}^{2} \tag{36}\end{equation*}
As seen in Figure 3, the errors of the estimated parameter matrix gradually decrease as the number of iterations increases.
After about 30 iterations, the estimation errors approach their asymptotic values, and the accuracy decreases as the noise level increases. At 50 dB, which corresponds to the highest SNR and hence the weakest noise, the estimation errors are the smallest and convergence is the fastest. The estimation errors at 40 dB are larger than those at 45 dB and 50 dB, but convergence is relatively stable at all noise levels. Moreover, only a small number of iterations is needed to keep the errors at a low level.
B. Convergence over Different Numbers of Iterations
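The noise generation and the error measure in (36) can be sketched as follows. The SNR convention used here, $10\log_{10}$ of the ratio of mean signal power to noise variance, is a common one and is an assumption, since the exact definition is not given in the text; the function names are hypothetical.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng):
    """Add zero-mean Gaussian noise whose variance gives the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal**2)                    # mean signal power
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))   # noise variance for this SNR
    return signal + rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)


def parameter_mse(A_true, A_est):
    """Eq. (36): squared Frobenius error divided by the number of entries m*n."""
    A_true = np.asarray(A_true, dtype=float)
    A_est = np.asarray(A_est, dtype=float)
    m, n = A_true.shape
    return float(np.linalg.norm(A_true - A_est, "fro") ** 2 / (m * n))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.sin(np.linspace(0.0, 20.0, 1000))
    for snr_db in (40.0, 45.0, 50.0):
        noisy = add_noise_at_snr(clean, snr_db, rng)
        print(snr_db, np.std(noisy - clean))
```

Under this convention a larger dB value means weaker noise, which matches the observation that the 50 dB case yields the smallest estimation errors.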
In addition to the estimation errors, the convergence of iterations is discussed as well. The convergence of the proposed method is evaluated via the calculation of relative errors for the new matrix after each iteration under different noise levels. More specifically, during the calculation, the relative error
In Figure 4, we can intuitively see that during the process of continuous iteration, the relative errors of
C. True Values Estimation with Measurement Noise
To further validate the effectiveness of the proposed method, the data-driven method is compared with Newton’s method on the 10-node natural gas network in Figure 2 by observing the response of the nodal pressures at the 50 dB noise level. In this paper, MATLAB is used to simulate the dynamic pressure process in the 10-node natural gas network, and the performance of the proposed state estimation method is illustrated on the basis of the simulation results. MATLAB/Simulink can effectively simulate and analyze the operation of such systems. In [25], MATLAB was used to simulate hybrid electricity-gas systems in order to analyze the role of micro-turbines in these systems. In [26], the dynamic behavior of a gas pipe network was simulated in MATLAB/Simulink and the simulation results were compared with the original model, which showed that the MATLAB tool is reliable. In the 10-node natural gas network, when the natural gas mass flow demand of the gas-fired generators changes, the pressures at nodes 6 and 9, which are connected to the gas-fired generators, change with the fluctuation of the natural gas mass flow. Figure 5 gives the estimated pressure changes at these two nodes. When the gas demand of Gas-fired Generator 2 increases by 30%, the pressure at node 6 drops by 15.4 kPa: the pressure first drops sharply, then gradually levels off, and finally reaches the steady state, as shown in Figure 5 (a). The data-driven method gives an estimate closer to the simulation result than Newton’s method does.
Figure 5. Comparison of estimated and simulated results: (a) pressure at node 6, (b) pressure at node 9.
Additionally, even if the pressure change is continuous, the data-driven method is able to provide accurate estimation results. As seen in Figure 2, when the natural gas demand of Gas-fired Generator 1 changes continuously, the pressure at node 9 changes with the fluctuation of natural gas, so the pressure change is also continuous. The pressure at node 9 is estimated, and the results are given in Figure 5 (b). There is an obvious deviation between Newton’s method and the simulation result from the PDEs. The data-driven method avoids this large deviation and accurately tracks the pressure trend at the node, in stark contrast to Newton’s method. Table 4 provides the RMSE and MAPE of the data-driven method and Newton’s method. The RMSE and MAPE of the data-driven method are lower than those of Newton’s method: the RMSE of the data-driven method is 0.2268 and its MAPE is 1.63%.
It turns out that the data-driven method can accurately capture the characteristics of the data whether the nodal pressure changes in a single step or continuously. In the data-driven approach, the orthogonal complement space of the matrix and the mapping of the matrix along the tangent direction together constitute the iteration descent condition, which constrains the iteration process in several respects and thereby ensures stability and accuracy during the descent. It is worth noting that the noise comes from measurement error in the actual environment and is not induced by changes in the internal operation mode of the system.
Figure 6 shows the relationship between the initial errors before iteration and the final errors after iteration. Although different choices of initial values result in different initial errors, the final errors after iteration are almost the same at a given noise level. This indicates that the initial error has little influence on the final error, which is mainly affected by the noise level. The main reason is the relatively stable iterative process of the Naive Riemannian Stochastic Descent method, which searches by repeatedly updating the gradient step size and projecting the gradient back onto the manifold of low-rank matrices, and uses the retracted gradient flow operator to find the optimum. In contrast to the model in this paper, Newton’s method is sensitive to the selection of initial values: if the initial values are chosen poorly, it is apt to fall into a local optimum. Furthermore, Newton’s method works in a single space, depends only on the descent direction to find the optimum, and lacks auxiliary conditions to plan and guide the descent path; consequently, the errors generated after each iteration are larger than those of the data-driven method. Moreover, as the number of iterations increases, the errors may accumulate and propagate, so the estimation errors of Newton’s method are somewhat large. Table 5 compares the execution time, advantages, and disadvantages of the data-driven method and Newton’s method. The execution times of the two methods are basically the same. The data-driven method does not require filtering or denoising and retains the information of the real data, and the selection of the initial value has a relatively small impact on its results.
Newton’s method needs to preprocess the data to reduce noise, and the selection of the initial value affects the results after iteration.
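For reference, the two error measures reported in Table 4 can be computed as in the sketch below. It uses the usual textbook definitions of RMSE and MAPE with the simulated pressure as the reference; the exact normalization and units used for Table 4 are not stated in the text, so this is an assumption, and the function names are hypothetical.

```python
import numpy as np


def rmse(reference, estimate):
    """Root mean square error between estimated and reference trajectories."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))


def mape(reference, estimate):
    """Mean absolute percentage error in percent; reference values must be nonzero."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if np.any(reference == 0.0):
        raise ValueError("MAPE is undefined when a reference value is zero")
    return float(100.0 * np.mean(np.abs((estimate - reference) / reference)))
```

RMSE carries the units of the pressure signal, whereas MAPE is relative, so the two measures together describe both the absolute and the proportional deviation from the simulated trajectory.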
Figure 6. The relationship between the errors before iteration and the errors after iteration.
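To make the projection-and-retraction idea discussed above more concrete, the following is a minimal Python sketch of a generic gradient descent on the manifold of fixed-rank matrices for a weighted low-rank approximation objective. It is not the implementation used in this paper: the function names, the objective 0.5·Σ W_ij (X_ij − D_ij)², the fixed step size, and the use of a truncated SVD as the retraction are all assumptions chosen for illustration.

```python
import numpy as np


def weighted_loss(X, D, W):
    """Assumed objective: 0.5 * sum_ij W_ij * (X_ij - D_ij)^2."""
    return 0.5 * np.sum(W * (X - D) ** 2)


def retract(X, rank):
    """Map X back onto the rank-`rank` manifold via a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank, :]
    return (U * s) @ Vt, U, Vt


def project_to_tangent(U, Vt, G):
    """Project a Euclidean gradient G onto the tangent space at X = U S Vt."""
    PU = U @ U.T      # projector onto the column space of X
    PV = Vt.T @ Vt    # projector onto the row space of X
    return PU @ G + G @ PV - PU @ G @ PV


def riemannian_descent(D, W, rank, step=0.5, iters=200, X0=None):
    """Illustrative fixed-step descent; D is the noisy data, W the weights."""
    X, U, Vt = retract(D if X0 is None else X0, rank)
    for _ in range(iters):
        G = W * (X - D)                      # Euclidean gradient of the loss
        xi = project_to_tangent(U, Vt, G)    # Riemannian gradient
        X, U, Vt = retract(X - step * xi, rank)
    return X
```

Calling `riemannian_descent(D, W, rank=2)` on a noisy data matrix `D` with entrywise weights `W` returns a rank-2 estimate. Passing different `X0` values is one simple way to probe how strongly the final error depends on the initial value in such a scheme.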
D. Parameter Estimation with Different Numbers of Training Samples
Although the noise level directly affects the accuracy of the estimation results, the number of training samples also affects them to a certain extent. To examine the relationship between the number of training samples and the estimation results, experiments are conducted at a noise level of 50 dB with different numbers of training samples, and the statistical results of 50 independent repeated experiments are reported in Figure 7.
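The experimental protocol described above can be sketched as follows. This is only an illustration under stated assumptions: the 50 dB noise level is interpreted as a signal-to-noise ratio with zero-mean Gaussian noise, and a simple sample-mean estimator stands in for the actual state estimator. The names `add_noise_at_snr` and `run_trials` and the example pressure values are hypothetical.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that the SNR is about `snr_db` dB."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def run_trials(true_state, sample_counts, snr_db=50.0, trials=50, seed=0):
    """Return {n: array of `trials` RMSE values} for each sample count n."""
    rng = np.random.default_rng(seed)
    true_state = np.asarray(true_state, dtype=float)
    errors = {}
    for n in sample_counts:
        errs = np.empty(trials)
        for t in range(trials):
            clean = np.tile(true_state, (n, 1))       # n noise-free snapshots
            noisy = add_noise_at_snr(clean, snr_db, rng)
            estimate = noisy.mean(axis=0)             # stand-in estimator (assumption)
            errs[t] = np.sqrt(np.mean((estimate - true_state) ** 2))
        errors[n] = errs
    return errors
```

For example, `run_trials([5.0, 4.8, 4.5], [10, 100, 1000])` yields 50 error values per sample count, which can then be summarized statistically. With this stand-in estimator the errors shrink as the number of samples grows, simply because averaging reduces the noise.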
It can be observed that the spread between the upper and lower limits of the errors of the proposed data-driven method is small, and that the errors gradually decrease as the number of training samples increases. Although fluctuations occur owing to the randomness of the noise, the overall error level exhibits a downward trend, and the quartiles stay close to the median, so the error range narrows steadily.
Compared with the proposed model, the error range of Newton’s method is somewhat larger, there are more outliers, and the performance is unstable. The overall errors decrease as the number of samples increases, but regardless of whether the number of training samples is large or small, the errors of Newton’s method remain larger than those of the data-driven method. Moreover, the deviation between the upper and lower limits for Newton’s method is too large to provide reliable performance. Therefore, the proposed model outperforms Newton’s method.
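The quantities compared in this subsection (median, quartiles, upper and lower limits, and outliers) correspond to the usual box-plot summary of repeated-trial errors. The sketch below shows one common way to compute them, assuming the conventional 1.5 × IQR rule for whiskers and outliers; the text does not state which rule Figure 7 uses, and the function name `box_stats` is hypothetical.

```python
import numpy as np


def box_stats(errors):
    """Summarize a set of trial errors with box-plot statistics."""
    errors = np.asarray(errors, dtype=float)
    q1, median, q3 = np.percentile(errors, [25, 50, 75])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr     # assumed 1.5 * IQR convention
    high_fence = q3 + 1.5 * iqr
    is_inside = (errors >= low_fence) & (errors <= high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_limit": errors[is_inside].min(),  # end of lower whisker
        "upper_limit": errors[is_inside].max(),  # end of upper whisker
        "outliers": errors[~is_inside],
    }
```

A narrow gap between `lower_limit` and `upper_limit` together with an empty `outliers` array corresponds to the stable behavior reported for the data-driven method, while a wide gap and many outliers correspond to the unstable behavior reported for Newton’s method.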
Conclusion
As the application scope of natural gas continues to expand, effectively and accurately grasping the operation state of the natural gas system is a crucial part of natural gas system operation planning. Taking the gas mass flow characteristics of the natural gas system into account, a data-driven method is developed to address the problem of state estimation for natural gas networks with measurement noise. By combining maximum likelihood estimation and weighted low-rank approximation, the state estimation problem is converted into a weighted low-rank approximation optimization problem. Throughout the state estimation process, no filtering, denoising, or special processing of noise is needed, which retains the complete information of the real data. The Naive Riemannian Stochastic Descent method is employed to solve the problem. The search space is shrunk to an orthogonal complement space, and the search direction is determined as a mapping along the tangent direction; as a result, not only is the search range reduced, but the selection of the initial value also has little influence on the iteration results. The state estimation parameters can be updated in time according to changes in the network structure. The performance is verified on a 10-node natural gas network. The experimental results indicate that the data-driven model provides accurate and reliable state estimation results under different noise levels, which validates the ability of the proposed model to depict the system state effectively while being less affected by measurement errors. The data-driven method is superior to Newton’s method, achieving an RMSE of 0.2268 and a MAPE of 1.63%.
In future work, distributed computing methods for large-scale natural gas networks will be explored in detail. Moreover, the impact of uncertain data (e.g., interval data and incomplete data) on state estimation is worthy of further research.