An Improved Unscented Particle Filter Method for Remaining Useful Life Prognostic of Lithium-ion Batteries With Li(NiMnCo)O2 Cathode With Capacity Diving


The schematic diagram of the proposed method with mainly improved parts colored in orange.

Abstract:

An improved method for remaining useful life (RUL) prognostics of Lithium-ion batteries with Li(NiMnCo)O2 cathodes, based on an improved unscented particle filter (UPF), is proposed to address capacity diving in the later part of the capacity degradation curve. The key points of this paper are: (1) As the main contribution, an empirical model suited to this situation is put forward as an alternative to the models widely used with UPF, and its prediction performance is verified by both least-squares fitting and the improved UPF; (2) Systematic noise following a Gamma distribution is introduced in the state space equations of the proposed method, so as to avoid the shape shifting of the prediction curve that can occur when particles are sampled with Gaussian noise and model parameters cross zero; (3) The training data are preprocessed with a concise treatment of the capacity recovery phenomenon, which further reduces the residual error and root mean square error of fitting; this supplements traditional treatments such as smoothing and, by enhancing data quality, relieves the sensitivity of data-driven methods to the data. The proposed method is validated on battery data from cycle aging tests under different working conditions, where improved approximation and prediction performance are obtained.
Published in: IEEE Access ( Volume: 8)
Page(s): 58717 - 58729
Date of Publication: 04 March 2020
Electronic ISSN: 2169-3536

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION I.

Introduction

Lithium-ion batteries are widely used because of their excellent performance in many respects, and their safety and reliability problems have attracted considerable attention [1]. Predicting the remaining useful life (RUL) of Lithium-ion batteries plays an important role in state of health (SOH) estimation, giving decision makers evidence with which to avoid malfunctions, run the system appropriately, and reduce losses [2].

A. Review of the Approaches

As reviewed in [1], there are two main concepts for predicting the RUL of Lithium-ion batteries. The first, physics-of-failure methods, uses observations from a past period of cycles and relies critically on the absolute accuracy of the lifetime model and the measurements [3], [4]; it includes no recalibration based on present SOH features. Moreover, because the sophisticated electrochemical behavior, material properties, construction, and fading mechanisms of Lithium-ion batteries cannot yet be described completely and precisely [4], the second concept, data-driven methods, is preferable. These methods analyze characteristics that reflect SOH, such as capacity and impedance, and track their degradation until they reach the threshold value defined as the end of life (EOL) point.

Data-driven methods offer frameworks that are simple and efficient to implement for aging prediction of Lithium-ion batteries. Many case studies report good prediction performance, typically using fitting [6], [20], machine learning (Neural Network (NN) [7], Support Vector Machine (SVM) [8], and Relevance Vector Machine (RVM) [9]), signal processing [10], [11], and statistical regression based on probabilistic methods (Autoregressive Moving Average [10], [12], Grey Model (GM) [13], [14], and process regressions [15]–[17], [27]). In particular, probabilistic analysis methods, especially those based on Bayesian theory, give researchers informative results in an analytical probabilistic form when the problem involves uncertainty and nonlinearity. Because the outcome is expressed as expectations and confidence intervals, it conveys how reliable a prediction is and helps in making sound decisions. Combining the abovementioned methods to improve algorithms has also become a clear trend.

The accuracy of the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), and of sequential Bayesian Monte Carlo methods such as several particle filters, has been evaluated theoretically using the Kullback–Leibler and $\chi ^{2}$ information measures, respectively [18]. For RUL prediction with non-linear capacity degradation, the unscented particle filter (UPF), a sequential Monte Carlo method that combines UKF and the particle filter (PF), stands out for achieving the best approximation among these methods, at the expense of computational cost.

The main idea of PF is to use random particles and their weights to approximate the posterior probability distribution of the system states. This is realized through sequential importance sampling, which can suffer from sample degeneracy and particle impoverishment [29], [31]. In a comparative study of algorithm performance for RUL prediction, Saha et al. [12] showed that PF is more accurate than ARIMA, EKF, RVM, and SVM. Miao et al. [19] verified the practical performance of UPF in RUL prediction against PF in many case studies, and the vast majority of the results show UPF to be the best choice in this category. Besides using UKF as the importance function in PF to make up for these demerits, other approaches have been proposed: Wang et al. [31] used a spherical cubature particle filter to provide the importance function for battery prognostics, in which the sigma points are determined in a way similar to UPF, and showed better RUL prediction performance than PF.

Since the UPF implementation updates the model parameters from the previous moment only, good robustness is theoretically guaranteed in the regression process. As a consequence, the first few training data points, whose quality is unstable because of the activation effect of the batteries, have a rapidly decreasing influence as the UPF iteration proceeds. In other words, when capacity diving occurs and the training data are divided into sections with different systematic noise for RUL prediction, poor-quality training information from very early periods gradually vanishes in later UPF loops, leading to superior performance of local iteration [19]. Considering in addition the computational cost and the amount of training information available, UPF rather than artificial intelligence methods is chosen in this paper to deal with the problem.

A dynamic state space system model used in UPF usually consists of a state transition model and a measurement model. The state transition function plays a critical role in particle sampling, and the measurement equation, usually an empirical model with a noise term, must be precise enough to match actual observations yet concise enough to avoid over-fitting.

For the measurement model, the traditional combinations of simple exponential and power function terms used in the measurement equations of existing studies [5], [8], [23], [24] are intuitively too smooth to describe the rapidly decreasing tendency in the later capacity degradation curve. In many cases [20], the early training data form a very long, flat capacity curve with slow degradation, which provides very little information for identifying the model parameters that represent the later plunge in capacity; this makes it difficult to determine the minimum amount of training information required. In addition, the method in [20] requires an immense quantity of training information. Since these prediction works use only capacity information and do not consider training data with capacity diving, there is a need to compare the performance of various models and find a better empirical model as a measurement equation suited to RUL prediction under capacity diving.

For the state transition model, the sampling and resampling procedures are of great significance in the UPF algorithm [22], and the state noise is usually Gaussian for each model parameter dimension. However, when a distinct plunging tendency appears in the later capacity degradation curve, the systematic noise may respond too slowly around and after the inflection point, even though UPF can follow the change in tendency by varying the covariance matrix during iteration. The resulting poor convergence speed demands abundant training information and can lead to poor prediction performance. Meanwhile, allowing a wide value range for the noise parameters in order to converge rapidly sacrifices the stability and validity of the model, because parameter zero-crossing during particle sampling can cause irrational shape shifting in the subsequent prediction.

Moreover, data quality enhancement is vital because data-driven methods are highly sensitive to data quality. In addition to common preprocessing strategies, the capacity self-recharge brought about by performance tests during aging tests should be taken into account. After a long line of work on data preprocessing, Pang et al. [25] addressed the problem effectively in 2019 by combining wavelet decomposition with artificial intelligence methods, and compared the RUL prediction performance of three proposed models, showing the WDT–NARNN method to be reasonable and suitable. Zhao et al. [26] verified the degradation process by applying RVM and GM methods to a proposed capacity regeneration and normal degradation model, developing a hybrid method for RUL estimation, and Xu et al. [30] used Wiener process methods to make a successful RUL prediction that accounts for capacity relaxation effects. The essential aim of establishing a capacity recovery function is to describe the impact on the empirical model of the capacity recovery phenomenon that follows battery performance tests. Training data preprocessed without considering capacity recovery inevitably bring unsupported information into the measurement equation of the algorithm, reducing the model precision and the stability of the regression process. Since in many cases the times at which the experimentally designed performance tests are conducted are already known, the treatment can be simplified by verifying only a few fading characteristics based on current studies.

B. Contribution of the Paper

1) Data Quality Enhancement

The steps are briefly as follows.

Firstly, obvious outliers are identified and eliminated by statistical methods. Next, a further simplified capacity recovery function $T\left ({x }\right)$ is developed; the identification of its parameter values is the major contribution of the paper to data quality enhancement. Then Savitzky-Golay filtering is applied to smooth the measurement noise, because it performs better than the widely used moving average filter [2].

2) Improvement of the State Space Equations

The contributions of the paper to the state space equations of the model fall into the following two parts:

  1. A modified empirical capacity fading model is developed to describe capacity diving specifically, and comparison cases verify its RUL prediction performance on both the early capacity curve and the whole curve;

  2. Some terms in the measurement equation are likely to represent the early characteristics, while other terms are closely related to the later sharp decline. Considering the shape of the curve derived from the measurement equation, the state transition equations are therefore improved by treating sampling in sections, using an inflection point checking technique and noise following a Gamma distribution.

SECTION II.

Improvements

A. Simplified Capacity Recovery Function

By verifying only a couple of fading characteristics as in [26], a further simplified $T\left ({x }\right)$ for efficient data preprocessing is developed as (1). It describes the relationship between the measurement equation and the observed capacity data when the schedule of the experimentally designed performance tests is already known,

\begin{align*} \begin{cases} H\left ({x }\right)=C\left ({x }\right)-T\left ({x }\right) \\ T\left ({x }\right) = \sum \nolimits ^{n} \rho _{n}e^{- \frac {1}{\tau _{n}} (x-t_{n})} \cdot \mathbf {1}\left ({x-t_{n} }\right) +Q_{cal} \\ \end{cases}\tag{1}\end{align*}

where $H\left ({x }\right)$ denotes the empirical model, which is the main part of the measurement function in the method, $C\left ({x }\right)$ denotes the experimental capacity degradation curve, $n$ denotes the total number of performance tests, $\rho _{n}$ denotes the peak value of the $n$th performance test, $\tau _{n}$ denotes the time constant of the $n$th performance test, $t_{n}$ denotes the moment of the $n$th performance test, known from the designed experimental procedure, and $\mathbf {1}\left ({x-t_{n} }\right)$ denotes the unit step function with the jump taking place at moment $t_{n}$. For aging tests under laboratory conditions, the calendar loss term $Q_{cal}$ of the capacity recovery can be neglected because the intervals between cycles are short. The parameters can be obtained by the procedure in Table 1.

TABLE 1. Acquirement of the Parameters of ${T}\left({x}\right)$

With this preprocessing, the residual error (RSE) and root mean square error (RMSE) of fitting the training data with empirical models are considerably reduced, mainly because the impacts of the performance tests on the capacity degradation curve during cyclic aging tests are explicitly described.
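The subtraction in (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `(rho, tau, t)` tuple layout, and all numerical values are hypothetical, and the unit step is taken to equal 1 at $x=t_{n}$.

```python
import math

def recovery_term(x, tests, q_cal=0.0):
    """T(x) from (1): one decaying exponential per performance test.

    tests: list of (rho, tau, t) tuples, i.e. peak value, time constant
    and test moment of each performance test (hypothetical layout).
    """
    total = q_cal
    for rho, tau, t in tests:
        if x >= t:  # unit step 1(x - t), assumed to be 1 at x == t
            total += rho * math.exp(-(x - t) / tau)
    return total

def remove_recovery(cycles, capacity, tests, q_cal=0.0):
    """H(x) = C(x) - T(x) for every recorded cycle."""
    return [c - recovery_term(x, tests, q_cal)
            for x, c in zip(cycles, capacity)]

# Hypothetical values for illustration only, not taken from the paper.
tests = [(0.02, 5.0, 100), (0.015, 5.0, 200)]
cycles = [99, 100, 105, 200]
capacity = [2.000, 2.018, 2.004, 1.960]
cleaned = remove_recovery(cycles, capacity, tests)
```

Each test contributes nothing before its own moment and a decaying bump afterwards, so only the cycles shortly after a performance test are corrected noticeably.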

B. State Space Equations

1) The Measurement Equation

For the empirical model, the Bayesian Information Criterion (BIC) is commonly used as a general indicator of model performance; a smaller BIC value indicates a better balance between approximation accuracy and model complexity, as shown in (2),

\begin{equation*} BIC= K\ln \left ({n }\right)-2\ln \left ({f\left ({y\thinspace \vert \thinspace {\theta _{k}}}\right) }\right)\tag{2}\end{equation*}

where $f\left ({y\thinspace \vert \thinspace {\theta _{k}}}\right)$ denotes the maximized likelihood function, which indicates the accuracy, $n$ denotes the number of data records, and $K$ denotes the number of parameters of the fitting, which indicates the complexity. To obtain a model with a small BIC value, a small $K$ is preferable, which implies that the model does not over-fit. Exponential function terms are attractive for their conciseness, smoothness, and diversity of shape, and the simple exponential function has the Taylor expansion in (3),

\begin{equation*} e^{x}=1+\frac {x^{1}}{1!}+\frac {x^{2}}{2!}+\cdots =\sum \nolimits _{n=0}^{\infty } \frac {x^{n}}{n!}\tag{3}\end{equation*}

Generally only the first few polynomial terms of the Taylor expansion are of practical use. For this reason the number of exponential-shaped terms in the models discussed in this paper is not more than 2, because BIC penalizes the model severely once $K$ becomes larger than about 7, as found empirically.
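To make the trade-off in (2) concrete, the sketch below evaluates BIC from fitting residuals. It assumes independent Gaussian residuals with the variance estimated by maximum likelihood, which is an assumption of this example and is not stated in the text; the function name and the residual values are hypothetical.

```python
import math

def bic_gaussian(residuals, num_params):
    """BIC = K ln(n) - 2 ln(L), with L the maximized Gaussian likelihood.

    Assumes i.i.d. Gaussian residuals whose variance is the ML estimate
    RSS / n, so that -2 ln(L) = n ln(2 pi RSS / n) + n.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    neg2_loglik = n * math.log(2.0 * math.pi * rss / n) + n
    return num_params * math.log(n) + neg2_loglik

# Hypothetical residuals, for illustration only.
residuals = [0.004, -0.003, 0.002, -0.005, 0.001, 0.003, -0.002, 0.004]
bic_4 = bic_gaussian(residuals, 4)   # e.g. a four-parameter model
bic_7 = bic_gaussian(residuals, 7)   # same fit quality, seven parameters
```

With identical residuals, the seven-parameter model is penalized by an extra $3\ln \left ({n }\right)$, which is why additional terms must reduce the fitting error substantially to be worthwhile.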

The double exponential function is by far the most widely used model [5], shown as (4),

\begin{equation*} H\left ({x }\right)=a_{1}e^{b_{1}x}+a_{2}e^{b_{2}x},\quad x\in N^{+}\tag{4}\end{equation*}

and a single exponential function in [23] is shown as (5),

\begin{equation*} H\left ({x }\right)=\beta _{1}+\beta _{2}e^{\frac {\beta _{3}}{x}},\quad x\in N^{+}\tag{5}\end{equation*}

Although the BIC of a single exponential model benefits from the reduced number of parameters, its observation error is too large to describe the capacity degradation curve, as validated in the abovementioned work. As for polynomial function terms, a model with such a term to describe the decline empirically has been put forward in [8],

\begin{equation*} H\left ({x }\right)=a_{1}e^{b_{1}x}+a_{2}x^{b_{2}},\quad x\in N^{+}\tag{6}\end{equation*}

where a large integer is selected as the value of $b_{2}$ when fast recession later in the data is expected from experience; otherwise a small integer is chosen.

Models with other, more complicated terms [24], such as the double Gaussian shown as (7),

\begin{equation*} H\left ({x }\right)=a_{1}e^{\frac {(x-\mu _{1})^{2}}{\gamma _{1}}}+a_{2}e^{\frac {(x-\mu _{2})^{2}}{\gamma _{2}}},\quad x\in N^{+}\tag{7}\end{equation*}

and the exponential combined with a multi-term polynomial,

\begin{equation*} H\left ({x }\right)=a_{1}e^{b_{1}x}+\sum \nolimits _{i=2}^{n} a_{i}x^{b_{i}},\quad x\in N^{+},~n\ge 3\tag{8}\end{equation*}

are excluded from the discussion because of either a large $K$ value in BIC or special requirements that hold only under rare application conditions.

Combining parts of the abovementioned models, and inspired by piecewise-based attempts, the empirical model used in the measurement equation is proposed as (9),

\begin{equation*} H\left ({x }\right)=C\left ({x }\right)-T\left ({x }\right)=ae^{\frac {b}{x}}+ce^{dx},\quad x\in N^{+}\tag{9}\end{equation*}

where the later-stage term $ae^{\frac {b}{x}}$ indicates the onset moment and the sharpness of the capacity diving in the later curve, and in the early-stage term $ce^{dx}$, $c$ indicates the initial capacity and $d$ indicates the extent of early degradation.
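As a rough illustration of how the two terms of (9) share the curve, the following sketch evaluates the model and searches for the first cycle at which it falls to an EOL threshold. The parameter values are invented for the example (they only respect $c>0$ and $a,b,d<0$), and the helper names are hypothetical.

```python
import math

def capacity_model(x, a, b, c, d):
    """H(x) = a*exp(b/x) + c*exp(d*x) from (9), for cycle number x >= 1."""
    return a * math.exp(b / x) + c * math.exp(d * x)

def first_eol_cycle(params, threshold, max_cycle=5000):
    """Return the first cycle whose modelled capacity is at or below threshold."""
    for x in range(1, max_cycle + 1):
        if capacity_model(x, *params) <= threshold:
            return x
    return None  # threshold not reached within max_cycle

# Hypothetical parameters (a, b, c, d), for illustration only.
params = (-50.0, -3000.0, 2.0, -1.0e-4)
eol_threshold = 0.8 * params[2]          # 80% of the initial capacity c
eol_cycle = first_eol_cycle(params, eol_threshold)

current_cycle = 480                      # hypothetical last training cycle
rul = eol_cycle - current_cycle if eol_cycle is not None else None
```

With these values the later-stage term is practically zero for the first few hundred cycles, so the early curve is governed by $ce^{dx}$ alone, and the dive only appears once $b/x$ becomes small in magnitude.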

2) The State Transition Function

The state transition function used in existing studies usually takes the form of (10),

\begin{align*} x_{i,k}=F\cdot x_{i,k-1}+w_{i},\quad w_{i} \sim N_{i}\left ({0,\sigma _{i}^{2} }\right),~i=1,2,\ldots,n\tag{10}\end{align*}

where $F$ is the state transfer matrix, normally a diagonal matrix of constants such as 1, and $w_{i}$ is the systematic noise following the Gaussian distribution $N_{i}$. Since the selected methods are generally applied under Gaussian noise conditions, each iteration of the state transition function consists of the parameter value at the previous moment plus a zero-mean Gaussian noise term for each parameter dimension; when particles are sampled, $\sigma _{i}^{2}$ of the relevant terms is updated by the corresponding UKF part after initialization.

The early part of the training data set reflects the dominant effect of the early-stage term $ce^{dx}(c>0,d< 0)$ in the measurement equation and carries little information about the later-stage term $ae^{\frac {b}{x}}\left ({a< 0,b< 0 }\right)$, because the parameters have different sensitivities at different moments. In [29], PF methods are combined with a GA algorithm in the resampling process, which influences how the parameters are passed between moments. It is therefore feasible to apply time-varying modulation to the covariance matrix of the state functions, taking the form of the state noise distribution into account.

Choosing Gaussian noise satisfies the sampling requirement of the UKF algorithm and relieves the non-linearity, but in (10) the variances in the different dimensions of the equation are still treated as independent, whereas they should take the form of a covariance matrix. Furthermore, shape shifting of the prediction curve can occur, because particle sampling with Gaussian noise inevitably meets zero-crossing conditions: with insufficient training information, the possible value range of some parameters becomes very wide, generating large distribution variances for sampling the particles, so that the distribution covers the zero point in early iterations. Consequently, the iterative performance of a measurement equation such as the double exponential function is severely affected and its correctness becomes questionable, so in this case the sampling parameters have to be set small enough to limit the diversity and range of the particles. In addition, once the terms representing the different stages are determined, empirical evidence and the 95% fitting intervals of the training data can help in choosing the sampling parameters. Finally, three basic characteristics of the Gamma distribution $Ga\left ({\alpha,\beta }\right)\left ({\alpha,\beta >0 }\right)$ make it an appropriate solution to the situation,

\begin{align*} X\sim&Ga\left ({\alpha,\beta }\right) \left ({\alpha,\beta >0 }\right) \\ s.t. ~f\left ({X }\right)=&\frac {\beta ^{\alpha }}{\Gamma \left ({\alpha }\right)}X^{\alpha -1}e^{-\beta X}, \quad X>0\tag{11}\end{align*}

where $f\left ({X }\right)$ is the probability density function (pdf) of $X$ and $\Gamma \left ({\alpha }\right)$ is the Gamma function of parameter $\alpha $:

  1. the domain of $Ga\left ({\alpha,\beta }\right) $ is non-negative. Since the shape shifting is mainly caused by sign changes of the parameters when Gaussian noise is used to sample the particles, using noise following a Gamma distribution ensures that the signs of the parameters remain unchanged during iteration.

  2. $E(X)=\alpha \cdot \beta ^{-1},var\left ({X }\right)=\alpha \cdot \beta ^{-2} $ , so the values of $\alpha,\beta $ can easily be obtained by converting the parameter values of the Gaussian noise calculated by the algorithm.

  3. when $\alpha $ and $\beta $ take relatively large values, the Gamma distribution approaches a Gaussian distribution, so the basic operation requirement of the UKF part in non-linear problems is satisfied to a large extent.
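The zero-crossing risk that motivates the first characteristic can be quantified with a short sketch. For a parameter sampled from a Gaussian with mean $\mu $ and standard deviation $\sigma $, the probability of landing on the opposite side of zero is $\Phi \left ({-\vert \mu \vert /\sigma }\right)$, whereas a Gamma-distributed sample cannot change sign. The function name and the numbers below are hypothetical.

```python
import math

def gaussian_sign_flip_probability(mean, std):
    """P(sample has the opposite sign of mean) for N(mean, std^2).

    Uses Phi(-|mean|/std) = 0.5 * erfc(|mean| / (std * sqrt(2))).
    """
    z = abs(mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical parameter estimate with a narrow and a wide sampling spread.
narrow = gaussian_sign_flip_probability(mean=-0.5, std=0.1)
wide = gaussian_sign_flip_probability(mean=-0.5, std=0.5)
```

A wide sampling spread, as needed when training information is scarce, makes a sign flip far more likely per particle than a narrow one, which is the situation the Gamma noise is meant to rule out.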

The shape shifting can be avoided by using the Gamma distribution, but the computational cost increases at the same time. The alternatives are resampling methods that require repeated regularization, or the increased workload of more initial particles and a longer iteration procedure using Gaussian noise with a large variance to cover the supposed range.

It is therefore necessary to apply different scales of the distribution when sampling in different stages, and the proposed state equations are as follows,

\begin{align*} x_{i,k}=F \cdot x_{i,k-1}+cov\left ({ukf\left ({x_{i,k-1} }\right) }\right)\cdot \rho _{n},\quad i= 1,2,\ldots, n\tag{12}\end{align*}

where $\rho _{n}$ denotes the scale adjusting matrix and $cov(\cdot)$ denotes the covariance matrix determined by the calculation in the UKF part. Before the inflection point, $\rho _{n}$ can be a diagonal matrix of constants such as 1. After the point, $\rho _{n}$ can be a diagonal matrix that adjusts the speed of iteration,

\begin{align*} \rho _{n}=\left [{ {\begin{array}{cccc} \rho _{e1k} & & & \\ & \rho _{e2k} & & \\ & & \rho _{e3k} & \\ & & & \rho _{e4k}\\ \end{array}} }\right]\tag{13}\end{align*}

where $\rho _{e1k}, \rho _{e2k}$ are orders of magnitude larger than $\rho _{e3k}, \rho _{e4k}$, reflecting the change in the dominant part of the measurement equation in the iteration after the inflection point. For example, $\rho _{e1k}=1, \rho _{e2k}=100, \rho _{e3k}=0.1, \rho _{e4k}=0.01$ is used in the iteration that verifies the performance of the algorithm under the 25°C 1C1C condition, and this is slightly adjusted to $\rho _{e1k}=1, \rho _{e2k}=50, \rho _{e3k}=0.01, \rho _{e4k}=0.01$ to suit RUL prediction under the 35°C 1C1C condition.
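The sectioned scaling of (12) and (13) can be sketched as follows. The example treats the covariance as diagonal, so that multiplying by the diagonal $\rho _{n}$ reduces to scaling each variance, and it assumes the entries are ordered as $[a, b, c, d]$; both are simplifications of this illustration. The function name and the variances are hypothetical, while the $\rho $ values are the ones quoted for the 25°C 1C1C condition.

```python
# Diagonal of rho_n after the inflection point (25 degC 1C1C values from the text).
RHO_AFTER_25C = (1.0, 100.0, 0.1, 0.01)

def scale_variances(variances, cycle, inflection_cycle, rho_after=RHO_AFTER_25C):
    """Scale per-parameter variances by the diagonal of rho_n.

    Before the inflection point rho_n is the identity; from the
    inflection point on, rho_after is applied (assumed ordering a, b, c, d).
    """
    if cycle < inflection_cycle:
        rho = (1.0,) * len(variances)
    else:
        rho = rho_after
    return [p * r for p, r in zip(variances, rho)]

# Hypothetical UKF variances for (a, b, c, d), illustration only.
ukf_variances = [1.0e-2, 4.0, 1.0e-4, 1.0e-10]
before = scale_variances(ukf_variances, cycle=300, inflection_cycle=478)
after = scale_variances(ukf_variances, cycle=480, inflection_cycle=478)
```

After the inflection point the sampling spread of the later-stage parameters is widened while that of the early-stage parameters is narrowed, so new data mainly move the term responsible for the dive.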

The inflection point is identified by a $3\sigma $ -interval criterion applied to the filtered differential capacity curve; the criterion is determined by the 95% confidence intervals of the fitting parameters and takes into account the shape of the early term of the empirical model and the measurement noise.
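One simple way to apply a $3\sigma $ criterion to a differential capacity curve is sketched below. It is not the exact procedure of the paper: here the mean and standard deviation are estimated from an early baseline window of first differences, the test is one-sided, and the input is assumed to be already filtered. The function name, the window length, and the synthetic data are hypothetical.

```python
import statistics

def find_inflection(capacity, baseline_len):
    """Return the index of the first sample whose capacity drop leaves the 3-sigma band.

    capacity: filtered capacity per cycle.
    baseline_len: number of early first differences used to estimate
    the band (an assumption of this sketch, must be at least 2).
    """
    diffs = [b - a for a, b in zip(capacity, capacity[1:])]
    base = diffs[:baseline_len]
    mu = statistics.mean(base)
    sigma = statistics.stdev(base)
    for i in range(baseline_len, len(diffs)):
        if diffs[i] < mu - 3.0 * sigma:   # one-sided: faster fading only
            return i + 1                  # sample at the end of this step
    return None

# Synthetic curve: 20 slow steps, then 5 steps of fast fading.
capacity = [2.0]
for i in range(20):
    capacity.append(capacity[-1] - (0.001 if i % 2 == 0 else 0.002))
for _ in range(5):
    capacity.append(capacity[-1] - 0.01)

inflection_index = find_inflection(capacity, baseline_len=20)
```

In practice a persistence check over several consecutive cycles could be added so that a single noisy difference does not trigger the detection.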

It is noteworthy that the inflection point is NOT the same as the true capacity diving point shown in Fig. 1. The inflection point determined by the $3\sigma $ -interval criterion is the 478th cycle, and the true capacity diving point is found to be the 499th cycle. This can be explained by the fact that the inflection point is identified from the training data, whereas the lower the capacity of the EOL point is set manually (usually 80% of the initial capacity), the later the true capacity diving point is likely to appear. In the cases discussed in this paper, however, the inflection point is found before the true capacity diving point, which makes it feasible to predict the capacity diving tendency using a little information after the inflection point, before capacity diving appears. As an example, cyclic capacity degradation data up to 484 cycles under the 45°C 1C1C condition are enough to train the proposed algorithm and obtain an accurate prediction result.

FIGURE 1. (a) Raw capacity data curve, and (b) differential value of the capacity degradation curve with $3\sigma $ -interval inflection point identification under the 45°C 1C1C condition.

More specifically, the inflection point reflects the change in the dominant part of the measurement equation in the iteration, whereas the capacity diving point is defined as the most valuable point to determine in practice: the point whose tangent line is parallel to the line connecting the initial point and the EOL point, as in Fig. 1 for instance. Consequently, the inflection point in this study marks a significant starting point for another prediction, with the EOL point as the prediction target.
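The geometric definition of the capacity diving point can be approximated on sampled data as follows. For a smooth curve lying above the chord from the initial point to the EOL point, the point with the largest vertical gap to the chord is the one whose tangent is parallel to it, so the sketch simply maximizes that gap. The function name and the synthetic curve are hypothetical.

```python
def diving_point(cycles, capacity, eol_index):
    """Index of the sample farthest above the chord from the first point to EOL."""
    x0, y0 = cycles[0], capacity[0]
    x1, y1 = cycles[eol_index], capacity[eol_index]
    slope = (y1 - y0) / (x1 - x0)
    best_index, best_gap = 0, 0.0
    for i in range(eol_index + 1):
        chord = y0 + slope * (cycles[i] - x0)
        gap = capacity[i] - chord
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index

# Synthetic concave curve y = 2 - 0.004 x^2 on cycles 0..10, EOL at the last sample.
cycles = list(range(11))
capacity = [2.0 - 0.004 * x * x for x in cycles]
knee = diving_point(cycles, capacity, eol_index=10)
```

For this synthetic parabola the chord slope is $-0.04$ and the tangent slope $-0.008x$ matches it at $x=5$, which is the sample the search selects.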

SECTION III.

RUL Prediction Framework Based on Improved UPF

The general UPF framework is introduced in [19]. The improved UPF steps used in this paper are shown in the flowchart in Fig. 2 and described as follows.

FIGURE 2. Flowchart of the improved UPF proposed in the paper with the improved parts colored in orange.

State space equations are established by combining (12) and (9) in (14),

\begin{align*} \begin{cases} x_{k}=f(x_{k-1},v_{k-1}) \\ z_{k}=h(x_{k},n_{k}) \\ \end{cases}\tag{14}\end{align*}

where $x_{k}$ denotes the state of the system at moment $k$, $z_{k}$ denotes the observed value, $v_{k-1}$ denotes the systematic noise, and $n_{k}$ denotes the measurement noise.

Step 1 (Initialization):

Generate $N_{x}=200$ initial random particles $\left \{{x_{0} }\right \}_{i=1}^{N_{x}}$ with the same weight $1/N_{x}$ for each particle, where $x_{0}=\left [{ a_{0}\,\,b_{0}\,\,c_{0}\,\,d_{0} }\right]^{T},c_{0} >0, a_{0},b_{0},d_{0} < 0$ are the fitting parameters of the training data up to the current point.

\begin{align*} \overline x_{0}=&E(x_{0}) \tag{15}\\ P_{0}=&E[\left ({x_{0}-\overline x_{0} }\right)(x_{0}-\overline x_{0})^{T}]\tag{16}\end{align*}

Step 2 (State Matrix and Covariance Matrix):

\begin{align*} x_{0}^{a}=&[x_{0}^{T} ~0 ~0]^{T} \tag{17}\\ P_{0}^{a}=&\left [{ {\begin{array}{ccc} P_{0} & 0 & 0\\ 0 & Q_{0} & 0\\ 0 & 0 & R_{0}\\ \end{array}} }\right]\tag{18}\end{align*}

Step 3 (Unscented Transformation):

\begin{align*} x_{k}^{a}=&[x_{k}^{T} ~v_{k}^{T} ~n_{k}^{T}]^{T} \tag{19}\\ P_{k}^{a}=&\left [{ {\begin{array}{ccc} P_{k} & 0 & 0\\ 0 & Q_{k} & 0\\ 0 & 0 & R_{k}\\ \end{array}} }\right] \tag{20}\\ x_{k}^{a}=&[x_{k}^{T} ~v_{k}^{T} ~n_{k}^{T}]^{T} \tag{21}\\ x_{k-1}^{x}=&\left [{ x_{k-1}^{a} ~~x_{k-1}^{a}+\eta \sqrt {P_{k-1}^{a}} ~~x_{k-1}^{a}-\eta \sqrt {P_{k-1}^{a}} }\right]\tag{22}\end{align*}

where

\begin{align*} \eta=&\sqrt {n+\lambda } \tag{23}\\ \lambda=&\alpha ^{2}\left ({n+\kappa }\right)-n \tag{24}\\ x_{k-1}^{a}=&[x_{k-1}^{x} ~x_{k-1}^{v} ~x_{k-1}^{n}]^{T} \tag{25}\\ W_{0}^{\left ({m }\right)}=&\frac {\lambda }{n+\lambda } \tag{26}\\ W_{0}^{\left ({c }\right)}=&\frac {\lambda }{n+\lambda }+1-\alpha ^{2}+\beta \tag{27}\\ W_{i}^{\left ({m }\right)}=&W_{i}^{\left ({c }\right)}=\frac {1}{2(n+\lambda)},\quad i=1,2,\ldots,2n\tag{28}\end{align*}

where $n$ denotes the dimension of the parameter vector, $\kappa $ denotes the secondary scaling parameter of the unscented transformation (distinct from the time index $k$), and $W$ denotes the weights of the mean and covariance. In this work $\alpha =1,\beta =0,\lambda =2$ are assigned.
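The unscented transformation of Step 3 can be sketched without external libraries. The sketch uses the standard scaled form in which each of the $2n$ non-central weights equals $1/(2(n+\lambda))$, so that the mean weights sum to one, and it takes the matrix square root as a Cholesky factor; the function names and the example mean and covariance are hypothetical.

```python
import math

def ut_weights(n, lam, alpha=1.0, beta=0.0):
    """Mean and covariance weights of the scaled unscented transformation."""
    w0_m = lam / (n + lam)
    w0_c = w0_m + 1.0 - alpha ** 2 + beta
    wi = 1.0 / (2.0 * (n + lam))
    return [w0_m] + [wi] * (2 * n), [w0_c] + [wi] * (2 * n)

def cholesky(P):
    """Lower-triangular L with L L^T = P, for a symmetric positive definite P."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L

def sigma_points(mean, P, lam):
    """Return the 2n+1 sigma points: mean, mean + eta*col_j, mean - eta*col_j."""
    n = len(mean)
    eta = math.sqrt(n + lam)
    L = cholesky(P)
    columns = [[eta * L[i][j] for i in range(n)] for j in range(n)]
    points = [list(mean)]
    points += [[m + c for m, c in zip(mean, col)] for col in columns]
    points += [[m - c for m, c in zip(mean, col)] for col in columns]
    return points

# Hypothetical two-dimensional example with lambda = 2, alpha = 1, beta = 0.
mean = [1.0, -2.0]
P = [[4.0, 2.0], [2.0, 3.0]]
wm, wc = ut_weights(n=2, lam=2.0)
points = sigma_points(mean, P, lam=2.0)
```

The weighted mean and covariance of the sigma points reproduce `mean` and `P`, which is a convenient sanity check before the points are propagated through the state and measurement functions.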

Step 4 (Parameters Update From Moment k-1 to k):

\begin{align*} x_{k\vert k-1}^{x}=&f(x_{k-1}^{x},x_{k-1}^{v}) \tag{29}\\ \overline x_{k\vert k-1}=&\sum \nolimits _{i=0}^{2n} W_{i}^{\left ({m }\right)} x_{i,k\vert k-1}^{x} \tag{30}\\ P_{k\vert k-1}=&\sum \nolimits _{i=0}^{2n} W_{i}^{\left ({c }\right)} \left [{ x_{i,k\vert k-1}^{x}-\overline x_{k\vert k-1} }\right] \left [{ x_{i,k\vert k-1}^{x}-\overline x_{k\vert k-1} }\right]^{T} \tag{31}\\ Z_{k\vert k-1}=&h(x_{k\vert k-1}^{x},x_{k\vert k-1}^{n}) \tag{32}\\ \overline Z_{k\vert k-1}=&\sum \nolimits _{i=0}^{2n} W_{i}^{\left ({m }\right)} Z_{i,k\vert k-1} \tag{33}\\ P_{Z_{k\vert k-1}Z_{k\vert k-1}}=&\sum \nolimits _{i=0}^{2n} W_{i}^{\left ({c }\right)} \left [{ Z_{i,k\vert k-1}-\overline Z_{k\vert k-1} }\right] \left [{ Z_{i,k\vert k-1}-\overline Z_{k\vert k-1} }\right]^{T} \tag{34}\\ P_{x_{k\vert k-1}Z_{k\vert k-1}}=&\sum \nolimits _{i=0}^{2n} W_{i}^{\left ({c }\right)} \left [{ x_{i,k\vert k-1}^{x}-\overline x_{k\vert k-1} }\right] \left [{ Z_{i,k\vert k-1}-\overline Z_{k\vert k-1} }\right]^{T} \tag{35}\\ K_{k}=&P_{x_{k\vert k-1}Z_{k\vert k-1}}P_{Z_{k\vert k-1}Z_{k\vert k-1}}^{-1}\tag{36}\end{align*}

where $K_{k}$ is the filter gain used in the Kalman filter part,

\begin{align*} \overline x_{k}=&\overline x_{k\vert k-1}+K_{k}\left ({Z_{k}-\overline Z _{k\vert k-1} }\right) \tag{37}\\ \hat {P}_{k}=&P_{k\vert k-1}-K_{k}P_{Z_{k\vert k-1}Z_{k\vert k-1}}K_{k}^{T}\tag{38}\end{align*}
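For a scalar capacity measurement, the gain and the updates in (37) and (38) reduce to a few lines. The sketch uses the usual form of the gain, the state-measurement cross covariance multiplied by the inverse of the measurement covariance, which becomes a division when the measurement is a scalar. The function name and all numbers are hypothetical.

```python
def ukf_update_scalar(x_pred, P_pred, z_pred, P_zz, P_xz, z_obs):
    """Kalman-type update for a scalar measurement.

    x_pred: predicted state mean (list of length n)
    P_pred: predicted state covariance (n x n nested list)
    z_pred: predicted measurement mean (float)
    P_zz:   predicted measurement variance (float)
    P_xz:   state-measurement cross covariance (list of length n)
    z_obs:  observed measurement (float)
    """
    n = len(x_pred)
    gain = [p / P_zz for p in P_xz]                      # K = P_xz * P_zz^-1
    innovation = z_obs - z_pred
    x_new = [x + k * innovation for x, k in zip(x_pred, gain)]
    P_new = [[P_pred[i][j] - gain[i] * P_zz * gain[j] for j in range(n)]
             for i in range(n)]                          # P - K P_zz K^T
    return x_new, P_new, gain

# Hypothetical two-parameter example, for illustration only.
x_new, P_new, gain = ukf_update_scalar(
    x_pred=[1.0, 2.0],
    P_pred=[[1.0, 0.0], [0.0, 1.0]],
    z_pred=3.0,
    P_zz=2.0,
    P_xz=[1.0, 0.5],
    z_obs=4.0,
)
```

The updated mean and covariance are what Step 5 then uses as the centre and spread of the proposal distribution for each particle.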

Step 5 (Sampling):\begin{align*} \hat {x}_{k}^{i}\sim &\,q\left ({x_{k}^{i}\vert x_{k-1}^{i},z_{k} }\right),\quad \hat {P}_{k}=\hat {P}_{k}\cdot \rho _{n} \\ &\Longrightarrow \begin{cases} Ga\left ({\left ({\overline x_{k}^{i} }\right)^{2}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1},\ \overline x_{k}^{i}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1} }\right), & \overline x_{k}^{i}>0 \\ -Ga\left ({\left ({\overline x_{k}^{i} }\right)^{2}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1},\ -\overline x_{k}^{i}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1} }\right), & \overline x_{k}^{i}<0 \end{cases} \tag{39}\end{align*}

As illustrated in (12), $\rho _{n}$ in the state transition equation serves here to adjust the covariance matrix, especially after the inflection point. It adjusts the range of the sampled values so that the iteration converges to the correct values faster.

When $\alpha $ and $\beta $ in $Ga\left ({\alpha,\beta }\right)$ take relatively large positive values, the Gamma distribution approaches a Gaussian distribution. The difference is that the support of $Ga\left ({\alpha,\beta }\right)$ is non-negative, so the sampled particles keep the signs of all the parameters unchanged while following a distribution similar to that obtained with Gaussian noise, where $E\left ({X }\right)=\alpha \cdot \beta ^{-1}=\frac {\left ({\overline x_{k}^{i} }\right)^{2}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1}}{\overline x_{k}^{i}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1}}=\overline x_{k}^{i}$ and $var\left ({X }\right)=\alpha \cdot \beta ^{-2}=\frac {\left ({\overline x_{k}^{i} }\right)^{2}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1}}{\left ({\overline x_{k}^{i}\cdot \left ({\hat {P}_{k}^{i} }\right)^{-1} }\right)^{2}}=\hat {P}_{k}^{i}$ . As a result of this conversion, the shape of the capacity degradation curve keeps the same tendency, and the basic operating requirement of the UKF part in non-linear problems is largely satisfied. By contrast, when the particles are sampled with Gaussian noise, a parameter may change sign and the prediction curve may undesirably turn upward later.
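
The moment matching above can be checked numerically with a short sketch. It draws one scalar parameter from a sign-preserving Gamma distribution whose mean and variance equal a given estimate and variance. Note that Python's `random.gammavariate(alpha, beta)` takes a shape and a scale, whereas $\beta$ in $Ga(\alpha,\beta)$ above is a rate, so the scale passed in is $1/\beta$. The function name `sample_signed_gamma` and the numerical values are illustrative assumptions.

```python
import random


def sample_signed_gamma(mean, var, rng):
    """Draw a value with the given mean and variance that never changes sign.

    shape = mean^2 / var and rate = |mean| / var, following the moment
    matching in the text; the draw is negated when the mean is negative.
    """
    if mean == 0.0 or var <= 0.0:
        raise ValueError("mean must be non-zero and var must be positive")
    shape = mean * mean / var
    scale = var / abs(mean)            # scale = 1 / rate
    draw = rng.gammavariate(shape, scale)
    return draw if mean > 0.0 else -draw


# Example with made-up numbers: a negative parameter estimate of -0.5 with variance 0.01.
rng = random.Random(0)
draws = [sample_signed_gamma(-0.5, 0.01, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
```

Every draw stays negative, while the sample mean and variance stay close to the requested values, which is the property the text relies on to keep the curve shape from flipping.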

Step 6 (Weight Calculation): \begin{align*} \omega _{k}^{i}=&\frac {p\left ({x_{0: k}^{i}\vert z_{1: k} }\right)}{q\left ({x_{0: k}^{i}\vert z_{1: k} }\right)}=\omega _{k-1}^{i}\frac {p\left ({z_{k}\vert x_{k}^{i} }\right)p\left ({x_{k}^{i}\vert x_{k-1}^{i} }\right)}{q\left ({x_{k}^{i}\vert x_{k-1}^{i},z_{k} }\right)} \tag{40}\\ \omega _{k}^{i}=&\frac {\omega _{k}^{i}}{\sum \nolimits _{j=1}^{N} \omega _{k}^{j} }\tag{41}\end{align*} where $p$ denotes the pdf of the prior distribution and $q$ denotes the pdf of the updated particles, which follow the Gamma distribution with parameters calculated by (39). Equation (40) holds as long as $q\left ({x_{k}^{i}\vert x_{k-1}^{i},z_{1: k} }\right)=q\left ({x_{k}^{i}\vert x_{k-1}^{i},z_{k} }\right)$ is satisfied, with the most recent measurement incorporated in the design of the importance function through the proposed state space equations.

Step 7 (Multinomial Resampling):\begin{equation*} \omega _{k}^{i}=\frac {1}{N}\tag{42}\end{equation*}

This step is commonly used to resolve the particle degeneracy phenomenon. The resampling method introduced in [28], or any other effective method, could be used as an alternative here.
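
Steps 6 and 7 can be illustrated with the short sketch below, which normalizes the importance weights as in (41) and then performs multinomial resampling, after which every particle carries the weight $1/N$ of (42). The function names are chosen for this example, and the standard-library call `random.choices` is used as one possible way to draw with replacement.

```python
import random


def normalize_weights(weights):
    """Scale non-negative weights so that they sum to one, as in eq. (41)."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def multinomial_resample(particles, weights, rng):
    """Draw N particles with replacement, with probability proportional to weight.

    Returns the resampled particles and the reset weights 1/N of eq. (42).
    """
    n = len(particles)
    chosen = rng.choices(particles, weights=weights, k=n)
    return chosen, [1.0 / n] * n
```

Particles with negligible weight are unlikely to survive the draw, which is how resampling counteracts degeneracy.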

Step 8 (State Update):\begin{align*} \tilde {x}_{k}=&\sum \nolimits _{i=1}^{N} {\omega _{k}^{i}x_{k}^{i}} \tag{43}\\ P_{k}=&\sum \nolimits _{i=1}^{N} {\omega _{k}^{i}\left [{ x_{k}^{i}-\tilde {x}_{k} }\right]\left [{ x_{k}^{i}-\tilde {x}_{k} }\right]^{T}}\tag{44}\end{align*}
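
A minimal sketch of the state update in Step 8 follows: the weighted mean and weighted covariance of the particle set. It assumes the weights are already normalized; the function name `weighted_mean_cov` is hypothetical.

```python
def weighted_mean_cov(particles, weights):
    """Weighted mean and covariance of particles (each a list of floats)."""
    n = len(particles[0])
    mean = [sum(w * p[i] for w, p in zip(weights, particles)) for i in range(n)]
    cov = [[sum(w * (p[i] - mean[i]) * (p[j] - mean[j])
                for w, p in zip(weights, particles))
            for j in range(n)] for i in range(n)]
    return mean, cov
```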

Set $k=k+1$ and return to Step 4 until the iteration moment $k$ reaches $k_{end}$ .

Once the convergence of the parameter values has slowed beyond an experimentally chosen threshold, the remaining iterations need not be carried out. Thus $k_{end}$ is reached earlier than in the standard UPF, which means that less training information is required and further extends the advantage of UPF sampling in local iteration.

Step 9 (RUL Estimation):

The RUL estimate $L_{k}^{i}$ of the $i$ th particle, using the training data up to the moment $k$, is obtained by solving \begin{align*} a_{k}^{i}\cdot e^{\frac {b_{k}^{i}}{k+L_{k}^{i}}}+c_{k}^{i}\cdot e^{d_{k}^{i}\cdot \left ({k+L_{k}^{i} }\right)}=&C_{eol} \tag{45}\\ \overline L_{k}=&\sum \limits _{i=1}^{N} {\omega _{k}^{i}L_{k}^{i}}\tag{46}\end{align*}
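
Equation (45) has no closed-form solution for $L_{k}^{i}$, so it has to be solved numerically. The sketch below uses bisection, which is one possible choice and not necessarily the solver used in the paper. It relies on the sign constraints $c>0$ and $a,b,d<0$, under which the modeled capacity decreases monotonically with the cycle number. The function names, the search horizon, and the parameter values are assumptions made for illustration only.

```python
import math


def capacity(x, a, b, c, d):
    """Empirical model a*exp(b/x) + c*exp(d*x), defined for x > 0."""
    return a * math.exp(b / x) + c * math.exp(d * x)


def solve_eol(k, params, c_eol, horizon=5000.0, tol=1e-9):
    """Cycle k + L at which the modeled capacity reaches c_eol (bisection)."""
    a, b, c, d = params
    lo, hi = float(k), float(k) + horizon
    if not capacity(lo, a, b, c, d) > c_eol > capacity(hi, a, b, c, d):
        raise ValueError("c_eol is not bracketed within the search horizon")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if capacity(mid, a, b, c, d) > c_eol:
            lo = mid          # still above the threshold: EOL lies later
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_rul(k, particle_params, weights, c_eol):
    """Weighted RUL over all particles, as in eq. (46)."""
    return sum(w * (solve_eol(k, p, c_eol) - k)
               for w, p in zip(weights, particle_params))


# Made-up parameters (normalized capacity), not values from the paper.
params = (-5.0, -2000.0, 1.0, -1e-4)
x_eol = solve_eol(491, params, c_eol=0.8)
```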

The initial point $(0,H_{0})$ and the EOL point $(x_{EOL},C_{eol})$ are connected by a straight line, and the point on the curve furthest from this line is taken as the true capacity diving point. At this point the tangent to the curve is parallel to the line, so the slope at the capacity diving point is $\left ({C_{eol}-H_{0} }\right)\cdot x_{EOL}^{-1}$ .

The first-order derivative of the measurement equation is given in (47), \begin{align*} H^{\prime }\left ({x }\right)=-\frac {ab\,e^{\frac {b}{x}}}{x^{2}}+cd\,e^{dx},\quad x\in N^{+},~ c>0,~a,b,d< 0 \tag{47}\end{align*} and the capacity diving point $x_{d}$ can be calculated by (48), \begin{align*} \begin{cases} \overline a_{k}\cdot e^{\overline b_{k} \cdot x_{EOL}^{-1}}+\overline c_{k}\cdot e^{\overline d_{k}\cdot x_{EOL}}=C_{eol} \\ -\frac {\overline a_{k}\overline b_{k}\,e^{\frac {\overline b _{k}}{x_{d}}}}{x_{d}^{2}}+\overline c_{k}\overline d_{k}\,e^{\overline d _{k}x_{d}}=\left ({C_{eol}-H_{0} }\right)\cdot x_{EOL}^{-1} \\ x_{EOL}=k+\overline L_{k},\\ 0< x_{d}< x_{EOL} \end{cases}\tag{48}\end{align*}
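
As a rough numerical counterpart to (48), the diving point can also be located by a direct search over integer cycles for the curve point furthest from the line joining $(0,H_{0})$ and $(x_{EOL},C_{eol})$. The sketch below takes this simpler route instead of solving the derivative condition; the vertical gap to the line is used because it is proportional to the perpendicular distance. The function names and all numerical values are illustrative assumptions, not results from the paper.

```python
import math


def capacity(x, a, b, c, d):
    """Empirical model a*exp(b/x) + c*exp(d*x), defined for x > 0."""
    return a * math.exp(b / x) + c * math.exp(d * x)


def diving_point(params, h0, c_eol, x_eol):
    """Integer cycle whose curve point is furthest from the chord.

    Returns the cycle and the chord slope (c_eol - h0) / x_eol.
    """
    a, b, c, d = params
    slope = (c_eol - h0) / x_eol
    best_x, best_gap = None, -1.0
    for x in range(1, int(x_eol)):
        gap = abs(capacity(x, a, b, c, d) - (h0 + slope * x))
        if gap > best_gap:
            best_x, best_gap = x, gap
    return best_x, slope


# Made-up parameters and EOL cycle (normalized capacity), for illustration only.
params = (-5.0, -2000.0, 1.0, -1e-4)
x_d, chord_slope = diving_point(params, h0=1.0, c_eol=0.8, x_eol=565.0)
```

At the cycle returned by the search, a finite-difference estimate of the curve slope should be close to the chord slope, which matches the second condition in (48).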

In fact, once the EOL point is determined, the predictions of the capacity diving point and of the RUL obtained from (45) and (48) share the same degree of accuracy. The prediction performance of the EOL point is therefore chosen as the representative indicator for evaluating the proposed method in the following discussion.

SECTION IV.

Experimental Verifications and Discussion

To verify the performance of the proposed methods, case studies are carried out in which 15 Lithium-ion batteries with Li(NiMnCo)O2 cathode (rated capacity 36 Ah, energy density 180 Wh/kg) are subjected to cycle aging tests under different regular temperature conditions and regular current ratios. The observed capacity degradation data are shown in Fig. 3, where distinct capacity diving is seen in the later part of every curve without exception. Each plotted data point is the mean value over the 3 batteries tested under the same working condition.

FIGURE 3. Raw capacity degradation data.

A. Verification of the Fitting Performance

Using the statistical indicators in Table 2, comparisons are first made by least square fitting. Under the majority of conditions, the proposed model and the double exponential model both fit the curves very well, with very small SSE and RMSE and $R^{2}$ values extremely close to 1.
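
For reference, the three indicators can be computed from measured and fitted capacities as in the sketch below. It uses the common definitions (SSE as the sum of squared residuals, RMSE as the square root of the mean squared residual, and $R^{2}$ as one minus SSE over the total sum of squares); the paper may normalize RMSE differently, for example by degrees of freedom. The function name and the sample numbers are made up.

```python
import math


def fit_metrics(measured, fitted):
    """Return (SSE, RMSE, R^2) for paired measured and fitted values."""
    if len(measured) != len(fitted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    sse = sum((y - f) ** 2 for y, f in zip(measured, fitted))
    rmse = math.sqrt(sse / n)
    mean_y = sum(measured) / n
    sst = sum((y - mean_y) ** 2 for y in measured)   # total sum of squares
    r2 = 1.0 - sse / sst
    return sse, rmse, r2


sse, rmse, r2 = fit_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```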

TABLE 2. Fitting Performance of the Models

Moreover, the proposed model outperforms the double exponential model, except under the 35°C1C1C condition, where the two perform almost the same; this is attributed to the few data collected after the inflection point and the very long, flat early part of the curve. That is to say, the double exponential model is too smooth to describe the majority of the conditions, which show a relatively sharp variation tendency. In addition, the 3-parameter empirical model fits these data poorly.

According to the shape of the capacity diving degradation and the estimated average range of the inflection point, the sensitivity of the primary terms representing this tendency of the curve is compared, with the inflection point taken empirically as $x_{i}\sim 10^{2}$ .

It is well acknowledged that the sensitivity of a parameter reflects the width of the corresponding confidence interval. Hence the function term $ae^{\frac {b}{x}}$, which has small sensitivity values in the critical neighborhood as shown in Table 3, is preferred in the algorithm for rapid convergence with the same amount of training information.

TABLE 3. Sensitivity of the Parameters of the Term Representing the Capacity Diving

TABLE 4. Least Square Fitting of 25°C1C1C Data up to the Inflection Point

B. Verification of the Proposed Preprocessing Method

Part of the preprocessed curve for the normalized 37Ah 35°C1C1C condition is shown in Fig. 4. When the curves are fitted with the proposed model in (9), the MSE obtained with the capacity recovery effect considered is more than 15.54% lower than that of the raw data, and approximately 13.81% lower than that of the curve preprocessed without considering the effect. In the latter case, the bias caused by direct smoothing is greatly intensified, owing to redundant, unrelated information on the self-recharged capacity that is not covered by the empirical model $H\left ({x }\right)$ .

FIGURE 4. Part of the preprocessed curve with and without the capacity self-recharge effect considered.

C. Verification of Prediction Accuracy With Capacity Diving

The 584th cycle is calculated as the true EOL point, and the final prediction made at the 491st cycle gives the 581st cycle, with a percentage error of 0.513% and an absolute error of 3 cycles, as shown in Fig. 6. It can also be concluded from the iteration curve of parameter b in Fig. 5 that the prediction fluctuates around the true value with gradually decreasing amplitude, which makes it unnecessary to train on more subsequent points, so the iteration is ended.

FIGURE 5. (a) Prediction curves starting at the corresponding moment during UPF iteration procedure, and (b) parameter b value during latest UPF iteration procedure.

FIGURE 6. (a) The EOL point calculated by the proposed method, and (b) calculated EOL point distribution of particles after the 491st iteration.

To further illustrate the convergence tendency of the last several iterations, parameter c of the simple exponential term in (9), which represents the early stage amplitude, is initialized far from its true value, as shown in Fig. 7.

FIGURE 7. (a) Prediction curves starting at the corresponding moment during UPF iteration procedure, and (b) parameter c value during UPF iteration procedure.

It can be concluded from the iteration curve of parameter c in Fig. 7(b) that the value of c fluctuates around the true value with clearly decreasing amplitude, which is strong evidence that ending the iteration in the way described in the last section is reasonable. A similar prediction result for the EOL point is shown in Fig. 8, with little difference in particle distribution from Fig. 6.

FIGURE 8. (a) The EOL point calculated by the proposed method with different initial parameters, and (b) calculated EOL point distribution of particles after the 491st iteration with different initial parameters.

However, a slowed approach of the predicted EOL point toward its true value should not be taken as the signal to end the iteration, because a very sharp decline appears in the later part of a curve with capacity diving. Instead, it is well founded that the covariance matrix of the transition equation indicates that the iteration procedure is near its end through a gradually slowing variation of the algorithm parameters. That is to say, even if some of the parameters have reached their true values at a certain moment, the calculated EOL point may still differ from the true one, and vice versa. The same deduction can be drawn from the iteration curves of the parameters to be verified in Fig. 9.

FIGURE 9. (a) Value of parameter c during UPF iteration procedure, (b) value of parameter b during UPF iteration procedure, and (c) predicted curves starting at the corresponding moment during the iteration procedure of UPF with initial parameters farther from the true fitting values of 35°C1C1C.

D. Verification of Prediction Accuracy With Poor Initial Data Quality

The 809th cycle is selected as the end-of-prediction point, judged from Fig. 9(b), because the iteration curve of parameter b clearly approaches the true value with decreasing amplitude, which makes it unnecessary to train on more subsequent points. The 970th cycle is calculated as the true EOL point, and the prediction made at the 809th cycle gives the 965th cycle, with a percentage error of 0.515% and an absolute error of 5 cycles, as shown in Fig. 10.

FIGURE 10. (a) The EOL point calculated by the proposed method, and (b) calculated EOL point distribution of particles after the 809th iteration.

Although the activation effect of the batteries in the first cycles yields several initial training data of poor quality, which contain information not covered by the model (the orange circled part of the curve in Fig. 9), these data have little influence on the subsequent prediction. This shows the excellent local iteration characteristics of UPF, aided by the data quality enhanced through the proposed preprocessing method.

E. Verification of Fitting Prediction Accuracy in Early Curves

The proposed model shows satisfactory performance on capacity diving degradation data sets. This leads to another important qualitative comparison, which validates the models on the early part of the capacity degradation curve. In other words, if the capacity did not plunge rapidly as seen in the later curve, the performance should be assessed by fitting a relatively flat capacity degradation curve using the proposed model, \begin{align*} H\left ({x }\right)=ae^{\frac {b}{x}}+ce^{dx}+v_{x},\quad x\in N^{+},~c>0,~ a,b,d< 0 \tag{49}\end{align*} and the double-exponential model with restricted value ranges, \begin{align*} H\left ({x }\right)=a_{1}e^{b_{1}x}+a_{2}e^{b_{2}x}+v_{x},\quad x\in N^{+},~a_{1}, b_{2}>0,~a_{2},b_{1}< 0 \tag{50}\end{align*}

The restricted value ranges make the comparison feasible: to ensure the stability of the algorithm, the model parameters should not change sign during iteration, which prevents the shape of the model from shifting unpredictably after the assumed inflection point.

Verification is made using 1/3 of the total unaccelerated degradation data as the training data, up to the 161st cycle, and 1/2, up to the 242nd cycle, respectively, as shown in Fig. 11. It is evident that, when fitted with the early data only, the two feasible capacity diving degradation models show different subsequent accuracy. More training information leads to better early-part fitting performance for both equations, but the proposed shape in (49) fits qualitatively much better than the double exponential shape in (50).

FIGURE 11. Least square fitting results of 25°C1C1C data up to the inflection point: (a) 1/3 as fitting data, (b) 1/2 as fitting data.

F. Performance of Different Model and Methods

The fitting performance of the measurement models on the whole data has been compared in Table 2. Using the same training data under two conditions, the prediction performance of the proposed model with the improved UPF is compared in Table 5 with that of the conventional double exponential model with particle filter based methods. The proposed method yields a reduced estimation error and narrower confidence intervals, which indicates that the expected improvements are achieved.

TABLE 5. Performance of Different Model and Methods

SECTION V.

Conclusion

A systematic RUL prognostic method using an improved UPF has been proposed in this paper for Lithium-ion batteries with Li(NiMnCo)O2 cathode that show distinct capacity diving in later lifetime cycles.

  1. In pursuit of further accuracy of the model, the quality of the training data is enhanced by taking the capacity recovery phenomenon into account, as well as by using Savitzky-Golay filtering. With the training data preprocessed in this way after outlier elimination, a decrease in the fitting RSE of the training data can be attained.

  2. In order to use the empirical model in RUL prediction, an appropriate measurement equation is proposed that accounts for capacity diving in the later part of the capacity degradation curve. The validation results indicate that, in almost all of the cases, the proposed model achieves statistically better fitting performance and parameter sensitivity than the other models, including the popular double exponential model.

  3. An improved state transition model has been utilized in the UPF iteration, containing a sectioned sampling treatment based on inflection point checking and systematic noise in Gamma distribution. To find out when to end the iteration, prediction curves during the UPF iterations have been drawn, which clearly show the accelerated convergence tendency. Potential shape shifting of the prediction curve is prevented, which supports the robustness of the method.

  4. An overall flowchart of the proposed systematic framework is developed, with the steps summarized. Owing to the improved state space equations, the EOL point is predicted with enhanced accuracy using the same amount of training information, which is also true of the fitting prediction before the inflection point.

Owing to the restrictions of training data acquisition, the source of the training information discussed in this paper is limited to the cyclic capacity data only. As a supplement to the state space equations, incremental capacity analysis (ICA) of cyclic capacity-voltage data and other methods will be very helpful for verifying capacity diving in our future work. This work could also be extended by improving the analytical form of the empirical capacity degradation model or by combining the proposed UPF with data-driven methods to reduce the error.
