
Adaptive Nonnegative Sparse Representation for Hyperspectral Image Super-Resolution



Abstract:

As hyperspectral images (HSIs) usually have a low spatial resolution, HSI super-resolution has recently attracted more and more attention as a means to enhance the spatial resolution of HSIs. A common method is to fuse the low-resolution (LR) HSI with a multispectral image (MSI) whose spatial resolution is higher than that of the HSI. In this article, we propose a novel adaptive nonnegative sparse representation-based model to fuse an HSI and its corresponding MSI. First, based on linear spectral unmixing, the nonnegative structured sparse representation model estimates the sparse codes of the desired high-resolution HSI from both the LR-HSI and the MSI. Then, the adaptive sparse representation balances the relationship between sparsity and collaboration by generating a suitable coefficient. Finally, in order to obtain more accurate results, we alternately optimize the spectral basis and the coefficients rather than keeping the spectral basis fixed. The alternating direction method of multipliers is applied to solve the proposed optimization problem. Experimental results on both ground-based HSIs and real remote sensing HSIs show the superiority of our proposed approach over several state-of-the-art HSI super-resolution methods.
Topic: Super-resolution of Remotely Sensed Images
Page(s): 4267 - 4283
Date of Publication: 09 April 2021



CC BY: IEEE is not the copyright holder of this material. Reuse terms: https://creativecommons.org/licenses/by/4.0/
SECTION I.

Introduction

Hyperspectral (HS) imaging has attracted wide attention in recent years since it can simultaneously obtain images of the same scene across many successive wavelengths [1]–[3]. Because hyperspectral images (HSIs) carry rich spectral information, they have been widely used in many fields, such as target detection [4], environmental monitoring [5], military applications [6], and remote sensing [7]. However, since the incident energy in optical remote sensing systems is limited, imaging systems have to compromise between spectral resolution and spatial resolution [8]. For example, HSIs captured by HYPXIM usually have more than one hundred spectral bands but only a decametric spatial resolution. Compared with HS imaging sensors, multispectral (MS) imaging sensors can provide multispectral images (MSIs) with much higher spatial resolution but with a limited number of spectral bands. For example, PLEIADES can provide MSIs with a spatial resolution of 70 cm but with only three or four spectral bands. Researchers have made much effort to enhance the spatial resolution of HSIs. A popular approach to reconstructing the high spatial resolution HSI (HR-HSI) is to fuse the high spatial resolution MSI (HR-MSI) with the low spatial resolution HSI (LR-HSI) [9], [10]. This approach is called HSI–MSI fusion or HSI super-resolution.

The HSI super-resolution problem aims to reconstruct an HR-HSI by fusing the spectral information of an LR-HSI with the spatial information of an HR-MSI, as illustrated in Fig. 1. Note that the LR-HSI and the HR-MSI must depict the same scene. The target HSI should not only have a good visual effect but also preserve the authenticity of each pixel.

Fig. 1. HSI super-resolution problem.

A large number of studies have addressed HSI super-resolution. A special case of HSI super-resolution is pansharpening, which fuses an LR-HSI with its corresponding panchromatic (PAN) image [9], [11]. A variety of pansharpening methods have been proposed over the past two decades. Generally, these methods can be categorized into two classes, i.e., transform-based methods [12]–[14] and variational methods [15]–[17]. However, because PAN images contain little spectral information, the HR-HSIs reconstructed by these pansharpening methods usually suffer considerable spectral distortions.

As MSIs contain more spectral information than PAN images, HSI–MSI fusion, which can be seen as an extension of pansharpening, has drawn more attention in recent work. Yokoya et al. [10] presented a comparative review of several HSI–MSI fusion techniques. Typically, the HSI–MSI fusion methods can be divided into four categories: component substitution (CS), Bayesian, deep learning, and sparse representation.

In the CS-based approaches, the basic idea is to substitute one component of the HSI with the high-resolution (HR) image. For example, the intensity-hue-saturation (IHS) method [18], [19] replaces the intensity component of the LR image in the IHS domain with the PAN image. The principal component analysis method [20] uses the HR image to replace the first principal component of the LR-HSI. However, the CS-based approaches usually introduce spectral distortions into the obtained HR-HSI.

The Bayesian-based approaches introduce an appropriate prior distribution of the HR-MSI, such as a naive Gaussian prior [21], [22] or a sparsity-promoting prior [23], [24], to achieve an accurate estimation. The variational methods can be regarded as a special case of the Bayesian ones: the target image is estimated by minimizing an objective function constructed from the posterior probability density of the fused image. Among these methods, HS super-resolution [25] uses a vector-total-variation-based regularization in the objective function. Zhang et al. [26], [27] introduced a method that works in the wavelet domain and later published an expectation–maximization algorithm to maximize the posterior distribution.

Since deep learning has been demonstrated to be very effective in object detection [28]–[30], classification [31]–[33], and natural image super-resolution [34]–[36], many researchers have introduced deep learning into HSI super-resolution. Li et al. [37] proposed to learn an end-to-end spectral difference mapping between the LR-HSI and the HR-HSI through a deep spectral difference convolutional neural network. Yuan et al. [38] proposed a multiscale and multidepth convolutional neural network to obtain the HR-HSI. In order to take advantage of the spectral correlation and exploit the HR-MSI, Yang et al. [39] presented a convolutional neural network with two branches, which extracts the spectral features of each pixel and of its spatial neighborhood from the LR-HSI and the HR-MSI, respectively. Dian et al. [40] proposed to learn the spectral prior of HSIs via deep residual convolutional neural networks. In addition to convolutional neural networks, a stacked sparse denoising autoencoder-based deep neural network was proposed by Huang et al. [41] for pansharpening. Although the deep learning based methods obtain great reconstruction results, they need large amounts of training samples to estimate their parameters.

In the past years, sparse representation has been widely used in remote sensing applications [42]. The sparse representation-based HSI super-resolution methods usually represent the target HR-HSI as the product of a spectral basis matrix and a coefficient matrix, where the spectral basis and coefficient matrices can be extracted from the LR-HSI and the HR-MSI. Besides, some matrix factorization and unmixing-based methods can also be regarded as sparse representation-based methods because the source images are decomposed into spectral bases and coefficients. In practice, the sparse representation-based methods are usually combined with matrix factorization and spectral unmixing. Based on unsupervised spectral unmixing, Yokoya et al. [43] proposed a coupled nonnegative matrix factorization (CNMF) approach to estimate the HSI endmember matrix and the HR abundance matrix. However, the nonnegative matrix factorization is usually not unique, so CNMF cannot always obtain satisfactory results. Huang et al. [44] used the k-singular value decomposition (K-SVD) algorithm [45] to learn the spectral basis and proposed a sparse prior-based matrix factorization method to fuse remote sensing MSIs at different spatial and spectral resolutions. Zhang et al. [46] used group spectral embedding and low-rank factorization to fuse the LR-HSI and the HR-MSI. Lanaras et al. [47] proposed to jointly solve the spectral unmixing problems for both input images. However, a spectral dictionary alone is insufficient for preserving spatial information, and vice versa. To address this problem, an HSI–MSI fusion method using optimized twin dictionaries (OTD) was proposed by Han et al. [48]. Since pixelwise sparse representation neglects the similarity among neighboring pixels, Akhtar et al. [49] proposed to utilize the similarities among the spectral pixels in the same local patch and obtained the coefficients of each local patch with a generalization of the simultaneous orthogonal matching pursuit (G-SOMP+) algorithm. Later, Akhtar et al. [50] proposed a Bayesian dictionary learning and Bayesian sparse coding approach for HSI super-resolution and achieved improved performance. Note that the structures of an MSI are usually very complex, and thus a fixed local window may still contain different variations. Combining superpixel segmentation with sparse coding, Fang et al. [51] proposed a superpixel-based sparse representation (SSR) model, in which the shape and size of each superpixel adapt to the spatial structures of the MSI, so that the spectral pixels in each superpixel share similar spatial structures. Furthermore, Dong et al. [52] proposed a nonnegative structured sparse representation (NSSR) method, which exploits a clustering-based structured sparse coding approach to enforce spatial correlation among the obtained sparse coefficients.

Sparse representation-based approaches are indeed effective for HSI super-resolution and achieve great reconstruction results. However, the existing methods usually use an {l_1}-norm to constrain the representation coefficients, and thus only the sparsity is taken into consideration. Sometimes the {l_1}-norm constraint is not reasonable, because the representation coefficients exhibit not only sparsity but also some correlation. The other extreme, which uses only an {l_2}-norm to constrain the representation coefficients, considers only the correlated information. Therefore, a more reasonable choice is to take the sparsity and the correlation into consideration simultaneously. Inspired by the trace least absolute shrinkage and selection operator (LASSO) [53], [54], we propose a novel spatial–spectral adaptive nonnegative sparse representation (ANSR) method for HSI super-resolution by fusing the LR-HSI and the corresponding HR-MSI. The proposed method integrates sparsity and correlation effectively as a regularization term in the model and can adaptively produce more suitable coefficients with a constraint between the {l_1}-norm and the {l_2}-norm. Specifically, the estimation of the HR-HSI is formulated as a joint estimation of the spectral basis and the sparse coefficients from the LR-HSI and the HR-MSI, with the prior knowledge of spatial–spectral sparsity and spectral unmixing. According to the spectral mixture model [55], the spectral basis and the sparse coefficients need to be nonnegative, and the sparse coefficients often satisfy the sum-to-one constraint. Besides, based on the trace LASSO, we utilize the adaptive sparse representation (ASR) to balance the sparsity and the correlation and thereby obtain more precise sparse coefficients. Furthermore, we design an alternating optimization algorithm to update the spectral basis and the sparse coefficients, which is more flexible and accurate than keeping the spectral basis fixed. Meanwhile, the alternating direction method of multipliers (ADMM) is adopted to solve both the update of the spectral basis and that of the coefficients.

The main contributions of this article can be summarized as follows.

  1. We introduce the ASR, which can obtain more precise sparse coefficients by balancing the sparsity and correlation of the coefficients, into the HSI super-resolution model.

  2. Instead of keeping the spectral basis fixed, we alternately optimize the spectral basis and sparse coefficients.

  3. We design two specific ADMM methods to update the spectral basis and sparse coefficients, respectively.

  4. Experimental results on both ground-based HSIs and real remote sensing HSIs show that our ANSR method performs better than some other state-of-the-art HSI super-resolution methods.

The remainder of this article is organized as follows. We briefly introduce the spectral dictionary learning method in [52] and the ASR in Section II. In Section III, we first formulate the problem of HSI super-resolution and then describe the details of the proposed ANSR method for HSI super-resolution. Extensive experiments and comparisons are shown in Section IV. Finally, Section V concludes this article.

SECTION II.

Related Work

In this section, we introduce the spectral dictionary learning method in [52] and the ASR, which are used in our method.

A. Spectral Dictionary Learning

We denote the LR-HSI as \boldsymbol{X} \in \mathbb{R}^{B \times n}, where B represents the spectral dimension and n represents the number of pixels. As each pixel in the LR-HSI \boldsymbol{X} can be written as the linear combination of a small number of spectral pixels, we can express \boldsymbol{X} as the product of a spectral dictionary \boldsymbol{D} and a coefficient matrix \boldsymbol{B}:
\begin{equation*} \boldsymbol{X} = \boldsymbol{D}\boldsymbol{B} + \boldsymbol{V} \tag{1} \end{equation*}
where \boldsymbol{V} denotes the approximation error matrix, which is assumed to be additive Gaussian.

In (1), both \boldsymbol{D} and \boldsymbol{B} are unknown. Generally, (1) admits infinitely many decompositions, and a unique one cannot be determined. Fortunately, with the help of the sparsity assumption, we can solve for \boldsymbol{D} and \boldsymbol{B} using sparse nonnegative matrix decomposition. Therefore, the spectral dictionary \boldsymbol{D} can be estimated by solving the following sparse nonnegative matrix decomposition problem:
\begin{align*} \left({\boldsymbol{D}, \boldsymbol{B}} \right) =& \arg\min_{\boldsymbol{D},\boldsymbol{B}} \frac{1}{2}\|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{B}\|_F^2 + \lambda \|\boldsymbol{B}\|_1 \\ & \mathrm{s.t.}\ \boldsymbol{B} \geq 0,\ \boldsymbol{D} \geq 0. \tag{2} \end{align*}

Because the sparse coefficient matrix \boldsymbol{B} and the spectral dictionary \boldsymbol{D} are constrained to be nonnegative, existing dictionary learning algorithms (e.g., the K-SVD algorithm and the online dictionary learning algorithm) are not directly applicable. In order to solve the above sparse nonnegative matrix decomposition problem, a computationally efficient nonnegative dictionary learning algorithm, which solves (2) by updating \boldsymbol{D} and \boldsymbol{B} alternately, is proposed in [52].

With \boldsymbol{D} fixed, the subproblem with respect to \boldsymbol{B} becomes
\begin{equation*} \boldsymbol{B} = \arg\min_{\boldsymbol{B}} \frac{1}{2}\|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{B}\|_F^2 + \lambda \|\boldsymbol{B}\|_1,\ \mathrm{s.t.}\ \boldsymbol{B} \geq 0 \tag{3} \end{equation*}
which can be efficiently solved by the ADMM technique. To apply ADMM, we introduce \boldsymbol{S} = \boldsymbol{B}, and (3) can be reformulated as the following augmented Lagrangian function:
\begin{align*} L\left({\boldsymbol{B}, \boldsymbol{S}, \boldsymbol{U}} \right) =& \frac{1}{2}\|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{B}\|_F^2 + \lambda \|\boldsymbol{B}\|_1 + \mu \left\|\boldsymbol{S} - \boldsymbol{B} + \frac{\boldsymbol{U}}{2\mu}\right\|_F^2 \\ & \mathrm{s.t.}\ \boldsymbol{B} \geq 0 \tag{4} \end{align*}
where \boldsymbol{U} is the Lagrangian multiplier and \mu \geq 0. Then, we solve for \boldsymbol{B}, \boldsymbol{S}, and \boldsymbol{U} alternately until convergence.

With \boldsymbol{B} fixed, the subproblem with respect to \boldsymbol{D} becomes
\begin{equation*} \boldsymbol{D} = \arg\min_{\boldsymbol{D}} \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{B}\|_F^2,\ \mathrm{s.t.}\ \boldsymbol{D} \geq 0. \tag{5} \end{equation*}

Similar to the online dictionary learning method, (5) is solved by using block coordinate descent. During each iteration, one column of \boldsymbol{D} is updated while keeping the others fixed under the nonnegative constraint.
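For concreteness, here is a minimal NumPy sketch of this alternating scheme: the \boldsymbol{B}-subproblem (3) is solved by ADMM with a splitting in the spirit of (4), and the \boldsymbol{D}-subproblem (5) by projected block coordinate descent. The particular split (data term on the auxiliary variable), step sizes, and initialization are illustrative assumptions, not the exact implementation of [52].

```python
import numpy as np

def learn_spectral_dictionary(X, K=40, lam=1e-3, mu=1e-2,
                              n_outer=20, n_admm=50):
    """Sketch of the sparse nonnegative decomposition (2):
    min_{D,B >= 0} 0.5*||X - D B||_F^2 + lam*||B||_1,
    alternating an ADMM B-step, cf. (3)-(4), and a projected
    block-coordinate D-step, cf. (5)."""
    B_dim, n = X.shape
    rng = np.random.default_rng(0)
    D = np.abs(rng.standard_normal((B_dim, K)))   # nonnegative init
    B = np.zeros((K, n))
    for _ in range(n_outer):
        # --- B-step: ADMM with splitting S = B (data term on S) ---
        S, U = B.copy(), np.zeros_like(B)
        DtD, DtX = D.T @ D, D.T @ X
        for _ in range(n_admm):
            # S-update: only quadratic terms, so a closed form
            S = np.linalg.solve(DtD + 2 * mu * np.eye(K),
                                DtX + 2 * mu * B - U)
            # B-update: l1 prox followed by nonnegativity projection
            B = np.maximum(S + U / (2 * mu) - lam / (2 * mu), 0)
            U = U + 2 * mu * (S - B)              # dual ascent
        # --- D-step: one pass of projected block coordinate descent ---
        BBt, XBt = B @ B.T, X @ B.T
        for k in range(K):
            if BBt[k, k] > 1e-12:
                D[:, k] = np.maximum(
                    D[:, k] + (XBt[:, k] - D @ BBt[:, k]) / BBt[k, k], 0)
    return D, B
```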

More information about the spectral dictionary learning method can be found in [52].

B. Adaptive Sparse Representation

As we all know, the goal of sparse representation is to encode a signal vector as a linear combination of a few dictionary atoms. Suppose that \boldsymbol{x} \in \mathbb{R}^m is an input signal vector and \boldsymbol{D}_{\boldsymbol{s}} \in \mathbb{R}^{m \times K}\ ({m \ll K}) is a dictionary. With the {l_0}-norm as the regularization term, the sparse representation model of \boldsymbol{x} takes the form
\begin{equation*} \min_{\boldsymbol{\alpha}} \|\boldsymbol{x} - \boldsymbol{D}_{\boldsymbol{s}}\boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_0 \tag{6} \end{equation*}
where \boldsymbol{\alpha} is the sparse coefficient of \boldsymbol{x} and \lambda is a regularization parameter.

However, the {l_0}-minimization problem is NP-hard. Usually, the {l_1}-norm, a convex surrogate of the {l_0}-norm, is chosen to replace it. Then, the sparse optimization model takes the form
\begin{equation*} \min_{\boldsymbol{\alpha}} \|\boldsymbol{x} - \boldsymbol{D}_{\boldsymbol{s}}\boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_1. \tag{7} \end{equation*}

Problem (7) can be solved by quadratic programming techniques, including basis pursuit [56] and LASSO [57].

Although the {l_1}-regularization makes full use of the sparsity of signals, it completely ignores their correlated information. Timofte et al. [58] proposed collaborative representation, which replaces the {l_1}-norm of the sparse representation model by the {l_2}-norm. The collaborative representation model takes the form
\begin{equation*} \min_{\boldsymbol{\alpha}} \|\boldsymbol{x} - \boldsymbol{D}_{\boldsymbol{s}}\boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_2^2. \tag{8} \end{equation*}
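Unlike (7), the {l_2}-regularized problem (8) has a closed-form (ridge-regression) solution, which is one reason collaborative representation is fast. A minimal NumPy sketch with an illustrative regularization value:

```python
import numpy as np

def collaborative_representation(x, Ds, lam=1e-2):
    """Closed-form solution of (8):
    alpha = (Ds^T Ds + lam * I)^{-1} Ds^T x."""
    K = Ds.shape[1]
    return np.linalg.solve(Ds.T @ Ds + lam * np.eye(K), Ds.T @ x)
```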

In contrast, the collaborative representation model only takes the correlation into consideration and completely ignores the sparsity. In fact, the best choice is to balance the sparsity and the correlation, i.e., to make a compromise between the {l_1}-norm and the {l_2}-norm. To this end, Zhao et al. [59] proposed a natural image super-resolution method based on the following property of the trace LASSO [60], [61]:
\begin{equation*} \|\boldsymbol{\alpha}\|_2 \leq \|\boldsymbol{D}_{\boldsymbol{s}}{\rm Diag}\left({\boldsymbol{\alpha}} \right)\|_* \leq \|\boldsymbol{\alpha}\|_1 \tag{9} \end{equation*}
where \| \cdot \|_* represents the nuclear norm, which is the sum of the singular values of a matrix, and {\rm Diag}({\boldsymbol{\alpha}}) is the diagonal matrix whose diagonal elements are the corresponding entries of the vector \boldsymbol{\alpha}. When the columns of the basis \boldsymbol{D}_{\boldsymbol{s}} are almost uncorrelated, \|\boldsymbol{D}_{\boldsymbol{s}}{\rm Diag}({\boldsymbol{\alpha}})\|_* is close to \|\boldsymbol{\alpha}\|_1; conversely, when they are highly correlated, it is close to \|\boldsymbol{\alpha}\|_2. In practice, the column vectors of a basis are neither too correlated nor too independent, so the trace LASSO adaptively makes a compromise between the {l_1}-norm and the {l_2}-norm. We call this ASR. The ASR model, which can find a more suitable sparse coefficient, takes the form
\begin{equation*} \min_{\boldsymbol{\alpha}} \|\boldsymbol{x} - \boldsymbol{D}_{\boldsymbol{s}}\boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{D}_{\boldsymbol{s}}{\rm Diag}\left({\boldsymbol{\alpha}} \right)\|_*. \tag{10} \end{equation*}
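The two extremes of (9) are easy to verify numerically. The following short NumPy check (toy data of our own construction) confirms that the trace LASSO equals the {l_1}-norm for orthonormal columns and the {l_2}-norm for identical unit-norm columns:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = rng.standard_normal(5)

def trace_lasso(Ds, alpha):
    """||Ds Diag(alpha)||_*: nuclear norm (sum of singular values)."""
    return np.linalg.norm(Ds @ np.diag(alpha), ord='nuc')

# Orthonormal (uncorrelated) columns: value equals ||alpha||_1.
print(trace_lasso(np.eye(5), alpha), np.linalg.norm(alpha, 1))

# Identical unit-norm (fully correlated) columns: value equals ||alpha||_2.
d = rng.standard_normal(5)
d /= np.linalg.norm(d)
print(trace_lasso(np.tile(d[:, None], (1, 5)), alpha),
      np.linalg.norm(alpha, 2))
```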

Based on the convexity of the model, this problem can be solved by several efficient methods; among them, ADMM [62], [63] is widely used to find an approximate optimal solution.

SECTION III.

Proposed ANSR Method

In this section, we first give a general introduction to the HSI super-resolution problem, including the linear spectral mixture model. Then, we introduce our super-resolution model in detail. Finally, we describe the alternating optimization method thoroughly, including the optimization of the coefficients and of the spectral basis.

In this article, bold lowercase letters stand for vectors, bold uppercase letters stand for matrices, and plain lowercase letters stand for scalars.

A. Problem Formulation

The HSI super-resolution aims to recover an HR-HSI \boldsymbol{Z} \in \mathbb{R}^{B \times N} from an LR-HSI \boldsymbol{X} \in \mathbb{R}^{B \times n} and an HR-MSI \boldsymbol{Y} \in \mathbb{R}^{b \times N} of the same scene, where N = W \times H and n = w \times h (w \ll W, h \ll H) denote the numbers of pixels in the HR-HSI \boldsymbol{Z} and the LR-HSI \boldsymbol{X}, respectively, and B and b (b \ll B) indicate the spectral dimensions of \boldsymbol{X} and \boldsymbol{Y}, respectively.

In the linear spectral mixture model, each spectral vector \boldsymbol{z}_{\boldsymbol{i}} \in \mathbb{R}^B of the target image \boldsymbol{Z} can be represented by a linear combination of several spectral signatures [43], as shown in Fig. 2. Mathematically, we have
\begin{equation*} \boldsymbol{z}_{\boldsymbol{i}} = \boldsymbol{D}\boldsymbol{\alpha}_{\boldsymbol{i}} \tag{11} \end{equation*}
where \boldsymbol{D} \in \mathbb{R}_+^{B \times K} is the spectral basis with K atoms and \boldsymbol{\alpha}_{\boldsymbol{i}} \in \mathbb{R}^K is the corresponding coefficient. Each column of \boldsymbol{D} denotes a spectral vector of an underlying material in the scene. Considering all pixels of the HSI, (11) can be rewritten as
\begin{equation*} \boldsymbol{Z} = \boldsymbol{D}\boldsymbol{A} \tag{12} \end{equation*}
where \boldsymbol{Z} = [\boldsymbol{z}_{\boldsymbol{1}}, \boldsymbol{z}_{\boldsymbol{2}}, \ldots, \boldsymbol{z}_{\boldsymbol{N}}] and \boldsymbol{A} = [\boldsymbol{\alpha}_{\boldsymbol{1}}, \boldsymbol{\alpha}_{\boldsymbol{2}}, \ldots, \boldsymbol{\alpha}_{\boldsymbol{N}}] \in \mathbb{R}^{K \times N}.

Fig. 2. Linear spectral mixture model.

Furthermore, both \boldsymbol{X} and \boldsymbol{Y} can be regarded as linear degradations of the target HSI \boldsymbol{Z}. The LR-HSI \boldsymbol{X} can be formulated as the linear spatial degradation of \boldsymbol{Z}:
\begin{equation*} \boldsymbol{X} = \boldsymbol{Z}\boldsymbol{H} \tag{13} \end{equation*}
where \boldsymbol{H} \in \mathbb{R}^{N \times n} represents the spatial degradation operator, including blurring and downsampling.

The HR-MSI \boldsymbol{Y} can be formulated as the linear spectral degradation of \boldsymbol{Z}:
\begin{equation*} \boldsymbol{Y} = \boldsymbol{P}\boldsymbol{Z} \tag{14} \end{equation*}
where \boldsymbol{P} \in \mathbb{R}^{b \times B} represents the spectral downsampling matrix, i.e., the spectral response of the MS sensor.

By combining the linear mixture model (12) with the forward models (13) and (14), we have
\begin{align*} \boldsymbol{X} =& \boldsymbol{Z}\boldsymbol{H} = \boldsymbol{D}\boldsymbol{A}\boldsymbol{H} = \boldsymbol{D}\boldsymbol{B} \tag{15}\\ \boldsymbol{Y} =& \boldsymbol{P}\boldsymbol{Z} = \boldsymbol{P}\boldsymbol{D}\boldsymbol{A} \tag{16} \end{align*}
where \boldsymbol{B} = \boldsymbol{A}\boldsymbol{H} \in \mathbb{R}^{K \times n} is a coefficient matrix with each column being a sparse vector.
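As a shape-level illustration of the observation models (12)–(16), the following NumPy sketch builds a toy HR-HSI and degrades it spatially and spectrally. All sizes, the random basis, and the stand-in spectral response are illustrative assumptions:

```python
import numpy as np

# Illustrative sizes (our assumptions): B=31 HS bands, b=3 MS bands,
# K=40 atoms, a 64 x 64 HR image, and scaling factor s=8.
B, b, K, W, s = 31, 3, 40, 64, 8
N, n = W * W, (W // s) * (W // s)

rng = np.random.default_rng(0)
D = np.abs(rng.standard_normal((B, K)))   # spectral basis, eq. (12)
A = np.abs(rng.standard_normal((K, N)))   # coefficients, eq. (12)
Z = D @ A                                 # HR-HSI, eq. (12)

# H averages disjoint s x s blocks: an N x n matrix with 1/s^2 entries.
H = np.zeros((N, n))
for j in range(n):
    r, c = divmod(j, W // s)
    for dr in range(s):
        for dc in range(s):
            H[(r * s + dr) * W + (c * s + dc), j] = 1.0 / s ** 2

P = np.abs(rng.standard_normal((b, B)))   # stand-in spectral response
P /= P.sum(axis=1, keepdims=True)

X = Z @ H   # LR-HSI, eqs. (13)/(15): shape (B, n)
Y = P @ Z   # HR-MSI, eqs. (14)/(16): shape (b, N)
print(X.shape, Y.shape)   # (31, 64) (3, 4096)
```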

According to the linear spectral mixture model (12), the HSI super-resolution problem can be transformed into the estimation of the spectral basis \boldsymbol{D} and the representation coefficients \boldsymbol{A}, both of which can be estimated from the LR-HSI \boldsymbol{X} and the HR-MSI \boldsymbol{Y} via (15) and (16), as elaborated in Section III-B.

B. Establishment of Our Model

As mentioned above, the HSI super-resolution problem can be transformed into the estimation of the spectral basis \boldsymbol{D} and the representation coefficients \boldsymbol{A}. According to (15) and (16), we can jointly estimate \boldsymbol{D} and \boldsymbol{A} from both the LR-HSI and the HR-MSI. In this way, the HSI super-resolution problem can be written as
\begin{equation*} \min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2. \tag{17} \end{equation*}

Obviously, the above optimization problem is ill-posed, and the solutions of \boldsymbol{D} and \boldsymbol{A} are not unique. Therefore, we need some prior knowledge to constrain the solution space. Some common and effective priors include sparsity prior, nonlocal spatial similarities, and nonnegative prior.

The sparsity prior is known to be very effective for the HSI super-resolution problem. With the sparsity constraint, we assume that each spectral pixel in the target HSI can be represented as a linear combination of a few distinct atoms of the spectral basis. Then, the HSI super-resolution problem can be written as
\begin{equation*} \min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta \|\boldsymbol{A}\|_1 \tag{18} \end{equation*}
where \| \cdot \|_1 stands for the sum of the absolute values of all elements of a matrix and \eta is a regularization parameter.

However, in (18), the sparse coefficients of each spectral pixel are estimated independently. It is generally known that a pixel of a typical HSI usually has a strong spatial correlation with its similar neighbors. In order to take advantage of the local and nonlocal similarities, we assume that a spectral pixel \boldsymbol{z}_{\boldsymbol{i}} in the target HSI can be approximately represented as a linear combination of the pixels that are similar to it. Combined with this nonlocal spatial similarity prior, (18) can be improved to
\begin{align*} & \min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta_2\|\boldsymbol{A}\|_1 \\ &\quad + \eta_1 \sum_{q = 1}^Q \sum_{i \in S_q} \|\boldsymbol{D}\boldsymbol{\alpha}_{\boldsymbol{i}} - \boldsymbol{\mu}_{\boldsymbol{q}}\|_2^2 \tag{19} \end{align*}
where \boldsymbol{\mu}_{\boldsymbol{q}} represents the qth cluster center, which can be seen as a linear combination of pixels similar to the reconstructed spectral pixel \boldsymbol{z}_{\boldsymbol{i}}. The center \boldsymbol{\mu}_{\boldsymbol{q}} of the qth cluster is computed as
\begin{equation*} \boldsymbol{\mu}_{\boldsymbol{q}} = \sum_{i \in S_q} \boldsymbol{\omega}_{\boldsymbol{i}}\left({\boldsymbol{D}\boldsymbol{\alpha}_{\boldsymbol{i}}} \right) \tag{20} \end{equation*}
where \boldsymbol{\omega}_{\boldsymbol{i}} denotes the weighting coefficient based on the similarity of the target HSI pixels. Since the HSI is unknown, we use the HR-MSI, which has the same spatial information as the target HSI, to compute \boldsymbol{\omega}_{\boldsymbol{i}}. The weighting coefficient \boldsymbol{\omega}_{\boldsymbol{i}} is computed as
\begin{equation*} \boldsymbol{\omega}_{\boldsymbol{i}} = \frac{1}{c}\exp\left({\frac{- \|\boldsymbol{y}_{\boldsymbol{i}} - \boldsymbol{y}_{\boldsymbol{q}}\|_2^2}{h}} \right) \tag{21} \end{equation*}
where c represents the normalization constant, and \boldsymbol{y}_{\boldsymbol{i}} and \boldsymbol{y}_{\boldsymbol{q}} represent two pixels of the HR-MSI. In practice, the vector \boldsymbol{\alpha}_{\boldsymbol{i}} is not known, so we cannot compute \boldsymbol{\mu}_{\boldsymbol{q}} directly from (20). To overcome this difficulty, we iteratively estimate \boldsymbol{\mu}_{\boldsymbol{q}} from the current update of \boldsymbol{\alpha}_{\boldsymbol{i}} (see the sketch after this paragraph). With the estimated \boldsymbol{\mu}_{\boldsymbol{q}} and taking the whole image into consideration, we can rewrite (19) as
\begin{align*} & \min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta_2\|\boldsymbol{A}\|_1 \\ & \quad + \eta_1\|\boldsymbol{D}\boldsymbol{A} - \boldsymbol{U}\|_F^2 \tag{22} \end{align*}
where \boldsymbol{U} = [\boldsymbol{\mu}_{\boldsymbol{1}}, \boldsymbol{\mu}_{\boldsymbol{2}}, \ldots, \boldsymbol{\mu}_{\boldsymbol{N}}].
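A minimal NumPy sketch of this cluster-center computation, assuming the pixel clusters have already been obtained (e.g., by k-means on the HR-MSI). The function name, bandwidth h, and the choice of the reference pixel are illustrative assumptions:

```python
import numpy as np

def cluster_center(D, A, Y, idx, h=0.5, ref=None):
    """Sketch of (20)-(21) for one cluster S_q. idx: indices of the pixels
    in the cluster (e.g., from k-means on the HR-MSI); ref: index of the
    reference pixel y_q (assumed to be the first member if not given)."""
    if ref is None:
        ref = idx[0]
    diffs = Y[:, idx] - Y[:, [ref]]               # y_i - y_q for i in S_q
    w = np.exp(-np.sum(diffs ** 2, axis=0) / h)   # eq. (21), unnormalized
    w /= w.sum()                                  # the 1/c normalization
    return (D @ A[:, idx]) @ w                    # eq. (20)
```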

Besides, considering the physical characteristics of HSIs, the pixels of an HSI should be nonnegative. With this nonnegative prior, we can improve (22) to
\begin{align*} & \min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta_2\|\boldsymbol{A}\|_1 \\ & \quad + \eta_1\|\boldsymbol{D}\boldsymbol{A} - \boldsymbol{U}\|_F^2,\ \mathrm{s.t.}\ \boldsymbol{A} \geq 0,\ 0 \leq \boldsymbol{D} \leq 1. \tag{23} \end{align*}

Furthermore, in order to balance the sparsity and the correlation, we propose to use the trace LASSO instead of the {l_1}-norm to constrain the coefficients. The trace LASSO adaptively makes a compromise between the {l_1}-norm and the {l_2}-norm, which we call ASR. Finally, the HSI super-resolution problem can be written as
\begin{align*} &\min_{\boldsymbol{D}, \boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta_1\|\boldsymbol{D}\boldsymbol{A} - \boldsymbol{U}\|_F^2 \\ & \quad + \eta_2 \sum_{i = 1}^N \|\boldsymbol{P}\boldsymbol{D}{\rm Diag}\left({\boldsymbol{\alpha}_{\boldsymbol{i}}} \right)\|_*\\ &\mathrm{s.t.}\ \boldsymbol{A} \geq 0,\ 0 \leq \boldsymbol{D} \leq 1 \tag{24} \end{align*}
where \boldsymbol{\alpha}_{\boldsymbol{i}} represents the ith column of the coefficient matrix \boldsymbol{A}.

Once we have solved \boldsymbol{D} and \boldsymbol{A}, the target HR-HSI can be obtained by multiplying \boldsymbol{D} by \boldsymbol{A}.

C. Alternating Optimization of the Fusion Problem

It is obvious that (24) is highly nonconvex. However, problem (24) is convex with respect to \boldsymbol{D} and to \boldsymbol{A} individually. Therefore, we propose to alternately optimize \boldsymbol{D} and \boldsymbol{A}, each with the other fixed. First, we initialize the spectral basis \boldsymbol{D} using the spectral dictionary learning method in [52]. Then, \boldsymbol{D} and \boldsymbol{A} are updated alternately via (24): we update \boldsymbol{A} with \boldsymbol{D} fixed, and then update \boldsymbol{D} with \boldsymbol{A} fixed. These two steps are iterated until convergence. Finally, the target HR-HSI is obtained through (12). The overall algorithm for the HSI super-resolution problem is summarized in Algorithm 1, and a code skeleton follows the algorithm. To show the operation process of our method more intuitively, the flowchart of the proposed HSI super-resolution method is illustrated in Fig. 3.

Algorithm 1: ANSR-Based HSI Super-Resolution.

1: Input: LR-HSI \boldsymbol{X}; HR-MSI \boldsymbol{Y}; spatial degradation operator \boldsymbol{H}; spectral transform matrix \boldsymbol{P}; and regularization parameters \eta_1 and \eta_2.
2: Initialize the spectral basis \boldsymbol{D}.
3: While not converged do
4:   Update coefficient matrix \boldsymbol{A} with \boldsymbol{D} fixed.
5:   Update spectral basis \boldsymbol{D} with \boldsymbol{A} fixed.
6: End while
7: Compute the desired HR-HSI \boldsymbol{Z} via (12).
8: Output: HR-HSI \boldsymbol{Z}.
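A minimal Python skeleton of Algorithm 1, with the subroutines passed in as callables. The names init_dictionary, update_A, and update_D, their signatures, and the stopping rule are our own placeholders for the procedures described in the text (the Section II-A method of [52], Algorithm 2, and Algorithm 3, respectively):

```python
import numpy as np

def ansr_super_resolution(X, Y, H, P, eta1, eta2,
                          init_dictionary, update_A, update_D,
                          K=40, n_iter=10, tol=1e-4):
    """Skeleton of Algorithm 1 (names and signatures are illustrative)."""
    D = init_dictionary(X, K)                      # step 2
    Z_prev = None
    for _ in range(n_iter):                        # steps 3-6
        A = update_A(X, Y, D, H, P, eta1, eta2)    # step 4 (Algorithm 2)
        D = update_D(X, Y, A, H, P)                # step 5 (Algorithm 3)
        Z = D @ A                                  # eq. (12)
        # illustrative stopping rule: relative change of the reconstruction
        if Z_prev is not None and \
           np.linalg.norm(Z - Z_prev) <= tol * np.linalg.norm(Z_prev):
            break
        Z_prev = Z
    return Z                                       # steps 7-8
```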

Fig. 3. Flowchart of the proposed HSI super-resolution method.

D. Optimization of the Coefficients With the Spectral Basis Fixed

In this procedure, we fix the spectral basis \boldsymbol{D}. Then, the update of the coefficient matrix \boldsymbol{A} can be written as
\begin{align*} & \min_{\boldsymbol{A}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2 + \eta_1\|\boldsymbol{D}\boldsymbol{A} - \boldsymbol{U}\|_F^2 \\ &\quad + \eta_2 \sum_{i = 1}^N \|\boldsymbol{P}\boldsymbol{D}{\rm Diag}\left({\boldsymbol{\alpha}_{\boldsymbol{i}}} \right)\|_*,\ \mathrm{s.t.}\ \boldsymbol{A} \geq 0. \tag{25} \end{align*}

Obviously, the optimization problem (25) is convex and can be efficiently solved by ADMM, which decomposes the complex optimization problem into several easily solved subproblems. Specifically, we introduce \boldsymbol{S} = \boldsymbol{A}, \boldsymbol{Q}_{\boldsymbol{i}} = \boldsymbol{P}\boldsymbol{D}{\rm Diag}({\boldsymbol{\alpha}_{\boldsymbol{i}}}), and \boldsymbol{Z} = \boldsymbol{D}\boldsymbol{A}, and obtain the following augmented Lagrangian function:
\begin{align*} & L\left({\boldsymbol{A}, \boldsymbol{S}, \boldsymbol{Q}, \boldsymbol{Z}, \boldsymbol{V}_{\boldsymbol{1}}, \boldsymbol{V}_{\boldsymbol{2}}, \boldsymbol{V}_{\boldsymbol{3}}} \right) \\ &\quad = \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{S}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{Z}\boldsymbol{H}\|_F^2 \\ &\quad\quad + \eta_1\|\boldsymbol{D}\boldsymbol{S} - \boldsymbol{U}\|_F^2 + \eta_2 \sum_{i = 1}^N \|\boldsymbol{Q}_{\boldsymbol{i}}\|_* + \mu \left\|\boldsymbol{D}\boldsymbol{S} - \boldsymbol{Z} + \frac{\boldsymbol{V}_{\boldsymbol{1}}}{2\mu}\right\|_F^2 \\ &\quad\quad + \mu \left\|\boldsymbol{S} - \boldsymbol{A} + \frac{\boldsymbol{V}_{\boldsymbol{2}}}{2\mu}\right\|_F^2 + \mu \sum_{i = 1}^N \left\|\boldsymbol{Q}_{\boldsymbol{i}} - \boldsymbol{P}\boldsymbol{D}{\rm Diag}\left({\boldsymbol{\alpha}_{\boldsymbol{i}}} \right) + \frac{\boldsymbol{V}_{\boldsymbol{3}}^{(\boldsymbol{i})}}{2\mu}\right\|_F^2 \\ &\quad\quad \mathrm{s.t.}\ \boldsymbol{A} \geq 0 \tag{26} \end{align*}
where \boldsymbol{V}_{\boldsymbol{1}}, \boldsymbol{V}_{\boldsymbol{2}}, and \boldsymbol{V}_{\boldsymbol{3}} are the Lagrangian multipliers and \mu > 0.
Minimizing the augmented Lagrangian function (26) leads to the following iterations:
\begin{align*} \boldsymbol{S}^{(t + 1)} &= \arg\min_{\boldsymbol{S}} L\left({\boldsymbol{A}^{(t)}, \boldsymbol{S}, \boldsymbol{Q}^{(t)}, \boldsymbol{Z}^{(t)}, \boldsymbol{V}_1^{(t)}, \boldsymbol{V}_2^{(t)}, \boldsymbol{V}_3^{(t)}} \right)\\ \boldsymbol{Z}^{(t + 1)} &= \arg\min_{\boldsymbol{Z}} L\left({\boldsymbol{A}^{(t)}, \boldsymbol{S}^{(t)}, \boldsymbol{Q}^{(t)}, \boldsymbol{Z}, \boldsymbol{V}_1^{(t)}, \boldsymbol{V}_2^{(t)}, \boldsymbol{V}_3^{(t)}} \right)\\ \boldsymbol{Q}^{(t + 1)} &= \arg\min_{\boldsymbol{Q}} L\left({\boldsymbol{A}^{(t)}, \boldsymbol{S}^{(t)}, \boldsymbol{Q}, \boldsymbol{Z}^{(t)}, \boldsymbol{V}_1^{(t)}, \boldsymbol{V}_2^{(t)}, \boldsymbol{V}_3^{(t)}} \right)\\ \boldsymbol{A}^{(t + 1)} &= \arg\min_{\boldsymbol{A}} L\left({\boldsymbol{A}, \boldsymbol{S}^{(t)}, \boldsymbol{Q}^{(t)}, \boldsymbol{Z}^{(t)}, \boldsymbol{V}_1^{(t)}, \boldsymbol{V}_2^{(t)}, \boldsymbol{V}_3^{(t)}} \right). \tag{27} \end{align*}

Meanwhile, the Lagrangian multipliers are updated by
\begin{align*} \boldsymbol{V}_1^{(t + 1)} &= \boldsymbol{V}_1^{(t)} + \mu \left({\boldsymbol{D}\boldsymbol{S}^{(t + 1)} - \boldsymbol{Z}^{(t + 1)}} \right)\\ \boldsymbol{V}_2^{(t + 1)} &= \boldsymbol{V}_2^{(t)} + \mu \left({\boldsymbol{S}^{(t + 1)} - \boldsymbol{A}^{(t + 1)}} \right)\\ \boldsymbol{V}_3^{(i)(t + 1)} &= \boldsymbol{V}_3^{(i)(t)} + \mu \left({\boldsymbol{Q}_{\boldsymbol{i}}^{(t + 1)} - \boldsymbol{P}\boldsymbol{D}{\rm Diag}\left({\boldsymbol{\alpha}_{\boldsymbol{i}}^{(t + 1)}} \right)} \right). \tag{28} \end{align*}

All the subproblems in (27) can be solved analytically, i.e.,
\begin{align*} \boldsymbol{S} =& \left[ \left({\boldsymbol{P}\boldsymbol{D}} \right)^T\left({\boldsymbol{P}\boldsymbol{D}} \right) + \left({\eta_1 + \mu} \right)\boldsymbol{D}^T\boldsymbol{D} + \mu \boldsymbol{I} \right]^{-1} \\ &\left[ \left({\boldsymbol{P}\boldsymbol{D}} \right)^T\boldsymbol{Y} + \eta_1\boldsymbol{D}^T\boldsymbol{U} + \mu \boldsymbol{D}^T\left({\boldsymbol{Z} - \frac{\boldsymbol{V}_1}{2\mu}} \right) + \mu \left({\boldsymbol{A} - \frac{\boldsymbol{V}_2}{2\mu}} \right) \right]\\ \boldsymbol{Z} =& \left[ \boldsymbol{X}\boldsymbol{H}^T + \mu \left({\boldsymbol{D}\boldsymbol{S} + \frac{\boldsymbol{V}_1}{2\mu}} \right) \right]\left({\boldsymbol{H}\boldsymbol{H}^T + \mu \boldsymbol{I}} \right)^{-1}. \end{align*}
As for \boldsymbol{Q} and \boldsymbol{A}, we need to solve them pixel by pixel:
\begin{align*} \boldsymbol{Q}_{\boldsymbol{i}} =& \mathcal{J}_{\frac{\eta_2}{2\mu}}\left[ \boldsymbol{P}\boldsymbol{D}{\rm Diag}\left({\boldsymbol{\alpha}_{\boldsymbol{i}}} \right) - \frac{\boldsymbol{V}_3^{(\boldsymbol{i})}}{2\mu} \right]\\ \boldsymbol{\alpha}_{\boldsymbol{i}} =& \left\{ \left[ 2\mu \boldsymbol{I} + 2\mu {\rm Diag}\left({{\rm Diag}\left({\left({\boldsymbol{P}\boldsymbol{D}} \right)^T\left({\boldsymbol{P}\boldsymbol{D}} \right)} \right)} \right) \right]^{-1} \right.\\ &\left[ 2\mu \boldsymbol{s}_{\boldsymbol{i}} + \boldsymbol{V}_2^{(\boldsymbol{i})} + {\rm Diag}\left({\left({\boldsymbol{P}\boldsymbol{D}} \right)^T\boldsymbol{V}_3^{(\boldsymbol{i})}} \right) \right.\\ &\left. \left. + 2\mu {\rm Diag}\left({\left({\boldsymbol{P}\boldsymbol{D}} \right)^T\boldsymbol{Q}_{\boldsymbol{i}}} \right) \right] \right\}_{+} \tag{29} \end{align*}
where \mathcal{J}_{\cdot}(\cdot) represents the singular value soft-thresholding operator [64].
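The singular value soft-thresholding operator is standard and compact to implement. A minimal NumPy sketch, with the \boldsymbol{Q}_{\boldsymbol{i}}-update from (29) shown as a commented usage (the variable names PD, alpha_i, V3_i are our own):

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding J_tau(M) [64]: the proximal
    operator of tau * ||.||_*, shrinking each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Q_i-update from (29), assuming PD, alpha_i, V3_i, mu, eta2 are in scope:
# Q_i = svt(PD @ np.diag(alpha_i) - V3_i / (2 * mu), eta2 / (2 * mu))
```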

The update of variables and multipliers is alternately iterated until convergence. The overall algorithm for updating coefficient matrix \boldsymbol{A} is summarized in Algorithm 2.

Algorithm 2: Update \boldsymbol{A} With \boldsymbol{D} Fixed.

1: Input: LR-HSI \boldsymbol{X}; HR-MSI \boldsymbol{Y}; spectral basis \boldsymbol{D}; spatial degradation operator \boldsymbol{H}; spectral transform matrix \boldsymbol{P}; and regularization parameters \eta_1 and \eta_2.
2: Initialization: \boldsymbol{A} = \boldsymbol{0}; \boldsymbol{Q} = \boldsymbol{0}; \boldsymbol{Z} = \boldsymbol{0}; \boldsymbol{V}_{\boldsymbol{1}} = \boldsymbol{0}; \boldsymbol{V}_{\boldsymbol{2}} = \boldsymbol{0}; \boldsymbol{V}_{\boldsymbol{3}} = \boldsymbol{0}; \boldsymbol{U} = \boldsymbol{0}.
3: While not converged do
4:   Update variables \boldsymbol{S}, \boldsymbol{Z}, \boldsymbol{Q}, and \boldsymbol{A} by using (29).
5:   Update Lagrangian multipliers \boldsymbol{V}_{\boldsymbol{1}}, \boldsymbol{V}_{\boldsymbol{2}}, and \boldsymbol{V}_{\boldsymbol{3}} by using (28).
6:   Update \mu: \mu = \rho\mu\ (\rho > 1).
7:   Update \boldsymbol{U} by using (20).
8: End while
9: Output: coefficient matrix \boldsymbol{A}.

E. Optimization of the Spectral Basis With the Coefficients Fixed

In this procedure, we fix the coefficient matrix \boldsymbol{A}. Then, the update of the spectral basis \boldsymbol{D} can be written as
\begin{equation*} \min_{\boldsymbol{D}} \|\boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A}\|_F^2 + \|\boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H}\|_F^2,\ \mathrm{s.t.}\ 0 \leq \boldsymbol{D} \leq 1. \tag{30} \end{equation*}

As the nonlocal spatial similarity prior is mainly reflected by the coefficient matrix \boldsymbol{A}, the constraint term \eta_1\|\boldsymbol{D}\boldsymbol{A} - \boldsymbol{U}\|_F^2 can be excluded. Similarly, ADMM can be used to solve problem (30). More specifically, we introduce \boldsymbol{W} = \boldsymbol{D} and obtain the following augmented Lagrangian function:
\begin{align*} L\left({\boldsymbol{D}, \boldsymbol{W}, \boldsymbol{V}_4} \right) & = \left\| \boldsymbol{Y} - \boldsymbol{P}\boldsymbol{D}\boldsymbol{A} \right\|_F^2 + \left\| \boldsymbol{X} - \boldsymbol{D}\boldsymbol{A}\boldsymbol{H} \right\|_F^2 + \mu \left\| \boldsymbol{W} - \boldsymbol{D} + \frac{\boldsymbol{V}_4}{2\mu} \right\|_F^2 \\ & \quad \mathrm{s.t.}\ 0 \leq \boldsymbol{W} \leq 1 \tag{31} \end{align*}
where \boldsymbol{V}_4 is the Lagrangian multiplier. Minimizing the augmented Lagrangian function (31) leads to the following iterations:
\begin{align*} \boldsymbol{D}^{(t + 1)} = & \arg\min_{\boldsymbol{D}} L\left({\boldsymbol{D}, \boldsymbol{W}^{(t)}, \boldsymbol{V}_4^{(t)}} \right)\\ \boldsymbol{W}^{(t + 1)} = & \arg\min_{\boldsymbol{W}} L\left({\boldsymbol{D}^{(t)}, \boldsymbol{W}, \boldsymbol{V}_4^{(t)}} \right). \tag{32} \end{align*}

Meanwhile, the Lagrangian multiplier is updated by
\begin{equation*} \boldsymbol{V}_4^{(t + 1)} = \boldsymbol{V}_4^{(t)} + \mu \left({\boldsymbol{W}^{(t + 1)} - \boldsymbol{D}^{(t + 1)}} \right). \tag{33} \end{equation*}

The two subproblems in (32) can be easily solved analytically. For the update of \boldsymbol{D}, with the auxiliary variable \boldsymbol{W} and the Lagrangian multiplier \boldsymbol{V}_4 fixed, we arrive at the following equation:
\begin{equation*} \boldsymbol{D}^{(t + 1)}\boldsymbol{H}_1 + \boldsymbol{H}_2\boldsymbol{D}^{(t + 1)} = \boldsymbol{H}_3 \tag{34} \end{equation*}
where
\begin{align*} \boldsymbol{H}_1 &= \left[ \left({\boldsymbol{A}\boldsymbol{H}} \right)\left({\boldsymbol{A}\boldsymbol{H}} \right)^T + \mu \boldsymbol{I} \right]\left({\boldsymbol{A}\boldsymbol{A}^T} \right)^{-1}\\ \boldsymbol{H}_2 &= \boldsymbol{P}^T\boldsymbol{P}\\ \boldsymbol{H}_3 &= \left[ \boldsymbol{X}\left({\boldsymbol{A}\boldsymbol{H}} \right)^T + \boldsymbol{P}^T\boldsymbol{Y}\boldsymbol{A}^T + \mu \left({\boldsymbol{W}^{(t)} + \frac{\boldsymbol{V}_4^{(t)}}{2\mu}} \right) \right]\left({\boldsymbol{A}\boldsymbol{A}^T} \right)^{-1}. \tag{35} \end{align*}

Then, vectorizing \boldsymbol{D}^{(t + 1)} and \boldsymbol{H}_3 in (34), we arrive at
\begin{equation*} \left({\boldsymbol{H}_1^T \otimes \boldsymbol{I} + \boldsymbol{I} \otimes \boldsymbol{H}_2} \right){\rm vec}\left({\boldsymbol{D}^{(t + 1)}} \right) = {\rm vec}\left({\boldsymbol{H}_3} \right) \tag{36} \end{equation*}
where \otimes represents the Kronecker product and {\rm vec}(\cdot) is the vectorization operation. Therefore, \boldsymbol{D}^{(t + 1)} can be computed as
\begin{equation*} {\rm vec}\left({\boldsymbol{D}^{(t + 1)}} \right) = \left({\boldsymbol{H}_1^T \otimes \boldsymbol{I} + \boldsymbol{I} \otimes \boldsymbol{H}_2} \right)^{-1}{\rm vec}\left({\boldsymbol{H}_3} \right). \tag{37} \end{equation*}
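A direct NumPy transcription of (34)–(37), included to pin down the vectorization convention (column-stacking). Note that the dense Kronecker system is (BK) × (BK), so this is only practical for modest B and K; a Sylvester solver such as scipy.linalg.solve_sylvester applied to (34) scales better:

```python
import numpy as np

def solve_D_step(H1, H2, H3):
    """Solve D H1 + H2 D = H3 via (36)-(37). vec() is column-stacking,
    matching vec(D H1) = (H1^T kron I) vec(D) and
    vec(H2 D) = (I kron H2) vec(D)."""
    B, K = H3.shape
    M = np.kron(H1.T, np.eye(B)) + np.kron(np.eye(K), H2)
    d = np.linalg.solve(M, H3.flatten(order='F'))
    return d.reshape(B, K, order='F')

# Quick self-check on random small matrices.
rng = np.random.default_rng(0)
H1, H2 = rng.standard_normal((3, 3)), rng.standard_normal((4, 4))
D_true = rng.standard_normal((4, 3))
assert np.allclose(solve_D_step(H1, H2, D_true @ H1 + H2 @ D_true), D_true)
```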

For the update of \boldsymbol{W}, the solution \boldsymbol{W}^{(t + 1)} can be analytically obtained by
\begin{equation*} \boldsymbol{W}^{(t + 1)} = \min\left({\max\left({\boldsymbol{D}^{(t + 1)} - \frac{\boldsymbol{V}_4^{(t)}}{2\mu}, 0} \right), 1} \right). \tag{38} \end{equation*}

The update of variables and multiplier is alternately iterated until convergence. The overall algorithm for updating spectral basis \boldsymbol{D} is summarized in Algorithm 3.

Algorithm 3: Update \boldsymbol{D} With \boldsymbol{A} Fixed.

1: Input: LR-HSI \boldsymbol{X}; HR-MSI \boldsymbol{Y}; coefficient matrix \boldsymbol{A}; spatial degradation operator \boldsymbol{H}; and spectral transform matrix \boldsymbol{P}.
2: Initialization: \boldsymbol{D} = \boldsymbol{0}; \boldsymbol{W} = \boldsymbol{0}; and \boldsymbol{V}_4 = \boldsymbol{0}.
3: While not converged do
4:   Update variables \boldsymbol{D} and \boldsymbol{W} by using (37) and (38), respectively.
5:   Update Lagrangian multiplier \boldsymbol{V}_4 by using (33).
6:   Update \mu: \mu = \rho\mu\ (\rho > 1).
7: End while
8: Output: spectral basis \boldsymbol{D}.

SECTION IV.

Experimental Results and Discussion

In this section, to evaluate the performance of our proposed HSI super-resolution method, we conduct ample experiments on both ground-based HSI datasets and real remote sensing HSIs. To objectively evaluate the quality of the reconstructed HSIs, we adopt four objective evaluation indices in our experiments: peak signal-to-noise ratio (PSNR), root-mean-square error (RMSE), relative dimensionless global error in synthesis (ERGAS), and spectral angle mapper (SAM).

A. Experimental Datasets

In our experiments, we use two categories of images to show the effectiveness of our method. For the ground-based HSIs, we use a public HSI dataset named Columbia Computer Vision Laboratory (CAVE) [65]. The CAVE dataset includes 32 high-quality HSIs of everyday objects captured by a generalized assorted pixel camera. The spatial size of each HSI in CAVE is 512 × 512, and each HSI has 31 spectral bands ranging from 400 to 700 nm at an interval of 10 nm. Because some images in the CAVE dataset are similar, we select 20 representative HSIs of CAVE as our experimental data, which are shown in Fig. 4. The selected HSIs serve as ground truth images and are used to generate the LR-HSIs and HR-MSIs.

Fig. 4. Total of 20 representative testing images from the CAVE dataset. (a) Oil_painting. (b) Cloth. (c) Fake_and_real_peppers. (d) Balloons. (e) Fake_and_real_food. (f) Beads. (g) CD. (h) Chart_and_stuffed_toy. (i) Egyptian_statue. (j) Face. (k) Fake_and_real_lemon_slices. (l) Fake_and_real_sushi. (m) Feathers. (n) Flowers. (o) Glass_tiles. (p) Paints. (q) Real_and_fake_apples. (r) Sponges. (s) Stuffed_toys. (t) Thread_spools.

For the real remote sensing HSIs, we use three popular remote sensing HSIs: Cuprite Mine Nevada, Indian Pines, and Pavia Center, which are adopted in [51]. The three HSIs are shown in Fig. 5. The wavelength of the Cuprite Mine Nevada image ranges from 400 to 2500 nm at an interval of 10 nm, and its spatial resolution is 20 m. We crop the top-left region of size 512 × 512 as the ground truth after abandoning the bands with water absorption and low SNR; the final size of the ground truth image in our experiment is 512 × 512 × 200. The Indian Pines image is captured by the airborne visible and infrared imaging spectrometer over northwestern Indiana. The wavelength of Indian Pines ranges from 400 to 2500 nm with an interval of 10 nm. We crop the bottom-right part of size 512 × 512 and remove the water absorption bands (104–108, 150–163, and 220); the final size of the ground truth image in our experiment is 512 × 512 × 200. The Pavia Center image is taken by the reflective optics system imaging spectrometer over the center of the Pavia area. The spectrum of Pavia Center ranges from 430 to 860 nm at an interval of 4 nm. After abandoning the noisiest bands, we crop the bottom-right region of size 512 × 512 × 102, which is used as the ground truth image in our experiment.

Fig. 5. Three popular remote sensing HSIs. (a) Cuprite Mine Nevada. (b) Indian Pines. (c) Pavia Center.

B. Evaluation Indices

In this article, we use four indices to evaluate the reconstruction quality. The first index is PSNR, which is defined as the average PSNR value over all spectral bands:
\begin{equation*} {\rm PSNR}\left({\hat{\boldsymbol{Z}}, \boldsymbol{Z}} \right) = \frac{1}{S}\sum_{i = 1}^S {\rm PSNR}\left({\hat{\boldsymbol{Z}}_{\boldsymbol{i}}, \boldsymbol{Z}_{\boldsymbol{i}}} \right) \tag{39} \end{equation*}
where \boldsymbol{Z}_{\boldsymbol{i}} and \hat{\boldsymbol{Z}}_{\boldsymbol{i}} denote the ith band of the ground truth HSI \boldsymbol{Z} and of the estimated HSI \hat{\boldsymbol{Z}}, respectively, and S represents the number of spectral bands. The PSNR index measures the similarity between the two images. The larger the PSNR, the better the reconstruction result.

The second index is RMSE, which is defined as the average RMSE over all spectral bands, i.e.,
\begin{equation*} {\rm RMSE}\left({\hat{\boldsymbol{Z}}, \boldsymbol{Z}} \right) = \frac{1}{S}\sum_{i = 1}^S {\rm RMSE}\left({\hat{\boldsymbol{Z}}_{\boldsymbol{i}}, \boldsymbol{Z}_{\boldsymbol{i}}} \right). \tag{40} \end{equation*}

The smaller the RMSE, the better the reconstruction result.

The third index is ERGAS, whose formulation is
\begin{equation*} {\rm ERGAS}\left({\hat{\boldsymbol{Z}}, \boldsymbol{Z}} \right) = \frac{100}{c}\sqrt{\frac{1}{S}\sum_{i = 1}^S \frac{{\rm MSE}\left({\hat{\boldsymbol{Z}}_{\boldsymbol{i}}, \boldsymbol{Z}_{\boldsymbol{i}}} \right)}{\mu_{\hat{\boldsymbol{Z}}_{\boldsymbol{i}}}^2}} \tag{41} \end{equation*}
where c represents the spatial downsampling factor and \mu_{\hat{\boldsymbol{Z}}_{\boldsymbol{i}}} is the mean value of \hat{\boldsymbol{Z}}_{\boldsymbol{i}}. The smaller the ERGAS, the better the reconstruction result.

The fourth index is SAM, which is defined as
\begin{equation*} {\rm SAM}\left({\hat{\boldsymbol{Z}}, \boldsymbol{Z}} \right) = \frac{1}{N}\sum_{j = 1}^N \cos^{-1}\frac{\hat{\boldsymbol{z}}_{\boldsymbol{j}}^T\boldsymbol{z}_{\boldsymbol{j}}}{\left\|\hat{\boldsymbol{z}}_{\boldsymbol{j}}\right\|_2\left\|\boldsymbol{z}_{\boldsymbol{j}}\right\|_2} \tag{42} \end{equation*}
where \boldsymbol{z}_{\boldsymbol{j}} and \hat{\boldsymbol{z}}_{\boldsymbol{j}} denote the jth pixel of the ground truth HSI \boldsymbol{Z} and of the estimated HSI \hat{\boldsymbol{Z}}, respectively, and N represents the number of pixels. SAM measures the spectral quality of the reconstructed HSI. The smaller the SAM, the better the reconstruction result.
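The four indices (39)–(42) are straightforward to implement. A minimal NumPy sketch operating on B × N matrices (bands × pixels); the assumption that intensities are scaled to [0, peak] for the PSNR is ours:

```python
import numpy as np

def psnr(Z_hat, Z, peak=1.0):
    """Eq. (39): mean per-band PSNR (intensities assumed in [0, peak])."""
    mse = np.mean((Z_hat - Z) ** 2, axis=1)           # one MSE per band
    return np.mean(10.0 * np.log10(peak ** 2 / mse))

def rmse(Z_hat, Z):
    """Eq. (40): mean per-band RMSE."""
    return np.mean(np.sqrt(np.mean((Z_hat - Z) ** 2, axis=1)))

def ergas(Z_hat, Z, c):
    """Eq. (41): c is the spatial downsampling factor."""
    mse = np.mean((Z_hat - Z) ** 2, axis=1)
    mu = np.mean(Z_hat, axis=1)                       # per-band mean
    return (100.0 / c) * np.sqrt(np.mean(mse / mu ** 2))

def sam(Z_hat, Z, eps=1e-12):
    """Eq. (42): mean spectral angle over pixels, in radians."""
    num = np.sum(Z_hat * Z, axis=0)
    den = np.linalg.norm(Z_hat, axis=0) * np.linalg.norm(Z, axis=0) + eps
    return np.mean(np.arccos(np.clip(num / den, -1.0, 1.0)))
```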

C. Experimental Settings for the Comparison Methods

For a fair comparison, we describe the experimental settings in this section. The LR-HSI \boldsymbol{X} and the HR-MSI \boldsymbol{Y} are generated under the same settings for all the compared methods.

For the ground-based HSIs, as in [52], the ground truth HR-HSI \boldsymbol{Z} is downsampled by averaging disjoint s \times s blocks to simulate the LR-HSI \boldsymbol{X}, where s denotes the scaling factor (s = 8, 16, and 32). The HR-MSI \boldsymbol{Y} is generated by directly downsampling the spectral dimension of \boldsymbol{Z} with the spectral transform matrix \boldsymbol{P}, which is derived from the response of a Nikon D700 camera.
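As an illustration, a minimal sketch of this simulation, assuming \boldsymbol{Z} is stored as an (H, W, S) array with H and W divisible by s, and \boldsymbol{P} is the B \times S spectral transform matrix (the array layout is our assumption):

import numpy as np

def simulate_ground_based(Z, P, s):
    """Simulate the LR-HSI X and HR-MSI Y from the ground truth HR-HSI Z."""
    H, W, S = Z.shape
    # LR-HSI: average each disjoint s x s spatial block.
    X = Z.reshape(H // s, s, W // s, s, S).mean(axis=(1, 3))
    # HR-MSI: project the spectral dimension through P (B x S).
    Y = Z @ P.T
    return X, Y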

For the real remote sensing HSIs, following the operations in [51], the ground truth HR-HSI \boldsymbol{Z} is downsampled by a factor of s (s = 8, 16, and 32) to obtain the LR-HSI \boldsymbol{X}. Specifically, each pixel \boldsymbol{x}_{\boldsymbol{i}} \in \boldsymbol{X} is generated by averaging the pixels in an s \times s window of the HR-HSI \boldsymbol{Z} centered at location i. For the HR-MSI \boldsymbol{Y}, we directly select several bands from the ground truth HR-HSI \boldsymbol{Z}, using the Landsat-7-like reflectance spectral response filter as the spectral transform matrix \boldsymbol{P}. That is, for the Cuprite Mine Nevada and Indian Pines images, the bands whose center wavelengths are 480, 560, 660, 830, 1650, and 2220 nm are selected. For the Pavia Center image, we choose the bands at 480, 560, 660, and 830 nm (corresponding to the blue, green, red, and near-infrared channels, respectively) of the HR-HSI \boldsymbol{Z} to simulate the HR-MSI \boldsymbol{Y}.
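A rough sketch of this setup follows, with the same assumed (H, W, S) layout and H, W divisible by s. Edge padding keeps each s \times s window roughly centered at its LR grid location (for even s the center is approximate), and band_idx is assumed to hold the indices of the bands nearest the listed center wavelengths, which must be looked up for the specific sensor.

import numpy as np

def simulate_remote_sensing(Z, s, band_idx):
    """Window-averaged LR-HSI plus an HR-MSI of directly selected bands."""
    H, W, S = Z.shape
    pad = s // 2
    Zp = np.pad(Z, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    X = np.stack([
        Zp[r:r + s, q:q + s].mean(axis=(0, 1))     # s x s window around (r, q)
        for r in range(0, H, s) for q in range(0, W, s)
    ]).reshape(H // s, W // s, S)
    Y = Z[..., band_idx]                           # HR-MSI: selected bands of Z
    return X, Y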

D. Experimental Results of Our Method

In this section, we compare the experimental results of our HSI super-resolution method with those of several typical existing methods, including G-SOMP+ [49], SNNMF [66], CNMF [43], NSSR [52], SSR [51], and OTD [48]. Note that we do not compare our method with SSR in the experiments on the ground-based HSIs; the reason is explained in the experimental analysis of the real remote sensing HSIs. Similarly, we do not compare our method with OTD on the ground-based HSIs, because Han et al. [48] only conducted experiments on real remote sensing HSIs; for fairness, we compare our method with OTD on the real remote sensing HSIs only. In recent years, deep learning based HSI super-resolution methods have shown good results, but they need a large number of training samples. Since our method does not need any training sample, we do not compare our approach with the deep learning based methods. In our experiments, we assume that the spatial degradation operator \boldsymbol{H} and the spectral transform matrix \boldsymbol{P} are known; in real-world situations, both can be estimated from the LR-HSI and the HR-MSI. The major parameters include the regularization parameters {\eta _1} and {\eta _2} and the number of atoms of the spectral basis K. The selection of these parameters is discussed in Section IV-E.

For the ground-based HSIs, the comparison results are given in Table I, with the best results in bold for clarity. As presented in Table I, our ANSR method achieves the best results among all the compared methods, and NSSR is the second best, although it was proposed in 2016. According to Table I, the average PSNR gains of our method over the second-best method with s = 8, s = 16, and s = 32 are 0.4698, 0.3887, and 0.7768 dB, respectively. The results of G-SOMP+ are much worse than those of the other compared methods, because G-SOMP+ does not make use of the prior knowledge of the spatial degradation operator \boldsymbol{H}, which is usually unknown and needs to be estimated in practical applications. Fig. 6 shows the PSNR curves over the spectral bands of the testing image “fake_and_real_food” from the CAVE dataset; the proposed ANSR method outperforms the other compared methods at every wavelength for all scaling factors. To further demonstrate the effect of our method, visual comparisons are given in Figs. 7–9, which show the reconstructed HR-HSIs at different wavelengths for the test images “fake_and_real_peppers,” “egyptian_statue,” and “real_and_fake_apples,” respectively. As can be seen from Figs. 7–9, our ANSR method has the best visual results and achieves the minimum reconstruction errors.

TABLE I Average Results of the Test Methods for Different Scaling Factors on the Ground-Based HSIs
Fig. 6. PSNR curves of all the wavelengths of spectral bands over the testing image “fake_and_real_food.” (a) PSNR curves with scaling factor s = 8. (b) PSNR curves with scaling factor s = 16. (c) PSNR curves with scaling factor s = 32.

Fig. 7. Reconstructed results of image “fake_and_real_peppers” at 480, 550, and 640 nm with scaling factor s = 16. From top to bottom, the first three rows show the reconstructed images of different methods at 480, 550, and 640 nm, respectively; the last three rows show the errors of different methods at 480, 550, and 640 nm, respectively. (a) Original images. (b) Results of G-SOMP+. (c) Results of CNMF. (d) Results of SNNMF. (e) Results of NSSR. (f) Results of our ANSR. (g) Errors of the original images. (h) Errors of G-SOMP+ results. (i) Errors of CNMF results. (j) Errors of SNNMF results. (k) Errors of NSSR results. (l) Errors of ANSR results.

Fig. 8. Reconstructed results of image “egyptian_statue” at 480, 550, and 640 nm with scaling factor s = 16. From top to bottom, the first three rows show the reconstructed images of different methods at 480, 550, and 640 nm, respectively; the last three rows show the errors of different methods at 480, 550, and 640 nm, respectively. (a) Original images. (b) Results of G-SOMP+. (c) Results of CNMF. (d) Results of SNNMF. (e) Results of NSSR. (f) Results of our ANSR. (g) Errors of the original images. (h) Errors of G-SOMP+ results. (i) Errors of CNMF results. (j) Errors of SNNMF results. (k) Errors of NSSR results. (l) Errors of ANSR results.

Fig. 9. Reconstructed results of image “real_and_fake_apples” at 480, 550, and 640 nm with scaling factor s = 16. From top to bottom, the first three rows show the reconstructed images of different methods at 480, 550, and 640 nm, respectively; the last three rows show the errors of different methods at 480, 550, and 640 nm, respectively. (a) Original images. (b) Results of G-SOMP+. (c) Results of CNMF. (d) Results of SNNMF. (e) Results of NSSR. (f) Results of our ANSR. (g) Errors of the original images. (h) Errors of G-SOMP+ results. (i) Errors of CNMF results. (j) Errors of SNNMF results. (k) Errors of NSSR results. (l) Errors of ANSR results.

For the real remote sensing HSIs, the comparison results are given in Tables II–IV, with the best results in bold for clarity. In this part, we also compare our ANSR method with SSR and OTD, which were excluded from the ground-based experiments. The SSR method needs to cluster the HR-MSI into superpixels whose size and shape are adaptively adjusted according to the local structures. However, the images in the CAVE dataset are relatively simple and contain little information, so when the number of superpixels is too large (e.g., 6000, the value used for the real remote sensing HSIs), some superpixels contain only invalid information and the SSR method falls into an endless loop; if we reduce the number of superpixels instead, SSR loses its advantage and its results become very poor. As presented in Tables II–IV, our ANSR method performs best among all the compared methods, and OTD is the second best. According to Table II, the PSNR gains of our method over the second-best method on “Cuprite Mine Nevada” with s = 8, s = 16, and s = 32 are 0.4393, 0.9297, and 1.0392 dB, respectively. According to Table III, the PSNR gains on “Indian Pines” with s = 8, s = 16, and s = 32 are 0.5321, 0.3751, and 2.7559 dB, respectively. According to Table IV, the PSNR gains on “Pavia Center” with s = 8, s = 16, and s = 32 are 1.1518, 0.2425, and 1.4288 dB, respectively. We also find that the SSR method performs better than the NSSR method on “Cuprite Mine Nevada” and “Pavia Center,” but the opposite is true on “Indian Pines,” because the “Indian Pines” image is relatively simple and contains less surface-feature information. Fig. 10 shows the reconstructed HR-HSI at different bands of the test image “Pavia Center”; the proposed ANSR method has the best visual results and achieves the minimum reconstruction errors.

TABLE II PSNR, RMSE, SAM, and ERGAS Results of the Test Methods for Different Scaling Factors on Cuprite Mine Nevada

TABLE III PSNR, RMSE, SAM, and ERGAS Results of the Test Methods for Different Scaling Factors on Indian Pines

TABLE IV PSNR, RMSE, SAM, and ERGAS Results of the Test Methods for Different Scaling Factors on Pavia Center
Fig. 10. Reconstructed results of image “Pavia Center” at the 25th, 55th, and 85th bands with scaling factor s = 8. From top to bottom, the first three rows show the reconstructed images of different methods at the 25th, 55th, and 85th bands, respectively; the last three rows show the errors of different methods at the 25th, 55th, and 85th bands, respectively. (a) Original images. (b) Results of G-SOMP+. (c) Results of CNMF. (d) Results of SSR. (e) Results of NSSR. (f) Results of OTD. (g) Results of our ANSR. (h) Errors of the original images. (i) Errors of G-SOMP+ results. (j) Errors of CNMF results. (k) Errors of SSR results. (l) Errors of NSSR results. (m) Errors of OTD results. (n) Errors of ANSR results.

From the above experimental discussion, we find that the larger the scaling factor, the more obvious the advantage of our method: the PSNR gain of our method over the second-best method is largest when the scaling factor s = 32.

E. Parameter Selection in Our Method

In our HSI super-resolution method, there are three crucial parameters: the regularization parameters {\eta _1} and {\eta _2} and the number of atoms of the spectral basis K. To evaluate the sensitivity to these parameters, we conduct extensive experiments over a range of values.

First, since the number of atoms of the basis is crucial in any sparse coding-based method, we perform experiments with different values of K. Fig. 11 plots the PSNR curves of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of the number of atoms K. As can be seen from Fig. 11, the four PSNR curves increase with slight fluctuation as K grows from 10 to 120, but the curves of the ground-based HSIs, Cuprite Mine Nevada, and Indian Pines tend to flatten when K exceeds 50, and the Pavia Center curve flattens when K exceeds 80. As the computational burden increases sharply with K, we set K = 80.

Fig. 11. PSNR curves of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of the number of atoms K.

Second, we verify the effect of {\eta _1} on the reconstruction results by varying \log {\eta _1} (base-10 logarithm) from −4 to −1. Fig. 12 plots the PSNRs of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of \log {\eta _1}. As shown in Fig. 12, all the PSNR curves increase as \log {\eta _1} increases from −4 to −2 and then decrease as \log {\eta _1} increases further. Therefore, we choose 1 \times 10^{-2} as the optimal value of {\eta _1}.

Fig. 12. PSNRs of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of \log {\eta _1} with different scaling factors. (a) PSNR curves with scaling factor s = 8. (b) PSNR curves with scaling factor s = 16. (c) PSNR curves with scaling factor s = 32.

Third, we verify the effect of {\eta _2} on the reconstruction results by varying \log {\eta _2} (base-10 logarithm) from −5 to −2. Fig. 13 plots the PSNRs of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of \log {\eta _2}. As can be seen from Fig. 13, all the PSNR values increase as \log {\eta _2} increases from −5 to −4 and then decrease as \log {\eta _2} increases further, except for the Cuprite Mine Nevada curve with scaling factor s = 32, which decreases monotonically as \log {\eta _2} increases from −5 to −2. Thus, we set {\eta _2} = 1 \times 10^{-4} in our experiments.

Fig. 13. PSNRs of the ground-based HSIs, Cuprite Mine Nevada, Indian Pines, and Pavia Center as functions of \log {\eta _2} with different scaling factors. (a) PSNR curves with scaling factor s = 8. (b) PSNR curves with scaling factor s = 16. (c) PSNR curves with scaling factor s = 32.
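As an illustration of how such a sensitivity study can be scripted, the sketch below sweeps the three parameters on the grids described above. Here ansr_fuse is a hypothetical callable standing in for the full ANSR solver (not reproduced in this article), and mean_psnr is the helper sketched in Section IV-B; both are assumptions for illustration, and the figures may use finer grids.

import numpy as np

def sweep(ansr_fuse, X, Y, Z, mean_psnr):
    """Hypothetical sensitivity sweep over K, eta1, and eta2."""
    for K in range(10, 121, 10):                   # atoms: 10, 20, ..., 120
        Z_hat = ansr_fuse(X, Y, K=K, eta1=1e-2, eta2=1e-4)
        print("K =", K, "PSNR =", mean_psnr(Z_hat, Z))
    for eta1 in np.logspace(-4, -1, 4):            # log10(eta1): -4, ..., -1
        Z_hat = ansr_fuse(X, Y, K=80, eta1=eta1, eta2=1e-4)
        print("eta1 =", eta1, "PSNR =", mean_psnr(Z_hat, Z))
    for eta2 in np.logspace(-5, -2, 4):            # log10(eta2): -5, ..., -2
        Z_hat = ansr_fuse(X, Y, K=80, eta1=1e-2, eta2=eta2)
        print("eta2 =", eta2, "PSNR =", mean_psnr(Z_hat, Z))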

SECTION V.

Conclusion

In this article, we presented a novel sparse representation-based HSI super-resolution method, termed ANSR, to fuse an LR-HSI with its corresponding HR-MSI. On the basis of the NSSR model, we introduced into ANSR the ASR, which balances the relationship between sparsity and collaboration by generating a suitable coefficient. We also designed an alternating optimization algorithm that optimizes the spectral basis rather than keeping it fixed, and applied the ADMM method to solve the proposed optimization problem. To demonstrate the performance of the proposed method, we conducted extensive experiments. The experimental results on both the ground-based HSI dataset and real remote sensing HSIs show the superiority of our proposed approach over several state-of-the-art HSI super-resolution methods.

In future work, we aim to improve the method in several directions. We will focus on estimating the spatial degradation operator \boldsymbol{H} and the spectral transform matrix \boldsymbol{P}, and conduct blind-fusion experiments, which better reflect real-world situations. We will also optimize the solution process to improve its computational efficiency.
