
Multi-Scale Infrared Small Target Detection Method via Precise Feature Matching and Scale Selection Strategy

The overview of the proposed Multi-scale infrared small target detection method.

Abstract:

Infrared small target detection is a crucial and challenging topic for various applications. In recent years, the spectrum scale space (SSS) algorithm has shown considerable potential in the field of target detection. However, the SSS algorithm is prone to high false alarm rates in infrared small target detection scenarios with complex backgrounds. This paper proposes an improved SSS (ISSS) algorithm via precise feature matching and a scale selection strategy for efficient infrared small target detection, which comprises three stages: background suppression, feature matching, and optimal scale selection. In the background suppression stage, a matrix decomposition method, the inexact augmented Lagrange multiplier (IALM) algorithm, is used to extract the sparse image matrix from the original image as the target foreground image. In the feature matching stage, 16 elaborate Gaussian kernel functions are convolved with the amplitude spectrum of the target foreground image to generate 16 scale saliency maps that precisely match the features of small targets. In the optimal scale selection stage, a few candidate scale maps in which the target area is more highlighted are screened out according to the difference between the pixel values of the target area and the background clutter, and the candidate map with the maximum local information entropy is taken as the final detection result map. We make three main contributions. First, the IALM algorithm is used as a preprocessing step, and we verify that it is indispensable for eliminating most backgrounds with the self-correlation property. Second, an elaborate scale division strategy is proposed to obtain multi-scale saliency maps that precisely match the features of infrared small targets. Third, the gray value difference and the maximum value of local information entropy are defined and used as the judgment criteria for optimal scale selection.
Extensive experimental results demonstrate ...

Published in: IEEE Access ( Volume: 8)

Page(s): 48660 - 48672

Date of Publication: 28 February 2020

Electronic ISSN: 2169-3536

DOI: 10.1109/ACCESS.2020.2976805

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

Infrared small target detection remains a challenging issue with the rapid development of infrared guidance systems. Small targets are often submerged in nonconstant complex backgrounds with low signal-noise ratios and low contrast. Moreover, infrared small targets always have unremarkable features, uncertain brightness, and weak intensity because of the long imaging distance in the atmosphere [1], [2]. Researchers have exerted considerable efforts in the past decade, but infrared small target detection is still a challenging task worth exploring [3]–[6].

In general, infrared small target detection methods can be classified into two categories: single-frame and sequential detection. Sequential detection methods, such as the interframe difference method [7], optical flow method [8], [9], three-dimensional directional filtering [10], and Bayesian theory [11], perform well when prior knowledge of the target's shape and position in adjacent frames is available. However, obtaining such prior knowledge in practical military applications is extremely difficult. Because single-frame detection offers fast detection speed and short initialization time [12], researchers often focus on it.

Typical single-frame detection methods, such as the maximum mean and maximum median filters [13], [14], two-dimensional minimum mean square filter [15], background regression estimation method [16], morphological method [17], and bilateral filter [18], can effectively detect targets against simple backgrounds. However, when small targets are submerged in infrared scenarios with highly heterogeneous backgrounds, these algorithms fail to obtain satisfactory detection results. Recently, matrix decomposition and saliency detection methods have shown substantial advantages in single-frame infrared image detection. Among typical matrix decomposition methods, the robust principal component analysis (RPCA) method [19], which is based on convex optimization, is used to separate the foreground target matrix and the background matrix accurately from the original infrared image. The infrared patch-image (IPI) model [20] generalizes the traditional image model to a new patch-image model based on local patch construction. The core idea of the IPI model is to split an infrared image into the patch-image model, separate the foreground target matrix from the background matrix by stable principal component analysis, and finally reconstruct the image. However, some residual background edges remain when the target is submerged in heavy noise due to the defect of the $l_{1}$-norm-based sparsity measure in the IPI model. A reweighted infrared patch-image model [21] is proposed to overcome the defect that the nuclear norm in the IPI model could easily leave many sparse background edges. The weighted nuclear norm minimization based matrix completion (WNNM-MC) model [22] assigns weights adaptively to different singular values to overcome the defect of regularizing each singular value equally in practical problems.

In recent years, methods based on the contrast mechanism of saliency detection have been proposed in the open literature for infrared small target detection. These methods exploit noticeable differences between targets and background regions [23]–[26]. Meanwhile, numerous saliency detection methods have been proposed in the field of small target detection. Spectral residual (SR) [27] is a typical saliency detection algorithm that extracts the salient target region by utilizing the spectral residual information in the spectral domain. However, the SR algorithm focuses on target extraction and lacks suppression ability, especially on infrared images with heavy noise and highly heterogeneous backgrounds. The phase spectrum of quaternion Fourier transform (PQFT) [28] is proposed to calculate spatiotemporal saliency maps, which validates the necessity of phase spectra in saliency detection. A new bottom-up paradigm named the spectrum scale space (SSS) algorithm [29], which is based on scale space analysis, is proposed for saliency detection. SSS analyzes the infrared image at a multiscale level and obtains different scale maps with various detailed information. However, for small target detection in complex backgrounds, SSS often detects not only target regions but also background edges. To address this, a dual multiscale filter [30] with SSS and Gabor wavelets (GW) is proposed for efficient infrared small target detection. SSS is used as the preprocessing procedure to obtain the multiscale saliency maps, GW is utilized to suppress the remaining high-frequency noise, and a non-negative matrix factorization method then fuses all the GW maps into one final detection image. Some saliency detection methods based on local contrast have also been proposed in the open literature. Typically, the local contrast measure (LCM) [31] utilizes the characteristics of the human visual system (HVS) and a derived kernel model to highlight target regions well. However, the LCM algorithm needs to conduct numerous pixel-by-pixel calculations; thus, its efficiency is relatively low. A multiscale algorithm utilizing the relative local contrast measure (RLCM) [32] is proposed for infrared small target detection. The raw infrared image is first processed by multiscale RLCM, and then an adaptive threshold is utilized to extract the target area. Recently, some LCM-based methods have shown considerable potential in the target detection area. For example, a fast target detection method guided by visual saliency (TDGS) [33] is proposed, which contains fast SSS as the coarse-detection stage and adaptive LCM as the fine-detection stage. TDGS performs well for small and dim target detection in infrared search and track (IRST) systems and practical applications. A coarse-to-fine (CF) [34] framework combining the matrix decomposition method with multiscale modified LCM (MLCM) gradually separates small targets from structured edges, unstructured clutter, and noise. Inspired by the multiscale gray difference and local entropy operator, a small target detection method based on the novel weighted image entropy (NWIE) [35] is proposed, which performs well on small target images with low SNR.

This study proposes an improved SSS (ISSS) algorithm via precise feature matching and scale selection strategy for infrared small target detection. The inexact augmented Lagrange multiplier (IALM) by exploiting the nonlocal self-similarity prior is set as the preprocessing step to extract the sparse image matrix from the original infrared image matrix. After the candidate target region is extracted from the first step, most backgrounds with self-correlation property are suppressed. Then, the improved SSS (ISSS) algorithm is proposed as the postprocessing step to further enhance the target contrast and eliminate the highly heterogeneous backgrounds via the proposed elaborate scale division strategy and optimal scale selection mechanism. The ISSS algorithm generates multi-scale saliency maps by the proposed scale division strategy, which caters to the features of infrared small targets. Finally, the optimal scale map where the small target is most highlighted needs to be screened out. Combined with the idea of saliency measurement [36], [37], the gray value difference and the maximum value of local information entropy are defined and used as the judgment criteria for optimal scale selection. Extensive experiments demonstrate that the proposed algorithm not only performs satisfactorily in visual and quantitative evaluations, but also outperforms other contrast algorithms, especially on infrared images with thick clouds and high-brightness buildings. Real experimental data also indicate that IALM and ISSS are essential steps for the proposed method and that the execution order cannot be changed in order to achieve a high detection performance. Given these effective improvements, the proposed algorithm is robust and efficient against occlusion and complex noise for infrared small target detection.

This paper is organized as follows. Section 2 reviews the related work. Section 3 introduces the proposed algorithm, including the IALM algorithm and the ISSS algorithm. Section 4 shows the experimental results. Lastly, Section 5 presents the conclusion.

SECTION II.

Related Work

In this section, we review the basic concepts of matrix decomposition and saliency detection for infrared small target detection.

A. Matrix Decomposition

Inspired by the non-local self-correlation configuration of the background [20], the matrix decomposition method has turned out to be an effective and accurate way to separate the small target from the original image. The Society of Photo-Optical Instrumentation Engineers (SPIE) defines a weak and small target as one that occupies no more than $9\times 9$ pixels, which accounts for no more than 0.12% of an image sized $256\times 256$. “Weak” means that the pixel contrast of the target is low, and “small” refers to the small number of pixels occupied by the target. Thus, the target $I_{T}$ is often regarded as a sparse matrix, which is represented as:

$\begin{equation*} \left \|{ I_{T} }\right \|_{0} < k,\tag{1}\end{equation*}$

where $\left \|{ \cdot }\right \|_{0}$ denotes the $l_{0}$-norm of a matrix, which is the number of non-zero elements in the matrix. The parameter $k$ depends on the size of the target.

According to a general low-rank assumption previously proposed, all background patches come from a mix of low-rank subspace clusters [38], and infrared background $I_{B}$ is considered as a low-rank matrix:

$\begin{equation*} \mathrm {rank}(I_{B}) \le r,\tag{2}\end{equation*}$

where the parameter $r$ is proportional to the complexity of the image. The noise in an infrared image usually includes photon noise, Johnson noise, color noise, $1/f$ noise, particle noise, and so on. In terms of how they are generated, these noise sources are all independent. All of them except the $1/f$ noise obey a Gaussian distribution with a mean value of 0. In general, an infrared image can be regarded as the superposition of the target, the background and the noise [39], which can be represented as:

$\begin{equation*} I =I_{T}+I_{B}+I_{n},\tag{3}\end{equation*}$

where $I$ represents the original infrared image, $I_{T}$ represents the target matrix with sparse property, $I_{B}$ represents the background matrix with low-rank property, and $I_{n}$ represents the random noise. The matrix decomposition method has been proved to be effective in separating the target matrix with sparse property from the original infrared image.

Principal component analysis (PCA) [40], as a classical data dimension reduction method, aims to re-describe a high-dimensional data space using a set of low-dimensional bases. The principal component can also be understood as the projection of high-dimensional data onto a low-dimensional subspace. PCA can remove noise and redundancy to the greatest extent and is widely used in science and engineering applications. A large image matrix $D$ is often low-rank or approximately low-rank. The function of PCA is to find a low-rank matrix $L$ such that $L$ becomes the principal component of $D$. PCA decomposes the matrix $D$ into a matrix $L$ and a matrix $E$. When the elements of matrix $E$ are independent and identically distributed, the decomposition is represented as the following optimization problem:

$\begin{equation*} \mathop {\mathrm {min}}\limits _{L,E} \left \|{ E }\right \|_{F},\quad \mathrm {subject~to}~\mathrm {rank}\left ({L }\right)\le r,~D =L +E,\tag{4}\end{equation*}$

where $D$ is the original data matrix, $E$ is the error matrix, $\mathrm {rank}(L)$ represents the rank of the matrix $L$, and $\left \|{ \cdot }\right \|_{F}$ is the Frobenius norm. The optimal solution of the problem can be obtained by singular value decomposition of the matrix $D$. However, when $E$ contains sparse noise of large magnitude, PCA cannot give the ideal result. The robust PCA method performs well in this case. The iterative thresholding technique [41] is a typical algorithm for solving the matrix decomposition problem, but its convergence rate is very slow. The algorithm usually requires $10^{4}$ iterations to converge, and the cost of each iteration is equal to that of one singular value decomposition. In order to improve the efficiency of the algorithm, Lin et al. [42] proposed the accelerated proximal gradient (APG) algorithm and the gradient-ascent algorithm. Both algorithms converge 50 times faster than the iterative thresholding technique. On the basis of the augmented Lagrange multiplier (ALM), the exact ALM (EALM) algorithm [43] is proposed. The solution obtained by the EALM algorithm converges to the exact solution of the optimization problem, and its Q-linear convergence rate is better than that of the iterative thresholding and APG algorithms mentioned above. The EALM algorithm also estimates the number of non-zero elements of the matrix $E$ more accurately. However, the sub-problem of minimizing the augmented Lagrange function $\mathrm {L}(L,E,Y_{k},\mu _{k})$ needs to be solved by the alternating direction method in every iteration of the EALM algorithm, as shown in (5):

$\begin{equation*} \left ({L_{k+1},E_{k+1} }\right)=\arg \mathop {\mathrm {min}}\limits _{L, E}{\mathrm {L}(L,E,Y_{k},\mu _{k})}.\tag{5}\end{equation*}$

Nevertheless, solving this sub-problem exactly is proved to be time-consuming and unnecessary. Then the Inexact ALM [43] algorithm (IALM) is proposed, which is the improved EALM algorithm. IALM algorithm converges as fast as EALM, but the number of partial SVDs is less. Consequently, IALM algorithm is widely used to solve the constrained convex optimization problem.
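
As a small illustration of the classical PCA step described above, the following sketch computes a rank-$r$ approximation of $D$ by truncating its singular value decomposition. By the Eckart-Young theorem, this $L$ leaves a residual $E = D - L$ of minimal Frobenius norm among all matrices of rank at most $r$. The function name `low_rank_approx` and the synthetic test matrix are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def low_rank_approx(D, r):
    """Best rank-r approximation of D in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Keep only the r largest singular values and their singular vectors.
    L = (U[:, :r] * s[:r]) @ Vt[:r, :]
    E = D - L  # residual (error) matrix
    return L, E


# Hypothetical data: a rank-2 matrix plus small dense Gaussian noise.
rng = np.random.default_rng(0)
D = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
D += 0.01 * rng.standard_normal(D.shape)

L, E = low_rank_approx(D, r=2)
print("relative residual:", np.linalg.norm(E) / np.linalg.norm(D))
```

This works well for small dense noise, but a few large sparse entries in $E$ would distort the leading singular vectors, which is the situation robust PCA and the ALM-type solvers are meant to handle.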

B. Saliency Detection

Human eyes can capture the prominent area of an image quickly through contrast mechanism, which helps eyes to selectively focus on the interested salient area. This attention selection mechanism, which is also called visual saliency, can select the region of interest from all the information in the visual range. Many algorithms of saliency detection refer to the idea of visual saliency and extract the salient region from the whole image effectively. Spectral residual (SR) [27] is a typical saliency detection method, that extracts the saliency region by using the spectral residual information.

Given an original image $I$, the log-amplitude spectrum $L(u,v)$ of $I$ in the frequency domain is represented as:

$\begin{equation*} L(u,v) = \mathrm {log}(\vert fft(I) \vert).\tag{6}\end{equation*}$

The spectral residual is defined as the difference between the log-amplitude spectrum and its smoothed version:

$\begin{equation*} R(u,v)=L(u,v)-h\ast L(u,v),\tag{7}\end{equation*}$

where $h$ denotes the mean filter. Then the inverse Fourier transform converts the spectral residual back to the spatial domain:

$\begin{equation*} S(x,y)=ifft\{\mathrm {exp}(R(u,v)+i\cdot P(u,v))\},\tag{8}\end{equation*}$

where $P(u,v)=\mathrm {angle}(fft(I))$ is the phase spectrum of the original image. Inspired by the SR model, researchers have paid attention to the frequency domain characteristics of images. For instance, the SSS algorithm is a classical bottom-up framework based on the SR model. It is worth mentioning that the convolution of the image amplitude spectrum with a low-pass Gaussian kernel of an appropriate scale is equivalent to an image saliency detector. The SSS algorithm suppresses repeated patterns by smoothing the amplitude spectrum with a multi-scale low-pass filter, and has proved to be an effective algorithm for saliency detection. Recently, saliency detection methods have been increasingly applied to infrared small target detection.
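
The SR computation in (6)–(8) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the reference implementation of [27]: the $3\times 3$ mean filter with wrap-around borders, the small constant `eps` that avoids `log(0)`, and the use of the squared magnitude of the inverse transform as the saliency value are all assumptions of this sketch.

```python
import numpy as np


def mean_filter_circular(a, n=3):
    """n x n mean filter with wrap-around borders (an assumption of this sketch)."""
    out = np.zeros_like(a, dtype=float)
    half = n // 2
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            out += np.roll(a, (dy, dx), axis=(0, 1))
    return out / (n * n)


def spectral_residual(img, n=3, eps=1e-8):
    """Saliency map following Eqs. (6)-(8)."""
    F = np.fft.fft2(np.asarray(img, dtype=float))
    log_amp = np.log(np.abs(F) + eps)                       # Eq. (6)
    phase = np.angle(F)                                     # P(u, v)
    residual = log_amp - mean_filter_circular(log_amp, n)   # Eq. (7)
    S = np.fft.ifft2(np.exp(residual + 1j * phase))         # Eq. (8)
    return np.abs(S) ** 2                                   # magnitude as saliency


# Hypothetical test image: flat background with one bright pixel.
img = np.full((32, 32), 0.2)
img[10, 20] = 1.0
sal = spectral_residual(img)
print(np.unravel_index(np.argmax(sal), sal.shape))
```

On this toy image the uniform background is concentrated in the zero-frequency term, which the residual attenuates, so the strongest response stays at the bright pixel.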

SECTION III.

Proposed Method

The flowchart with a pipeline structure of the proposed method, including the background suppression stage, feature matching stage and optimal scale selection stage, is shown in Figure 1. In the background suppression stage, IALM algorithm is used to decompose the original image matrix $D$ as a low-rank matrix $L$ and a sparse matrix $E$ . The low-rank matrix $L$ is considered as the background image, and the sparse matrix $E$ is considered as the target foreground image. In the feature matching stage, the 16 elaborate Gaussian kernel functions convolve with the amplitude spectrum of the sparse matrix $E$ to generate 16 scale saliency maps that precisely match the feature of small targets. In the optimal scale selection stage, a few candidate saliency maps are screened according to the difference of gray values between the target area and the residual noise area, and the target area in these maps is assigned a high value while the other parts are greatly suppressed. Finally, the optimal saliency map (the final detection result) is obtained corresponding to the maximum local information entropy of the candidate maps above. These steps are described in detail in the following sections.

FIGURE 1.

Flowchart of the proposed method. The red boxes represent the real target areas, the yellow and green boxes represent the different highlighted residual noise areas in some saliency maps.

A. Background Suppression Stage

The matrix decomposition method is set as the preprocessing step to decompose the target foreground matrix with sparse characteristics from the infrared image matrix. RPCA, a typical matrix decomposition method, can recover essential low-rank data from observation data with large and sparse noise pollution. This problem can be solved by the following optimization problems:

$\begin{equation*} \mathop {\mathrm {min}}\limits _{L,E} \mathrm {rank}\left ({L }\right)+\lambda \left \|{ E }\right \|_{0} \quad s.t.~D =L +E,\tag{9}\end{equation*}$

where the regularization parameter $\lambda$ is greater than 0. Solving (9) is difficult because the rank and the $l_{0}$-norm of a matrix are nonconvex and nonsmooth. The nuclear norm and the $l_{1}$-norm [44] of the matrix are their optimal convex approximations; thus, the NP-hard problem (9) is relaxed to the following convex optimization problem:

$\begin{equation*} \mathop {{\mathrm {min}}}\limits _{L,E} {\left \|{ L }\right \|_{\ast }}+\lambda \left \|{ E }\right \|_{1}\quad s.t.~D =L +E,\tag{10}\end{equation*}$

where $\left \|{ \cdot }\right \|_{\ast }$ represents the nuclear norm of the matrix and $\left \|{ \cdot }\right \|_{1}$ represents the $l_{1}$-norm of the matrix. The matrix recovery problem depends on the optimization of the nuclear norm and the $l_{1}$-norm.

Considering the efficiency and robustness of the abovementioned IALM algorithm, we use it to decompose the original image matrix to a low-rank image matrix and a sparse image matrix, which correspond to the background region and the target region, respectively, in the preprocessing step. The Lagrange multiplier method is often used to solve the constrained convex optimization problem. This technique integrates the original function and constraint conditions into an unconstrained equation to solve this optimization problem. For (10), the Lagrange multiplier method can perform the following integration:

$\begin{align*} X=&(L,E), \\ f\left ({X }\right)=&\left \|{ L }\right \|_{\ast }+\lambda \left \|{ E }\right \|_{1}, \\ h(X)=&D -L -E.\tag{11}\end{align*}$

Then the augmented Lagrange function is as follows:

$\begin{align*} \mathrm {L}(L, E, Y, \mu)=&\|L\|_{*}+\lambda \|E\|_{1}+\langle Y, D-L-E\rangle \\ &+\frac {\mu }{2}\|D-L-E\|_{\mathrm {F}}^{2}, \tag{12}\end{align*}$

where $\langle A, B\rangle =\mathrm {tr}\left ({A^{T} B}\right)$. The specific algorithm flow of IALM is as follows:

Algorithm 1 (Matrix Decomposition by IALM Algorithm)

Input:

Observation matrix $D\in \mathbf {R}^{m\times n},\lambda$

$Y_{0}=D / {J\left ({D }\right)}\mathrm {;}{E}_{0}=\mathrm {0;}\mu _{0}\mathrm {>0; }\rho \mathrm {>1;}k=\mathrm {0}$

while not converged do

$\left ({U,S,V }\right)=svd(D-E_{k}+\mu _{k}^{-1}Y_{k})$

// Fix the others and update ${L}$ by

$L_{k+1}=US_{\mu _{k}^{-1}}\left [{ S }\right]V^{T}$

// Fix the others and update $E$ by

$E_{k+1}=S_{\lambda \mu _{k}^{-1}}\left [{ D-L_{k+1}+\mu _{k}^{-1}Y_{k} }\right]$

$Y_{k+1}=Y_{k}+\mu _{k}(D-L_{k+1}-E_{k+1})$

$\mu _{k+1} = \rho \mu _{k}$

$k=k+1$

end while

Output:

( $L_{k}$ , $E_{k}$ )

The initial parameters need to be set in the algorithm: $\rho =1.6$, $\lambda =1/\sqrt {\max (m,n)}$, $\mu _{0}=1.25/\max (svd(D))\times {10}^{7}$, $Y_{0}=D/\max (mm,im)$, $mm=\max (svd(D))$, $im=\max (\mathrm {abs}(D))/\lambda$.

The following formula updates the parameter $\mu$ :

$\begin{equation*} \mu _{k+1}=\begin{cases} \rho \mu _{k},&\mathrm {if}~\dfrac {\mu _{k}\left \|{ E_{k+1}-E_{k} }\right \|_{F}}{\left \|{ D }\right \|_{\mathrm {F}}} < \varepsilon, \\ \mu _{k}, &\mathrm {otherwise}. \end{cases}\tag{13}\end{equation*}$

As the iterations proceed and $\mu _{k}$ is continually updated, a fast growth rate of $\mu _{k}$ leads to fast convergence of the algorithm.
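
Algorithm 1 can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function names are hypothetical, the penalty update uses the unconditional rule $\mu _{k+1}=\rho \mu _{k}$ from the algorithm listing rather than the conditional rule (13), the factor $10^{7}$ is interpreted as an upper bound on $\mu$, and the stopping tolerance `tol` and iteration cap `max_iter` are choices of this sketch.

```python
import numpy as np


def soft_threshold(X, tau):
    """Element-wise shrinkage operator S_tau[X]."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def ialm_rpca(D, lam=None, rho=1.6, tol=1e-7, max_iter=500):
    """Decompose D into low-rank L and sparse E (sketch of Algorithm 1)."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    norm_two = np.linalg.norm(D, 2)        # largest singular value of D
    norm_inf = np.abs(D).max() / lam
    Y = D / max(norm_two, norm_inf)        # Y_0 = D / J(D)
    mu = 1.25 / norm_two
    mu_max = mu * 1e7                      # assumed cap on the penalty parameter
    E = np.zeros_like(D)
    d_norm = np.linalg.norm(D, "fro")
    for _ in range(max_iter):
        # Update L by singular value thresholding.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        L = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Update E by element-wise shrinkage.
        E = soft_threshold(D - L + Y / mu, lam / mu)
        residual = D - L - E
        Y = Y + mu * residual              # multiplier update
        mu = min(rho * mu, mu_max)         # penalty update
        if np.linalg.norm(residual, "fro") / d_norm < tol:
            break
    return L, E


# Hypothetical scene: smooth rank-1 background plus three one-pixel targets.
background = np.outer(np.ones(40), np.linspace(1.0, 2.0, 40))
targets = [(5, 7), (20, 30), (33, 12)]
D = background.copy()
for i, j in targets:
    D[i, j] += 5.0

L, E = ialm_rpca(D)
print("largest |E| entry at:", np.unravel_index(np.argmax(np.abs(E)), E.shape))
```

In this toy case the sparse component `E` plays the role of the target foreground image that is passed on to the later stages, while `L` holds the self-correlated background.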

Figure 2 shows the experimental results of three typical infrared small target images after the implementation of the IALM algorithm. The experimental results in Figure 2 are all sparse image matrices extracted from the original image. The results show that the IALM algorithm can effectively eliminate most backgrounds with self-correlation property and enhance the target contrast. In Figure 2, the target areas are labeled with red circles and the noise areas are labeled with the yellow circles.

FIGURE 2.

Extraction of sparse target foreground image by IALM.

In Figure 2, although most backgrounds are suppressed, the detection results still show a high false alarm rate and poor location precision. Moreover, the target intensity from the three-dimensional maps is considerably weak. ISSS algorithm needs to be used for follow-up processing to further suppress the residual background edges and enhance target contrast.

B. Feature Matching Stage

Studying the different properties of an object often requires different specific scales. For instance, when we study the flying trajectory of an eagle, the eagle is only a small point in our visual range. When we explore its shape, we even observe its feathers. The scale must be introduced as a free parameter variable into the image processing to analyze the areas of interest. The so-called scale space is used to obtain the optimal scale of the target through multiple scales without knowing the image size. On the basis of the SSS algorithm and the contrast mechanism, the target foreground image extracted in the background suppression stage is further processed by ISSS algorithm. According to [29], the convolution of the image amplitude spectrum and the low-pass Gaussian kernel with an appropriate scale is equivalent to image saliency detection; thus, we use a Gaussian kernel function with different scales to obtain the saliency maps of various scales. For selecting the optimal scale factor of the Gaussian kernel function, an optimal-scale selection mechanism is proposed in accordance with the property of local information entropy. This mechanism caters to the characteristics of infrared small targets [45], [46] and limits the calculation of information entropy to the small region of interest instead of traversing all the complete scale maps. Gaussian kernels are the only ones that can produce multiscale spaces [47]. Using linear scale space representation for reference, we generate a single parameter family of smooth spectrum, whose parameters depend on the scale of the Gaussian kernel.

The specific process of the ISSS algorithm is as follows. Given target foreground matrix $I(x,y)$ , the log-amplitude spectrum $I_{A}(u,v)$ and phase spectrum $I_{P}(u,v)$ are represented as follows:

$\begin{align*} I_{A}\left ({u,v }\right)=&\mathrm {log}\vert fft(I(x,y))\vert,\tag{14}\\ I_{P}\left ({u,v }\right)=&angle(fft(I(x,y))).\tag{15}\end{align*}$

The scale space $\Phi (u,v\mathrm {;}k)$ is defined as the convolution of $I_{A}(u,v)$ with a series of Gaussian kernel functions:

$\begin{equation*} \Phi \left ({u,v;k }\right)=g\left ({u,v;\sigma }\right)\ast I_{A}(u,v),\tag{16}\end{equation*}$

where $g(u,v;\sigma)$ is the Gaussian kernel, whose standard deviation $\sigma$ is related to the scale factor $k$ as follows:

$\begin{equation*} \sigma =\begin{cases} 2k, & 0 < k\le 5, \\ k\left ({k-4 }\right), & 5 < k\le 11, \\ 25+2^{k-6}, & 11 < k\le 16. \end{cases}\tag{17}\end{equation*}$

The step value of the scale parameter $\sigma$ of the Gaussian kernel function is set as an irregular value in (17). When the value of $k$ is small, the standard deviation $\sigma$ of the Gaussian kernel function varies slowly, and when the value of $k$ is large, $\sigma$ varies rapidly. This elaborate scale division strategy is conducive to choosing an appropriate and accurate Gaussian kernel for small targets. Different types of salient regions require different filter scales. A large background area with a uniform pattern requires a suitable scale to smooth the amplitude spectrum for suppression. An excessively small or large scale may leave the background area insufficiently suppressed or cause only the edges of the salient area to be highlighted. When a small-scale kernel is used, large areas are salient. A large-scale kernel is used to detect long-range or texture-rich targets [29]. Infrared small targets flying in the sky are usually classified as long-range targets, and their pixels are few; thus, an elaborate scale division strategy is necessary to select the optimal saliency map.
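
To make the irregular step size concrete, the short sketch below evaluates the piecewise rule (17) for $k=1,\ldots ,16$. The function name `gaussian_sigma` is a hypothetical helper introduced only for this illustration.

```python
def gaussian_sigma(k):
    """Standard deviation of the Gaussian kernel for scale index k, per Eq. (17)."""
    if not 1 <= k <= 16:
        raise ValueError("k must be an integer between 1 and 16")
    if k <= 5:
        return 2 * k
    if k <= 11:
        return k * (k - 4)
    return 25 + 2 ** (k - 6)


sigmas = [gaussian_sigma(k) for k in range(1, 17)]
print(sigmas)
# [2, 4, 6, 8, 10, 12, 21, 32, 45, 60, 77, 89, 153, 281, 537, 1049]
```

The sequence grows by a constant step of 2 for the first five scales, then quadratically, and finally exponentially, which matches the statement that $\sigma$ varies slowly for small $k$ and rapidly for large $k$.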

The obtained smooth logarithmic amplitude spectrum $\Phi (u,v;k)$ and the original phase spectrum $I_{P}(u,v)$ are combined to calculate the inverse Fourier transform and gain the saliency maps $S_{k}(x,y)$:

$\begin{equation*} S_{k}\left ({x,y }\right)=ifft\{\mathrm {exp}(\Phi \left ({u,v;k }\right)+i\cdot I_{P}(u,v))\}.\tag{18}\end{equation*}$
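
The feature matching stage in (14)–(18) can be sketched as follows. Several details are assumptions of this sketch and are not specified in the text: the Gaussian smoothing of the log-amplitude spectrum is applied with wrap-around borders through its Fourier-domain transfer function, a small `eps` guards against `log(0)`, and the magnitude of the inverse transform is used as the saliency value. The function names are hypothetical.

```python
import numpy as np


def gaussian_smooth_circular(a, sigma):
    """Smooth a 2-D array with a Gaussian of std `sigma` (in samples),
    using wrap-around borders via the Gaussian's transfer function."""
    fy = np.fft.fftfreq(a.shape[0])[:, None]
    fx = np.fft.fftfreq(a.shape[1])[None, :]
    H = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fy ** 2 + fx ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(a) * H))


def scale_saliency_maps(img, sigmas, eps=1e-8):
    """Return one saliency map per sigma, stacked along axis 0."""
    F = np.fft.fft2(np.asarray(img, dtype=float))
    log_amp = np.log(np.abs(F) + eps)      # Eq. (14)
    phase = np.angle(F)                    # Eq. (15)
    maps = []
    for sigma in sigmas:
        phi = gaussian_smooth_circular(log_amp, sigma)   # Eq. (16)
        S = np.fft.ifft2(np.exp(phi + 1j * phase))       # Eq. (18)
        maps.append(np.abs(S))
    return np.stack(maps)


# Hypothetical foreground image: flat residual background and one target pixel.
img = np.full((32, 32), 0.2)
img[10, 20] = 1.0
maps = scale_saliency_maps(img, sigmas=[2, 12, 89])
print(maps.shape)
```

In the proposed pipeline the input would be the sparse matrix $E$ from the background suppression stage and `sigmas` would hold the 16 values given by (17).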

C. Optimal Scale Selection Stage

Information entropy is often used as a quantitative indicator of system information content [48]. Thus, it can be further utilized as a criterion for the optimization of system equations or parameter selection. In a relatively simple background, the highlighted target can change the information entropy of the whole image. By contrast, the contribution of a small and dim infrared target to the information entropy of the whole image is insignificant. In a proper scale map, the areas of interest are highlighted while the other parts are suppressed to the greatest extent. For the saliency detection of large targets, the minimum image information entropy may perform well in selecting the optimal saliency map, but it is not suitable for infrared targets with extremely small sizes.

Small targets can considerably influence the value of information entropy in the local saliency region [49]. Information entropy is a local concept, and for one pixel point in the image, the information entropy $H(x,y)$ is defined as follows:

$\begin{equation*} H\left ({x,y }\right)=H\left [{ \Lambda \left ({x,y }\right) }\right]=-\sum \nolimits _{b=1}^{K} {p_{b}(x,y)\lg p_{b}(x,y)},\tag{19}\end{equation*}$

where $\Lambda (x,y)$ represents a local area adjacent to the pixel point $(x,y)$, the pixel values of the local area are projected onto $K$ intervals, and $p_{b}(x,y)$ represents the probability that a pixel value falls in the $b$-th interval. A high value of the local information entropy means that the region is rich in information and has a high probability of containing small targets.
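
A minimal sketch of the local information entropy in (19) is given below. The window radius, the number of intervals `bins`, the choice of equal-width intervals over the global value range of the image, and the reading of $\lg$ as the base-10 logarithm are assumptions of this sketch; the text does not fix them.

```python
import numpy as np


def local_entropy(img, y, x, radius=1, bins=8):
    """Entropy of the (2*radius+1)^2 window around (y, x), per Eq. (19)."""
    img = np.asarray(img, dtype=float)
    y0, y1 = max(y - radius, 0), min(y + radius + 1, img.shape[0])
    x0, x1 = max(x - radius, 0), min(x + radius + 1, img.shape[1])
    patch = img[y0:y1, x0:x1]
    # Project the pixel values of the window onto `bins` equal-width intervals.
    value_range = (float(img.min()), float(img.max()))
    counts, _ = np.histogram(patch, bins=bins, range=value_range)
    p = counts[counts > 0] / patch.size
    # -sum(p * log10(p)), written so that a single occupied bin gives +0.0.
    return float((p * np.log10(1.0 / p)).sum())


flat = np.zeros((5, 5))
spot = flat.copy()
spot[2, 2] = 1.0            # one bright pixel in a dark neighborhood
print(local_entropy(flat, 2, 2), local_entropy(spot, 2, 2))
```

A uniform window has zero entropy, while a window containing one bright pixel among dark neighbors has a positive value, in line with the observation that small targets raise the entropy of their local region.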

In the optimal scale map, the target saliency is better than the background clutters, and at the same time, the background often shows a certain spatial similarity. Hence, when selecting the salient region, we first traverse the largest pixel point $L_{k}$ in all scale maps:

$\begin{equation*} L_{k}=\max \left ({S_{k}\left ({x,y }\right) }\right)\quad k=\mathrm {1\ldots }K,\tag{20}\end{equation*}$ View Source

where $K$, the number of scale maps, is set to 16 for an original image with a size of $288\times 384$. With this point as the center, we define its eight neighbor points as $B_{k}$ and calculate the mean pixel value $m_{k}$ over this eight-neighborhood as follows:

$\begin{equation*} m_{k}=\frac{1}{8}\sum_{n=1}^{8} I_{n}(i,j),\quad I_{n}(i,j)\in B_{k}.\tag{21}\end{equation*}$
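Equations (20) and (21) can be sketched as follows. This is only an illustrative implementation: the function name is hypothetical, and the border handling (edge replication when the maximum lies on the image boundary) is an assumption, since the text does not specify it.

```python
import numpy as np

def peak_and_neighbor_mean(scale_map):
    """Return (L_k, (row, col), m_k) for one scale saliency map S_k.

    L_k is the largest pixel value (Eq. 20) and m_k is the mean of its
    eight neighbors (Eq. 21). Edge replication at the border is an
    assumption made for this sketch.
    """
    row, col = np.unravel_index(np.argmax(scale_map), scale_map.shape)
    peak = scale_map[row, col]
    # Pad by one pixel so that a 3x3 window always exists around the peak.
    padded = np.pad(scale_map, 1, mode="edge")
    # In padded coordinates the peak sits at (row + 1, col + 1).
    window = padded[row:row + 3, col:col + 3]
    # Remove the center so that only the eight neighbors B_k are averaged.
    neighbor_mean = (window.sum() - peak) / 8.0
    return float(peak), (int(row), int(col)), float(neighbor_mean)
```

Note that with edge replication a peak on the image border is copied into its own neighborhood, which raises $m_{k}$; zero padding would lower it instead, so the choice matters only for maxima on the boundary.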

Figure 3 shows the values of maximum pixel points and their neighborhoods at two scales.

FIGURE 3. Comparison of pixel values of target and interference at two scales.

The similarity of a pixel point is determined by its adjacent region [50]; thus, the pixel values of the eight neighbors near strong background interference are similar to one another or tend to approach the largest pixel value, whereas only a few neighborhood pixel values in the target region tend to approach the central pixel value. Extensive experiments indicate that the mean value of the eight neighbors’ pixel values in the target area is usually smaller than that in the highlighted background clutter. We therefore set a discriminant threshold $\tau$. When the mean value $m_{k}$ is less than $\tau$, the corresponding scale is temporarily stored; when it is greater than $\tau$, the corresponding map is regarded as “not salient”. After traversing all maps, we calculate the local information entropy in the stored scale maps and take the scale with the maximum local information entropy as the optimal scale $k_{out}$, which is defined as follows:

$\begin{equation*} k_{out}=\mathop{\mathrm{arg\,max}}\limits_{k}\{H(V_{k})\},\quad V_{k}=L_{k}+B_{k},\tag{22}\end{equation*}$

where $L_{k}$ represents the largest pixel point and $B_{k}$ represents its eight neighbor points.
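The selection rule of Eq. (22) can be sketched as below. The sketch makes several assumptions that are not stated in the paper: scale maps are normalized to [0, 1], the entropy uses a base-10 logarithm over 16 equal-width bins, $V_{k}$ is taken as the 3x3 window formed by the peak and its eight neighbors, border pixels are edge-replicated, and a mean exactly equal to $\tau$ is treated as "not salient". All function and parameter names are hypothetical.

```python
import numpy as np

def entropy(values, num_bins=16, value_range=(0.0, 1.0)):
    """Histogram-based entropy (base 10) of a set of pixel values."""
    hist, _ = np.histogram(values, bins=num_bins, range=value_range)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log10(p)).sum())

def select_optimal_scale(scale_maps, tau):
    """Return the 1-based index k_out of the selected scale, or None.

    A scale is kept as a candidate only if the eight-neighbor mean m_k of
    its largest pixel is below tau; among the candidates, the one with
    the largest local entropy H(V_k) is returned.
    """
    best_k, best_h = None, -np.inf
    for k, s in enumerate(scale_maps, start=1):
        row, col = np.unravel_index(np.argmax(s), s.shape)
        padded = np.pad(s, 1, mode="edge")
        window = padded[row:row + 3, col:col + 3]  # V_k: peak plus B_k
        m_k = (window.sum() - s[row, col]) / 8.0
        if m_k >= tau:
            continue  # regarded as "not salient"
        h = entropy(window)
        if h > best_h:
            best_k, best_h = k, h
    return best_k
```

For example, a map whose maximum sits on a bright plateau is rejected by the threshold test, while among the remaining maps an isolated peak with varied neighbors is preferred over one surrounded by a uniform neighborhood. If every map is rejected, the sketch returns `None`; how the original method handles that case is not described in the text.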

Figure 4(a₂, b₂) and Figure 4(a₃, b₃) respectively show the optimal scale saliency map screened by the minimum global information entropy in accordance with the original SSS algorithm and by the proposed local information entropy. As shown in Figure 4(a), the saliency map obtained by the global information entropy focuses on the car at close range, whereas the proposed selective mechanism focuses on the driver, which is the small target in the original image. Similarly, from Figure 4(b), the saliency map obtained by the global information entropy method focuses on the high-brightness buildings at close distance, and the local information entropy method can effectively detect the small infrared targets at far distances.

FIGURE 4. a₁ and b₁ are the original images; a₂ and b₂ are the saliency maps obtained by global entropy; a₃ and b₃ are the saliency maps obtained by local entropy.

SECTION IV.

Experimental Results and Analysis

In this section, we randomly select twelve infrared small target images from seven infrared image sequences to verify the effectiveness and robustness of the proposed algorithm. These sequences contain images of different sizes in various typical scenarios. Detailed information on the seven sequences is given in Table 1. Figure 5 shows the experimental results of the eight state-of-the-art contrast methods and the proposed method for infrared images in each scene. Table 2 and Figure 6 show the SCRg, BSF and ROC curve results of the eleven contrast methods. The eleven contrast algorithms are TDLMS [51], Top-hat [52], RPCA [19], SSS [29], WNNM-MC [22], TDGS [33], NWIE [35], CF [34], RLCM [32], DM filter [30] and IPI [20].

Based on the detection results in Figure 5, SSS achieves satisfactory detection results in relatively simple backgrounds but misses low-SNR targets, such as those in Image e and Image f. RLCM and CF fail to fully detect the multiple small targets in Images g, k and l, and produce many false alarms instead. TDGS does not detect the targets correctly in Images f, i and j because it fails to enhance low-SNR targets in complicated backgrounds with heavy noise. IPI fails to completely suppress the heavy cloud edges and highlighted buildings in Images b, c and g. DM filter obtains ideal detection results in Images a, d, i and k, where all the targets are correctly output while most background clutter is discarded. NWIE achieves relatively ideal detection results compared with the other contrast algorithms: complicated background clutter is almost entirely suppressed and most small targets are highlighted. However, many false alarms still emerge in the NWIE results for Image c and Image e, owing to the bright cloud and building. According to the 2D and 3D results in Figure 5, the proposed method can both eliminate the complicated background and enhance the target region of interest in all seven sequences.

TABLE 1 Details of the Seven Sequences

TABLE 2 SCRg and BSF of Different Methods

FIGURE 5. The representative images of the seven real image sequences and the corresponding processed results of different methods.

FIGURE 6. ROC curves of different methods.

To verify the detection performance of the proposed method more objectively, we adopt four kinds of evaluation metrics: the signal-to-clutter ratio gain (SCRg), the background suppression factor (BSF), the receiver operating characteristic (ROC) curve and the running time. The SCRg and BSF values for different methods are shown in Table 2. The highest and second-highest values of SCRg and BSF obtained by the different algorithms in each image are marked in red and blue, respectively.

$\begin{align*} SCR=&\frac{\vert \mu_{t}-\mu \vert}{\sigma},\tag{23}\\ SCRg=&20\lg\frac{SCR_{out}}{SCR_{in}},\tag{24}\\ BSF=&20\lg\frac{\sigma_{in}}{\sigma_{out}},\tag{25}\end{align*}$

where $\mu_{t}$ is the average intensity of the target area, and $\mu$ and $\sigma$ are the average and standard deviation of the entire image intensity; the subscripts $in$ and $out$ refer to the original image and the detection result, respectively. Both SCRg and BSF indicate the degree of accuracy in infrared small target detection. The larger the SCRg and BSF values, the better the performance of the related algorithm in background suppression and target extraction.

From Table 2, the SCRg and BSF values of the proposed method are higher than those of the other contrast algorithms in most cases. In a simple background such as Image a, the SCRg and BSF values of each contrast algorithm are relatively high. The SCRg value of IPI is slightly higher than that of the proposed algorithm, but its background suppression value is lower. The SCRg and BSF values of NWIE over the twelve infrared images are higher than those of the other contrast algorithms, but on the whole they are slightly inferior to those of the proposed algorithm. According to the values in Table 2, the background suppression ability and target extraction ability of the proposed algorithm are generally superior to those of the other contrast algorithms. In summary, the proposed method outperforms the other state-of-the-art contrast algorithms, especially on infrared images with thick clouds and high-brightness buildings.

The average time consumed by each algorithm is given in Table 3. It can be concluded from Table 3 that RLCM and WNNM consume the least time, followed by the proposed algorithm, TDGS and NWIE. Similar to the proposed method, DM filter and CF perform multi-scale calculations, but they take twice as long as the proposed method. IPI splits an infrared image into the patch-image model, which requires the longest running time, and the larger the image size, the longer the running time.

In addition, ROC curves are utilized to further evaluate the performance of the proposed method. The ROC curves reflect the relationship between the probability of detection and the false alarm rate. In an ROC plot, the larger the area between the curve and the x-axis, the higher the detection efficiency of the related algorithm. Figure 6 shows the ROC curves of different methods for the twelve infrared scenarios. In Image e, WNNM performs better than the other methods, but in the other images the proposed algorithm achieves the highest detection rate at the same false alarm rate. The comparisons derived from the ROC curves indicate that the proposed method is effective and robust in detecting infrared small targets against various complicated backgrounds.
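The SCR, SCRg and BSF metrics of Eqs. (23) to (25) can be computed along the following lines. This is a minimal sketch under stated assumptions: `lg` is read as the base-10 logarithm, the target area is supplied as a boolean mask (`target_mask` is a hypothetical input, since the text does not say how the target area is delineated), and the function names are illustrative only.

```python
import numpy as np

def scr(image, target_mask):
    """Signal-to-clutter ratio of Eq. (23).

    mu_t is the mean over the target area; mu and sigma are the mean and
    standard deviation of the entire image, as defined in the text.
    """
    mu_t = image[target_mask].mean()
    return abs(mu_t - image.mean()) / image.std()

def scr_gain_db(image_in, image_out, target_mask):
    """SCR gain of Eq. (24), in dB (base-10 logarithm assumed)."""
    ratio = scr(image_out, target_mask) / scr(image_in, target_mask)
    return 20.0 * np.log10(ratio)

def bsf_db(image_in, image_out):
    """Background suppression factor of Eq. (25), in dB."""
    return 20.0 * np.log10(image_in.std() / image_out.std())
```

Both ratios are undefined when the output image is perfectly constant (zero standard deviation), so a practical evaluation script would need to guard against that case; the sketch leaves it unhandled.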

TABLE 3 The Average Time Consumed by Each Algorithm

In addition, we conduct a large number of experiments to demonstrate the necessity of introducing the IALM method as the preprocessing step. In the following, we provide simple experimental results to illustrate this. Figure 7 shows the detection results of Image b, Image c and Image e when the execution order of the two algorithms is swapped. With the order swapped, the detection results show a high false alarm rate and poor location precision.

FIGURE 7. Experimental results before and after IALM and ISSS algorithm execution order changes.

In the above experiments, the ISSS algorithm not only enhances the target intensity but also sharpens the background edges, resulting in a large amount of residual background clutter. However, as the subsequent processing stage, the ISSS algorithm can enhance the target strength and further eliminate the residual background clutter. In conclusion, the IALM and ISSS algorithms are both irreplaceable steps of the proposed method, and their execution order cannot be exchanged.

SECTION V.

Conclusion

This paper proposes an improved SSS (ISSS) algorithm via a precise feature matching and scale selection strategy for efficient infrared small target detection. We have selected twelve special infrared images with complex background such as heavy and bright cloud, cloud and building, bright building with rich details, heavy noise and bright background, etc. as application scenarios, and the eleven state-of-the-art methods as contrast algorithms to evaluate the performance of the proposed algorithm. Extensive experiments illustrate that the proposed method outperforms the contrast algorithms not only in visual quality, but also in quantitative evaluation criteria such as SCRg, BSF scores, ROC curves, and running times, especially in scenarios with thick clouds and high-brightness buildings. Therefore, the proposed algorithm is an effective and robust method for infrared small target detection, and it has great potential in the field of infrared small target detection and tracking. In our future work, we will further explore the structural information of different multiscale saliency maps, such as the extraction of salient areas in each scale map by the fusion method.
