Introduction
With the continuous development of Unmanned Aerial Vehicle (UAV) technology, remote sensing images are widely used in military defense, disaster emergency response and ecological environment monitoring. UAV remote sensing has high scientific value for refining regional information, and it offers a high spatial resolution, a high acquisition frequency, and high cost performance [1]. High-resolution remote sensing image processing has emerged in response to these demands. The atmospheric environment has a great impact on high-resolution remote sensing, especially on imaging in the visible light range. Because of haze, images taken in outdoor scenes often suffer from poor visibility, reduced contrast, blurring and color shift [2]. Although haze image processing techniques have been developed along both the image enhancement and the image restoration lines, they remain insufficient. Dehazing is a computer vision technique applied to images taken in hazy weather. It can effectively mitigate the degradation of image quality caused by haze and provide better data for image processing, image analysis and image understanding.
At present, haze removal techniques fall mainly into three categories, each of which has achieved some results:
Image enhancement-based haze removal method: This method uses mature image processing algorithms to enhance haze images so that the features of the target in the image are highlighted. The disadvantage of this method is that highlighting the target features inevitably causes information loss in other parts of the image, which distorts the processed image.
Haze removal method based on a physical model [3]: This method restores an image based on the atmospheric scattering model or an improved version of it, solves the inverse process of image degradation mathematically, and finally restores a clear image [4]. Representative works along this line of research include [5]–[12]. Tan [5] proposed a local contrast maximization dehazing algorithm, which achieves dehazing by maximizing the local contrast, based on the observed contrast difference between clear images and haze images. Fattal [6], by analyzing the image reflectivity, assumed that the transmission and the surface shading are locally uncorrelated and realized dehazing on this basis. He et al. [7] introduced an algorithm based on the dark channel prior, which observes that in most local patches of a haze-free outdoor image, at least one color channel contains pixels of very low intensity. Berman et al. [8] proposed dividing the colors of an image into clusters in RGB space and restored the clear image using a prior based on the line that each color cluster forms in RGB space in the haze image. Jiang et al. [9], building on the dark channel prior algorithm, proposed a novel adaptive dual channel prior image dehazing method, which combines the dark channel prior and the bright channel prior. Ju et al. [10] proposed an improved atmospheric scattering model (IASM) in which the transmittance map is directly estimated by linear operations on the brightness, saturation and gradient, and the atmospheric light and scene incident light can be accurately estimated by combining sky-related features and the guidance model (GEM). Shu et al. [11] proposed a hybrid regularized variational framework to simultaneously estimate the depth map and the haze-free image. In particular, they introduced the second-order total generalized variation (TGV) regularizer to constrain the estimation of the depth map. Zhu et al. [12] proposed the color attenuation prior algorithm, which uses supervised learning for image dehazing. Liu et al. [13] proposed a unified second-order variational framework to refine the depth map and restore the haze-free image. The second-order framework can preserve important structures in both the depth map and the haze-free image. Shu et al. [14] proposed a hybrid variational model with promoted regularization terms to refine the transmission map and then used an alternating direction algorithm to obtain the final haze-free image.
Learning-based methods [15]: In recent years, with the rapid development of deep learning, methods that restore haze images using convolution neural networks have continuously emerged. Representative works along this line of research include [16]–[21]. Tang et al. [16] used random forest regression to learn the correlation between features and transmittance by collecting multiscale features of images, such as dark channels and the local maximum contrast. Cai et al. [17] proposed the DehazeNet dehazing network, obtained the weights needed by the network through training, and then estimated the transmission of haze images using the forward propagation of the network. Ren et al. [18] used a multiscale neural network to learn the mapping relationship between a haze image and the transmittance. The results showed that the algorithm performs well on both synthetic haze images and real images. Li et al. [19] proposed the AOD-Net network, which restores haze images by building an end-to-end model. The idea of this network is to reduce the accumulated error caused by estimating the parameters of the physical model separately. Ren et al. [20] proposed an algorithm that hinges on an end-to-end trainable neural network consisting of an encoder and a decoder. The encoder captures the context of the derived input images, while the decoder uses the representations learned by the encoder to estimate the contribution of each input to the final dehazed result. Liu et al. [21] proposed GridDehazeNet, which consists of three modules: preprocessing, the backbone, and postprocessing. The trainable preprocessing module can generate learned inputs with better diversity and more pertinent features. The backbone module implements a novel attention-based multiscale estimation on a grid network. The postprocessing module helps to reduce the artifacts in the final output.
In summary, learning-based methods have gradually become the mainstream image processing methods in recent years, and their effects are significant. This article improves the classical atmospheric scattering model for haze images using actual high-resolution remote sensing images from an unmanned aerial vehicle. Several parameters in the model are unified into an input-related variable K(x).
Dehazing Algorithm Based on an Atmospheric Scattering Model
Based on the causes of light scattering and the combined effect of haze on light scattering, as shown in Figure 1, Nayar and Narasimhan [22] and McCartney [23] attribute the degradation of captured images mainly to two causes.
(1) The reflected light of the target is absorbed and scattered by the suspended particles in the medium during the transmission process, which results in energy attenuation. This usually reduces the brightness of the image and the contrast of the image.
(2) Ambient light such as sunlight and other objects’ reflected light is affected by the particles in the medium to form stray light. Stray light is formed by light scattering, which usually makes the captured image blurry, resulting in the image color not being natural. The captured image consists of two parts: one is the reflected light of the attenuated target caused by atmospheric scattering and absorption, and the other is the atmospheric light caused by atmospheric scattering.
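The cause of haze formation in an image is represented using the atmospheric scattering model [23], [24], which can be written as follows:\begin{equation*} I(x)=J(x)t(x)+A(1-t(x))\tag{1}\end{equation*}
Here, I(x) is the observed haze image, J(x) is the haze-free scene radiance to be recovered, A is the global atmospheric light, and t(x) is the transmission, which is given by \begin{equation*} t(x)=e^{-\beta d(x)}\tag{2}\end{equation*}
where β is the scattering coefficient of the atmosphere and d(x) is the distance between the scene point and the camera. Rearranging formula (1) gives the haze-free image: \begin{equation*} J(x)=\frac {1}{t(x)}I(x)-A\frac {1}{t(x)}+A\tag{3}\end{equation*}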
Most of the existing algorithms follow the following steps to restore a hazy image into a clear image: (1) estimate the transmission matrix
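As an illustration of formulas (1)-(3), the following minimal sketch builds a haze image from a known transmission map and atmospheric light and then inverts the model. The function names, the lower bound `t_min` and the test values are assumptions introduced only for this example; they are not taken from the article.

```python
import numpy as np

def transmission(depth, beta):
    """Formula (2): t(x) = exp(-beta * d(x))."""
    return np.exp(-beta * depth)

def recover_scene(hazy, t, airlight, t_min=0.1):
    """Formula (3): J(x) = I(x)/t(x) - A/t(x) + A.

    t_min is an illustrative lower bound (an assumption) that keeps the
    division stable where the transmission is close to zero.
    """
    t = np.maximum(t, t_min)[..., np.newaxis]  # broadcast over color channels
    return hazy / t - airlight / t + airlight

# Synthetic round trip with known parameters (illustrative values only).
rng = np.random.default_rng(0)
clear = rng.random((4, 5, 3))                # J(x), values in [0, 1)
depth = rng.uniform(0.5, 2.0, size=(4, 5))   # d(x)
t = transmission(depth, beta=1.0)
airlight = 0.8                               # A
# Formula (1): I(x) = J(x) t(x) + A (1 - t(x))
hazy = clear * t[..., np.newaxis] + airlight * (1.0 - t[..., np.newaxis])
restored = recover_scene(hazy, t, airlight)
```

With exact t(x) and A the inversion returns the clear image. In practice both quantities have to be estimated, which is where the errors discussed in the next section come from.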
End-to-End Dehazing Algorithm Based on Deep Learning
A. Deficiency and Improvement of the Atmospheric Scattering Model
Observation shows that the degradation caused by haze in an image is actually uneven; it depends on the attenuation of the scene by the haze and on the physical distance to the camera. If a uniform atmospheric scattering model is used for dehazing, all the pixels in the image go through the same parameter-solving process even though the haze content differs across the image, which results in partial distortion or insufficient dehazing. This shows that the dehazing process needs to adapt to the input image, and the recovery model must be suited to it. As mentioned above, when the transmission matrix and the atmospheric light are estimated in two independent steps, the inverse process of the atmospheric scattering model cannot be simulated completely, which inevitably leads to parameter estimation errors. When the estimated parameters are then substituted into the atmospheric scattering model, the errors accumulate and may amplify each other. Therefore, algorithms that rely on priors or hypotheses have some limitations.
To avoid the error caused by the independent estimation of the parameters, they are unified into a single variable K(x), and formula (3) is rewritten as follows:\begin{equation*} J(x)=K(x)I(x)-K(x)+b\tag{4}\end{equation*}
Here, we also use the following:\begin{equation*} K(x)=\frac {\frac {1}{t(x)}(I(x)-A)+(A-b)}{I(x)-1}\tag{5}\end{equation*}
In formula (5),
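As a consistency check, substituting formula (5) into formula (4) should give back formula (3). The short derivation below uses only the symbols that appear in formulas (3)-(5), and it holds wherever I(x) is not equal to 1, so that K(x) is defined.

```latex
\begin{align*}
J(x) &= K(x)I(x)-K(x)+b = K(x)\bigl(I(x)-1\bigr)+b \\
     &= \frac{1}{t(x)}\bigl(I(x)-A\bigr)+(A-b)+b \\
     &= \frac{1}{t(x)}I(x)-A\frac{1}{t(x)}+A
\end{align*}
```

The bias b cancels, so the rewritten model is equivalent to formula (3) for any constant b, while t(x) and A appear only inside K(x).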
B. End to End Deep Learning Network Model Design
To allow the end-to-end model mentioned in the previous section to estimate variable
Through the analysis of high-resolution images, we find that the high-resolution images from UAV aerial photography are rich in detailed information. An image contains not only large homogeneous areas (such as farmland, water surfaces and land) but also a large number of detailed areas (such as residential areas, road traffic networks and textured ground), which places demands on the ability of deep learning to extract feature information. We should consider extracting not only low-frequency feature information but also high-frequency feature information. To solve this problem, this article proposes a multiscale convolution neural network based on deep learning. The multiscale neural network consists of two parts: a coarse-scale neural network for low-frequency information extraction and a fine-scale neural network for high-frequency information extraction. Through the use of different scale networks for the feature extraction of an input image, the mapping relationship between variable
Figure 3 shows the structure of the multiscale neural network established by the algorithm. First, the rough structure of the
The network structure of the algorithm of this article mainly includes three parts: convolution layers, pooling layers and upsampling layers.
Convolution layers: The convolution operation is used to extract the local features of the image to filter out the redundant components in the image and get the image features. In the coarse-scale network, it is mainly used to extract the low-frequency feature information of the image. The low-frequency feature of a haze image is mainly a large area or large areas with the same color in the image. Therefore, in the coarse-scale network, three large convolution kernels,
The convolution formula of each layer mentioned in this article is as follows:\begin{equation*} f_{n}^{l+1} =\sigma \left({\sum \limits _{m} {(w_{n,m}^{l+1} \ast f_{m}^{l})+b_{n}^{l+1}} }\right)\tag{6}\end{equation*}
\begin{equation*} y_{ji} =\begin{cases} {x_{ji}} & {if~x_{ji} \ge 0} \\ {a_{ji} x_{ji}} & {if~x_{ji} < 0} \\ \end{cases}\tag{7}\end{equation*}
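Formulas (6) and (7) can be illustrated with a small NumPy sketch of one convolution layer followed by the piecewise-linear activation. This is a simplified illustration under stated assumptions and not the network used in the article: zero padding that preserves the spatial size, a sliding-window sum without kernel flipping (as is usual in CNN frameworks), and a single slope `a = 0.25` for negative inputs are all choices made here for the example.

```python
import numpy as np

def prelu(x, a):
    """Formula (7): keep non-negative inputs, scale negative inputs by a."""
    return np.where(x >= 0, x, a * x)

def conv_layer(feature_maps, weights, biases, a=0.25):
    """Formula (6) for one layer, followed by the activation of formula (7).

    feature_maps: (M, H, W) input maps f_m^l
    weights:      (N, M, k, k) kernels w_{n,m}^{l+1}, with k odd
    biases:       (N,) biases b_n^{l+1}
    Returns the (N, H, W) output maps f_n^{l+1}.
    """
    n_out, _, k, _ = weights.shape
    pad = k // 2
    _, height, width = feature_maps.shape
    # Zero padding (an assumption) so the output keeps the input size.
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.empty((n_out, height, width))
    for n in range(n_out):
        for i in range(height):
            for j in range(width):
                window = padded[:, i:i + k, j:j + k]
                # Sum over all input maps m and kernel positions, plus bias.
                out[n, i, j] = np.sum(window * weights[n]) + biases[n]
    return prelu(out, a)

# One 3x3 averaging kernel applied to a single constant feature map.
maps = np.ones((1, 4, 4))
kernel = np.full((1, 1, 3, 3), 1.0 / 9.0)
response = conv_layer(maps, kernel, np.zeros(1))
```

Interior pixels of `response` equal 1, while border pixels are smaller because the zero padding contributes to the average.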
Pooling layers: Since the structure of a high-resolution remote sensing image is complex and the content of an image has a high number of pixels, the image contains much feature information after the convolution layers. To further reduce the dimension of the feature image and retain the features, a
Upsampling layers: After passing through a large pooling layer, the size of an image feature map is reduced. To ensure that the size of the feature map and the input image is the same, upsampling layers are added after passing through the maximum pooling layers of the upper layer [18].
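The interplay between the pooling and upsampling layers can be sketched as follows. The 2 × 2 window and the nearest-neighbor interpolation are assumptions made for this example only; the sketch simply shows how max pooling shrinks a feature map and how upsampling restores its original size.

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = feature_map.shape
    h_crop, w_crop = h - h % size, w - w % size  # drop incomplete windows
    blocks = feature_map[:h_crop, :w_crop].reshape(
        h_crop // size, size, w_crop // size, size)
    return blocks.max(axis=(1, 3))

def upsample_nearest(feature_map, size=2):
    """Nearest-neighbor upsampling: repeat each value size x size times."""
    return np.repeat(np.repeat(feature_map, size, axis=0), size, axis=1)

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool(feature_map)        # shape (2, 2)
restored = upsample_nearest(pooled)   # shape (4, 4), same as the input
```

Upsampling restores the spatial size but not the values discarded by pooling, which is one reason a fine-scale branch is useful for high-frequency detail.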
C. Algorithm Steps
The end-to-end UAV high-resolution remote sensing image dehazing algorithm based on deep learning proposed in this article is shown in Figure 5. The specific steps are as follows:
(1) The data set is used to train the deep learning neural network, and the mapping relationship between the feature map variable K(x) and the input image is obtained;
(2) The haze image is input into the network model, and the corresponding feature map variable K(x) is obtained; and
(3) The original haze image and the feature map variable K(x) are substituted into the improved atmospheric scattering model, and the haze-free image is obtained.
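The last two steps can be sketched in a few lines, assuming that some estimator of K(x) is already available. In the sketch, `estimate_k` is a hypothetical callable standing in for the trained network, `oracle_k` is a stand-in that evaluates formula (5) with known parameters, and `b = 1` is an illustrative value; none of these names or values come from the article. The training step itself is not shown.

```python
import numpy as np

def dehaze(hazy, estimate_k, b=1.0):
    """Apply formula (4): J(x) = K(x) I(x) - K(x) + b."""
    k = estimate_k(hazy)                # step 2: obtain K(x) for the input
    restored = k * hazy - k + b         # step 3: improved scattering model
    return np.clip(restored, 0.0, 1.0)  # keep a valid intensity range

def oracle_k(hazy, t=0.5, airlight=0.8, b=1.0):
    """Stand-in for the network: formula (5) with known t and A.

    Only valid where hazy != 1, because formula (5) divides by I(x) - 1.
    """
    return ((hazy - airlight) / t + (airlight - b)) / (hazy - 1.0)

rng = np.random.default_rng(1)
clear = rng.random((4, 4, 3))
hazy = clear * 0.5 + 0.8 * (1.0 - 0.5)  # formula (1) with t = 0.5, A = 0.8
output = dehaze(hazy, oracle_k)
```

Because the stand-in uses the true parameters, `output` matches `clear`; with a trained network, the quality of the result depends entirely on how well K(x) is estimated.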
End to end UAV high-resolution remote sensing image dehazing algorithm based on convolution neural network.
Experimental Results and Analysis
A. Data Set Design and Training Implementation
The diversity of the data sets is an important factor in the results of network training. At present, there is no unified standard data set. An analysis of the target haze pictures shows that the biggest difference between the high-resolution pictures taken by a UAV and the training data sets used by existing algorithms is that the UAV pictures contain no sky area, which constrains the selection of data sets. As shown in Figure 6 (a), we first select 27256 indoor images from the NYU2 depth database [25] as the main training set of this network; the NYU2 depth database includes indoor haze-free images and their synthesized haze images, which allows us to train the network better. In addition, as shown in Figure 6 (b), 1100 high-resolution haze images taken by a UAV over the Panjin red beach wetland of Liaoning Province in 2018 are also used, and 1000 of them are selected as the original self-built training set of this algorithm. Since all of these pictures are original pictures with haze, the training flow shown in Figure 7 is used. First, we use the NYU2 depth database to conduct preliminary training on the network, then we apply the dehazing process to all the images in the original self-built training set, and finally we synthesize data sets with different concentrations of haze through formula (1). In this article, the atmospheric light value A is set between [0.6, 1.0], and
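The synthesis step through formula (1) can be sketched as below. The atmospheric light is drawn from [0.6, 1.0] as stated in the text; the scattering-coefficient range `beta_range`, the availability of a per-pixel depth map and the function name are assumptions made only for this illustration.

```python
import numpy as np

def synthesize_hazy(clear, depth, rng, beta_range=(0.5, 1.5)):
    """Create one synthetic haze image from a clear image via formula (1).

    clear: (H, W, 3) image with values in [0, 1]
    depth: (H, W) depth map (assumed to be available)
    beta_range is a hypothetical range for the scattering coefficient.
    """
    airlight = rng.uniform(0.6, 1.0)           # A drawn from [0.6, 1.0]
    beta = rng.uniform(*beta_range)            # haze concentration
    t = np.exp(-beta * depth)[..., np.newaxis] # formula (2)
    hazy = clear * t + airlight * (1.0 - t)    # formula (1)
    return hazy, airlight, beta

rng = np.random.default_rng(0)
clear = rng.random((4, 4, 3))
depth = rng.uniform(1.0, 3.0, size=(4, 4))
hazy, airlight, beta = synthesize_hazy(clear, depth, rng)
```

Calling the function repeatedly on the same clear image yields haze images of different concentrations, since A and β are resampled on each call.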
During training, the parameters are initialized using Gaussian random variables. The momentum of the back propagation of the neural network is set to 0.9 and the weight decay coefficient is set to 0.0001. The initial learning rate is set to 0.001. After training using the NYU2 depth database, the learning rate is reduced by half so that the network can be trained more effectively using the subsequent self-built training set. In this article, the mean squared error (MSE) loss function is taken as the cost function. Then, the goal of network optimization is as follows:\begin{equation*} L(\theta)=\min \limits _{\theta } \left({\frac {1}{n}\sum \nolimits _{i=1}^{n} {\left \|{ {t-k(x_{i};\theta)} }\right \|}^{2}}\right)\tag{8}\end{equation*}
Here,
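The loss of formula (8) and the stated hyperparameters can be illustrated with the following sketch. It assumes that the optimizer is stochastic gradient descent with momentum and weight decay, which the text suggests but does not state explicitly; the function names are hypothetical.

```python
import numpy as np

def mse_loss(predicted, target):
    """Formula (8): mean over samples of the squared L2 norm of the error."""
    diff = (target - predicted).reshape(len(target), -1)
    return np.mean(np.sum(diff ** 2, axis=1))

def sgd_momentum_step(params, grads, velocity,
                      lr=0.001, momentum=0.9, weight_decay=0.0001):
    """One update of SGD with momentum and weight decay (assumed optimizer)."""
    grads = grads + weight_decay * params        # L2 weight decay term
    velocity = momentum * velocity - lr * grads  # momentum accumulation
    return params + velocity, velocity

# Stage 1 (NYU2) uses lr = 0.001; stage 2 (self-built set) halves it.
lr_stage1 = 0.001
lr_stage2 = lr_stage1 / 2

params = np.zeros(1)
velocity = np.zeros(1)
params, velocity = sgd_momentum_step(params, np.full(1, 2.0), velocity,
                                     lr=lr_stage1)
```

The learning rate is passed explicitly so that the same update rule can be reused for both training stages.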
B. Experimental Comparison
In this section, we will compare this algorithm with the most classical and advanced algorithms in terms of its visual effect and objective indicators, and the algorithms that need to be trained all use the databases used in this article. The algorithms involved are as follows:
Dark Channel Prior (DCP) - He et al. [7], prior-based haze removal,
Multiscale Convolutional Neural Network (MSCNN) - Ren et al. [18], multiscale convolutional neural network for haze removal,
DehazeNet - Cai et al. [17], dehazing based on a CNN,
All-in-One Dehazing Network (AOD-Net) - Li et al. [19], single-image end-to-end CNN dehazing,
Densely Connected Pyramid Dehazing Network (DCPDN) - Zhang and Patel [29], densely connected pyramid dehazing,
Gated Fusion Network (GFN) - Ren et al. [20], gated fusion network for single-image dehazing, and
GridDehazeNet - Liu et al. [21], attention-based multiscale network for image dehazing.
The algorithm in this article is implemented in the PyTorch deep learning framework, and the training and testing of the network are completed in the PyTorch environment. The hardware environment is an Intel Core i7-8750H CPU and an NVIDIA GeForce GTX 1050Ti graphics card.
1) Comparison of the Visual Effects on Synthetic Datasets
This article first conducts testing using the synthetic dataset and compares the results with those of the seven algorithms mentioned above. The DCP is a prior-based method, while the other methods are based on data training. For a fair comparison, the data-driven methods are trained on the same training data as used in this article. Both the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used to quantitatively evaluate the dehazed images; they are discussed in the section comparing the objective indicators. From the image shown in Figure 8 (b), it can be seen that the result of the DCP is darker than the real image and the other dehazed images, and the DCP can cause severe color distortion. For the MSCNN and DehazeNet in Figures 8 (c)-(d), respectively, the visible haze is still not effectively removed, and the output image appears whiter due to the residual haze. The AOD-Net and DCPDN in Figures 8 (e)-(f), respectively, overcome the color distortion to a large extent. The output image color is closer to the real object, but these algorithms can easily generate halos and artifacts around objects, and the blurred parts in the original image are not significantly removed. The GFN and GridDehazeNet are haze removal algorithms proposed in the last two years. Their advantages are that they suppress artifacts and halos to a certain extent, are visually closer to haze-free images and have better haze removal effects for scenes close to the lens. However, haze can still be clearly seen in scenes with greater depth, that is, scenes far from the lens. Compared with the existing methods, the method proposed in this article largely compensates for the uneven haze removal between distant and near-range scenes. 
This is because the convolution neural network combined with the coarse and fine scales selected in this article can learn the haze images of different scales separately. The coarse scale network is responsible for the lower-resolution part of the image, and the fine-scale network is responsible for the higher-resolution part of the image. During the recovery process, the images with different depths of field are more specifically dehazed, which shows better robustness and a better visual effect.
2) Qualitative Comparisons on Real-World Images
To further evaluate the proposed method and to compare our algorithm with the other algorithms on UAV remote sensing image dehazing, the UAV remote sensing images of the Panjin Red Beach dataset in Liaoning Province were used as the test images. Figure 9 shows a comparison of the algorithms. As on the synthetic datasets, the DCP can cause color distortion, which is particularly evident in the first row of images in Figure 9 (b). This is mainly because the DCP dehazing algorithm relies too heavily on prior knowledge and ignores the information of the image itself, which greatly reduces the dehazing effect. A similar problem in the MSCNN, DehazeNet and DCPDN is that the image after haze removal still has visible haze, which is more evident in the MSCNN, especially in the first row of Figure 9 (c); furthermore, the MSCNN also has color distortion. As we can see from the last three rows of the image in Figure 9 (c), the image after haze removal is significantly different from the original image. The result of the DCPDN algorithm is shown in Figure 9 (f). The white area is overexposed, which makes the color of the building area brighter. The AOD-Net method has the disadvantage of overenhancement, that is, it removes haze well and keeps the details clear, but the whole image becomes dark. The image in Figure 9 (e) is visually much darker than those of the other algorithms. The GFN and GridDehazeNet algorithms show prominent visual effects after haze removal, which conform to people’s intuitive perception of haze-free images. However, the most prominent feature of high-resolution remote sensing images from an unmanned aerial vehicle is their large resolution, especially in farmland and beach areas. Such an image has rich textural details, more edge information, and large color changes within a single image. 
After enlarging the images dehazed by the GFN and GridDehazeNet algorithms, it can be seen that many textural details and areas are blurred and the color is darker. As a result, the dark areas in the original image become much darker, such as in the fourth-row images in Figures 9 (g)-(h), which appear black after dehazing. In comparison, with our algorithm, the textural details of the red area on the right half can be clearly seen in the fourth row of Figure 9 (i). Compared with the above methods, our method has a clear overall haze removal effect, retains the maximum amount of scene detail, and is more effective at haze removal.
To further evaluate the proposed method, we also use part of the real image data sets from Ren et al. [18] and Fattal [30] to compare the seven algorithms mentioned above with the algorithm in this article. Figure 10 (a) shows real images with haze. As with the Red Beach dataset, the DCP algorithm exhibits varying degrees of color distortion, which is evident in the third row of Figure 10 (b). The MSCNN, DehazeNet and DCPDN have the same problem as with the Red Beach dataset, that is, there is still some haze in the images. The AOD-Net algorithm produces darker images than the other algorithms, but it generally has a good haze removal effect. The GFN and GridDehazeNet algorithms differ little in their haze removal effects, but there is some residual haze at the edges of the images. By contrast, our method removes haze clearly. The image details are enhanced appropriately after haze removal to maximize the restoration of the scene’s textural details.
3) Objective Index Comparison
Table 2 shows the average PSNR and SSIM values of dehazed results on the synthetic dataset. It can be seen that the method proposed in this article is much better than the comparison algorithms listed in the paper.
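For reference, the PSNR used for the synthetic dataset can be computed as in the sketch below, which follows the standard definition based on the mean squared error. Images are assumed to be scaled to [0, 1]; for 8-bit images, `max_value=255` would be used instead. SSIM involves local statistics and is not reproduced here.

```python
import numpy as np

def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = reference.astype(np.float64)
    restored = restored.astype(np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

reference = np.zeros((2, 2))
restored = np.full((2, 2), 0.1)  # constant error of 0.1 gives MSE = 0.01
value = psnr(reference, restored)
```

A higher PSNR indicates that the dehazed image is closer to the haze-free reference.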
The pictures taken by the UAV are all haze images without corresponding haze-free images. The SSIM and PSNR are commonly used to evaluate images that have original clear images. Therefore, in order to compare the dehazing effect of the algorithms more objectively, the MG (mean gradient), ES (edge strength), IE (information entropy) and VAR (variance) are selected in this article and used to analyze the images quantitatively. Table 3 shows the comparison of the objective indicators. The mean gradient of an image refers to the rate of change of its gray level, which reflects the contrast of the small details of the image and can be used to express its sharpness. The larger the value is, the richer the details are and the clearer the texture. The edge strength represents the magnitude of the gradient at the edge points of an image, which is similar to the mean gradient. The larger the value is, the more clearly the details are expressed. The image information entropy represents the total amount of information in an image. The larger the value is, the greater the amount of image information, the better the image quality, and the richer the texture. The variance is used to reflect the image color and contrast. The larger the value is, the more prominent the color performance of the image. Using these four indexes, the 7 UAV images above are analyzed to obtain the average values. Table 3 shows the indexes of each algorithm after dehazing. It can be seen that the algorithm in this article improves the mean gradient, edge strength, information entropy and variance, which shows that the images processed by the algorithm in this article have higher definition, richer detail information, and higher color saturation; therefore, a good dehazing effect is obtained.
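Three of the four no-reference indexes can be sketched as follows. The exact definitions used in the article are not given, so the formulas below are commonly used variants and should be read as assumptions: the mean gradient averages the root mean square of horizontal and vertical differences, the information entropy is the Shannon entropy of the 8-bit gray-level histogram, and the variance is the plain pixel variance. Edge strength is omitted because its definition varies more between works.

```python
import numpy as np

def mean_gradient(gray):
    """Mean of sqrt((dx^2 + dy^2) / 2) over the image (assumed definition)."""
    gray = gray.astype(np.float64)
    dx = gray[:-1, 1:] - gray[:-1, :-1]  # horizontal differences
    dy = gray[1:, :-1] - gray[:-1, :-1]  # vertical differences
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

def information_entropy(gray, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram."""
    hist = np.bincount(gray.astype(np.int64).ravel(), minlength=levels)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def variance(gray):
    """Variance of the pixel intensities."""
    return float(np.var(gray.astype(np.float64)))

flat = np.full((4, 4), 128)            # no detail at all
stripes = np.tile([0, 255], (4, 2))    # alternating black and white columns
```

On the flat image all three indexes are zero, while the striped image has an entropy of one bit and a large gradient and variance, which matches the intuition that larger values indicate more detail and contrast.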
In addition, as shown in Table 4, this article also measures the code running time for the 7 test images mentioned above and takes the weighted average value as the average time of a single image dehazing process. It can be seen that the algorithm in this article has both a good processing effect and shorter running time than the other algorithms, ensuring the image dehazing efficiency.
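A per-image running time of this kind can be measured with a sketch like the one below. It uses a plain arithmetic mean, since the weights behind the weighted average are not specified in the text; `dehaze_fn` is a hypothetical callable standing in for any of the compared algorithms.

```python
import time

def average_runtime(dehaze_fn, images):
    """Average wall-clock time in seconds that dehaze_fn needs per image."""
    elapsed = []
    for image in images:
        start = time.perf_counter()
        dehaze_fn(image)  # result is discarded, only the time is kept
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# Example with a trivial stand-in for a dehazing function.
seconds = average_runtime(lambda image: image, [1, 2, 3])
```

For GPU-based methods, the device would also need to be synchronized before each timestamp for the measurement to be meaningful.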
Analysis and Discussions
A. Effectiveness of the Multiscale Neural Network
In this section, we analyze how the multiscale neural network proposed in this article can make the image dehazing better. The combination of the coarse-scale network and fine-scale network can effectively improve the image dehazing effect, and the output of the coarse-scale network can provide additional information for the fine-scale network. Figures 10 (b) and (c) show the images containing only the coarse-scale network and the multiscale network
Effectiveness of the multiscale neural network. (a) is the original image, (b) is the
B. Effectiveness of the Unified Variable K(x)
To illustrate the effectiveness of the
Effectiveness of the unified variable K(x). (a) is the original image, (b) is the dehazed image of the MSCNN algorithm, and (c) is the dehazed image of our algorithm.
Conclusion
This article proposes an end-to-end dehazing technique based on deep learning, inspired by the extensive application of deep learning in the field of image processing. Using the multiscale convolution neural network, it can remove haze more effectively, and the introduction of residual learning can reduce the computational load and speed up learning. This article analyzes the differences between the existing dehazing algorithms and the proposed algorithm by comparing their visual effects and objective indexes. The experiments show that the proposed algorithm performs well in all aspects and can meet the requirements of UAV high-resolution remote sensing image dehazing. However, for images with an uneven haze distribution, the processing speed and processing effect can be further improved. At present, the requirements for real-time processing of UAV images are steadily increasing. Our future work will focus on improving the real-time performance of UAV image dehazing by accelerating the algorithm and improving its processing efficiency.