Offshore Petroleum Leaking Source Detection Method From Remote Sensing Data via Deep Reinforcement Learning With Knowledge Transfer | IEEE Journals & Magazine | IEEE Xplore

Offshore Petroleum Leaking Source Detection Method From Remote Sensing Data via Deep Reinforcement Learning With Knowledge Transfer


Abstract:

A marine oil spill is an environmental pollution incident that generally spreads quickly, affects a wide area, and lasts a long time. It seriously threatens the marine ecological environment and related industries. It is vital to determine the source of the oil leakage so that it can be stopped and the related hazards reduced. Oil spill accidents at sea generally occur in offshore areas and navigation channels. With the rapid development of remote-sensing techniques, oil leak extraction using remote-sensing data has played an essential role in oil spill research. This article proposes a Monte Carlo-based deep Q-transfer-learning network (DQTN) offshore oil leak detection method that uses remote-sensing data. Remote-sensing data are utilized to continuously monitor a marine oil spill on the surface. The estuarine and coastal ocean model is utilized to simulate a marine oil spill event. The deep-Q-network method with offline transferred knowledge is then utilized to determine the marine oil spill source location. In an experiment based on the Bohai oil spill incident of June 2, 2011, the effectiveness of the remote-sensing-based DQTN marine oil spill search algorithm is verified. The accuracy of the targeted oil spill point is up to 98.97%.
Page(s): 5826 - 5840
Date of Publication: 18 July 2022



SECTION I.

Introduction

A marine oil leak, which is generally caused by a severe shipwreck or a drilling incident at an oil production platform, is highly destructive to the ecological environment. Once a leak occurs, petroleum rapidly diffuses and forms a slick on the sea surface. The oil film not only evaporates, releasing toxic chemicals such as hydrocarbons into the air, but also penetrates into marine organisms [1]. The natural evaporation and biodegradation of the leaked oil are lengthy processes that are sensitive to the ambient temperature, offshore terrain, etc. Therefore, detecting the leak source quickly to reduce the amount of leaked oil, rather than relying solely on cleanup measures after a larger volume has contaminated the ocean, is a significant issue in marine oil leak hazard research. Remote-sensing techniques, which offer high-resolution, large-scale observations at steady revisit intervals, can effectively address oil leak hazards. They can observe a widely affected region in a short time and facilitate detection of the oil leak source [2].

An oil leak accident and the corresponding detection process are shown in Fig. 1. Crude oil transmission pipelines or reservoirs can become damaged during the exploitation or transmission of offshore oil, which can cause crude oil to leak on the ocean floor. After leaking, petroleum particles gradually rise, float, and accumulate. Oil particles can diffuse and drift under the force of waves and winds in a short time. When the crude oil emerges as a vast oil film on the sea surface, remote sensing shows its advantage in wide-area monitoring. Remote sensing continuously monitors the offshore oil spill area and captures the real-time oil film status. A large amount of remote-sensing data is then transmitted from satellites to the data center for storage, and these data can facilitate leak source detection.

Fig. 1. Marine oil leak detection digital twin system framework.

With the rapid development of remote-sensing techniques, oil leak extraction using remote-sensing data has played an important role in oil spill research. Synthetic aperture radar (SAR) is widely utilized in remote sensing to collect data for oil detection. Singha et al. [3] proposed two artificial neural networks for the oil spill detection problem by image segmentation, feature extraction, and classification on SAR data. David et al. [4] utilized an automatic oil spill detection system that adaptively selected artificial neural networks or decision trees to classify SAR data. With the contributions of these studies, the accuracy of oil film detection in remote-sensing data has improved significantly. Under these circumstances, applying this promising technology to timely decision-making support is urgent [5].

Over the last decade, numerical simulation models have provided general and practical means of reproducing the oil leak process [6]. For oil spill trajectory forecasting, arrival time estimation, and state assessment at specific areas of interest, the Eulerian method and the Lagrangian method are the two mainstream approaches. The Eulerian method uses the mass and momentum conservation equations or diffusion equations to simulate the particles as a continuum phase. Sarhadi Zadeh et al. [7] proposed a two-dimensional hydrodynamic Eulerian model based on the Reynolds-averaged Navier–Stokes (RANS) equations. Ivorra et al. [8] designed second-order advection schemes to reduce numerical diffusion in the Eulerian model. The Lagrangian method, in contrast, simulates the pathway of each particle as a discrete phase. Wang et al. [9] proposed a Lagrangian discrete particle algorithm based on the Princeton ocean model (POM) for oil spill transport simulation. Zelenke et al. [10] proposed the general NOAA operational modeling environment (GNOME), which applied a Lagrangian transport or trajectory model, to mitigate or avoid future damage to valuable natural resources caused by marine pollution. High temporal and spatial resolution marine data, including wind field, salinity, temperature, etc., are fed into these state-of-the-art approaches to simulate oil particle movement in the marine leak region. The simulation accuracy relies to a great extent on precise input data. However, before marine remote-sensing techniques were adopted, marine data were generally acquired by meteorological stations and tidal stations [9], [11]. The number and location of stations limit the resolution of data. 
Though different spatial and temporal interpolation methods have been devoted to exploring the patterns in historical data [12], [13], traditional numerical models remain fragile under these conditions. Building a credible simulation model from limited historical data has become a research focus for numerical simulation. Furthermore, numerical simulation methods incur a large computational overhead and generally take hours to days to complete a simulation task [9], [14]. Timely decision-making feedback on the hazard therefore requires balancing efficiency and accuracy when the numerical simulation method is used for oil leak source detection. A spill source detection method with high efficiency and high accuracy is urgently needed for oil spill accident pollution prevention.

In order to address the challenges mentioned above, an oil leak digital twin framework is designed to reproduce the practical accident. The digital twin is the seamless connector that integrates the physical and cyber spaces [15]. A variety of data is gathered and exchanged, serving as the pipeline that enables the virtual twin to characterize the physical phenomenon. The virtual twin reflects the properties of the physical accident and, with the help of the optimization method, provides decision-making suggestions for the accident. Under the oil leak digital twin framework, a Monte Carlo-based deep Q-transfer-learning network (DQTN) approach for offshore oil leak detection (OLD) is proposed in this article. In this approach, the remote-sensing technique is utilized for oil leak region segmentation on advanced synthetic aperture radar (ASAR) data. The lookup table technique is designed as an oil leak accident dictionary. By querying the lookup table, the source detection range is narrowed to a small region. Then, a Monte Carlo-based estuarine and coastal ocean model (ECOM) oil simulation is developed to replicate the oil spill process. After that, the DQTN guides the detection of the oil spill source location. The DQN is sufficiently trained in designed maze cases. The learned knowledge is inductively transferred to tackle the oil leak source detection task. The uncertainty in the simulation is handled by the Monte Carlo method. The efficiency and accuracy are guaranteed by the lookup table and the DQTN. The contributions of the article are summarized as follows.

  1. A DQTN OLD digital twin approach is proposed to locate an oil leak source effectively. Its high efficiency can facilitate understanding of oil leak-induced marine pollution.

  2. The Monte Carlo technique is utilized to address the uncertainty in the ECOM parameters. The limited historical numerical model inputs are enriched by the statistical method.

  3. A maze case is designed to pretrain the DQN, and the trained knowledge is transferred to the oil leak detection task, which significantly increases the efficiency of the proposed method.

  4. The proposed method was verified on a real oil leak case from the Bohai Sea, China. The average accuracy was up to 97.54%, and the lookup table-based DQTN method took only 184.5 min to detect the oil leak source.

The rest of this article is organized as follows. Section II presents the related works about numerical simulation. Section III presents the proposed Monte Carlo-based oil leak source detection method in detail. In Section IV, the experiments conducted to evaluate the effectiveness of the proposed method are described. Finally, Section V concludes this article.

SECTION II.

Related Work

Studies on marine oil leak events focus on oil film segmentation and oil diffusion simulation. In the following, the state of the art in the research and application of these two techniques is introduced. Section II-A addresses segmentation and extraction in remote sensing. Section II-B covers numerical simulation methods. In Section II-B, the numerical simulation model utilized in the proposed method is elaborated in detail.

A. Segmentation and Extraction in Remote Sensing

A marine petroleum leak accident is a destructive event that affects a wide area and occurs at an unpredictable location. When an accidental petroleum leak happens, timely emergency pollution prevention measures are required to avoid or reduce the damage to the marine environment. Accurate oil leak monitoring can provide valuable information for leak source detection. Traditional oil leak detection methods are primarily based on manual monitoring. However, under certain circumstances, the oil leak region is difficult to observe because of the wide extent of the leaked oil and the great depth of the underwater leak source. This makes manual monitoring ineffective. With the help of the remote-sensing technique, these obstacles to monitoring can be gradually overcome. The long-term, large-scale continuous monitoring of specific areas by remote-sensing satellites can efficiently obtain information about oil leak accidents. This information includes the oil spill area, the oil type, and the oil film thickness. Guo et al. [16] analyzed the correlation between the scattering section and the related parameters of an oil spill through an electromagnetic-scattering numerical model. They also built a model of oil spills to improve the accuracy of oil spill monitoring. Some scholars have established marine oil spill extraction methods that combine multiple technologies. Shi et al. [17] proposed a method of integrating satellite remote sensing, aviation remote sensing, shipborne sensors, and other auxiliary equipment to monitor the extent of pollution and identify oil contaminants in a large-scale emergency marine oil spill extraction system. Zhang et al. [18] constructed an oil spill feature database by studying the kinds of oil spills, their shape and scattering characteristics, and the textures of SAR images. They constructed an oil spill detection approach that combined drone SAR and UV sensors to improve the efficiency of detection and extraction. 
From the perspective of marine oil spill spectral characteristics, Su et al. [19] made an observation that the spectral gap between the oil spill film and the seawater is greater than the variance of the seawater by mining the relationship between the optical remote-sensing satellite band and the spectral characteristics of the marine surface oil spill film. They extracted sea surface oil spill film based on this conclusion. Su et al. [20] used an SVM to extract features from optical remote-sensing images and established a spectral pattern sea surface oil spill monitoring model to find the position of the oil spill and detect the oil spill film. Sun et al. [21] considered the confusing phenomenon of oil film and seawater and used the spectral angle-matching algorithm to detect the oil film on the sea. By increasing the number of texture features, the accuracy of oil spill recognition can be significantly improved. Zou et al. [22] defined the confidence level of the oil spill remote-sensing information extraction based on the extraction indices of the marine oil spill incidents and the segmented maps for further spill identification.

State-of-the-art methods are dedicated to precisely extracting the oil film from remote-sensing data. The oil film information can support chemical petroleum degradation measures at sea. However, such measures face the challenge of petroleum continually pouring into the seawater. Remote-sensing satellites have long revisit times and high monitoring costs. Oil film extraction based on a small amount of remote-sensing data cannot reconstruct the whole leak process and, consequently, cannot provide comprehensive decision-making support. Such an approach wastes enormous human and material resources. Detecting the leak source is more urgent because it decreases the total volume of leaked oil. Without the coordinates of the underwater accident position, it is difficult to curb the oil film diffusion at the source. Numerical methods are designed for oil leak accident simulation. Based on a series of kinetic equations, the numerical methods attempt to reconstruct the accident with the help of the oil film in the remote-sensing data.

B. Numerical Simulation Methods

Effectively determining the oil spill source can help postdisaster pollution prevention. Remote-sensing data are needed to collaborate with simulations and algorithms in order to determine a marine oil spill source [23]. A classical ocean numerical simulation model can accurately simulate the spread of oil spills on the sea. The accurate and efficient extraction of oil after obtaining marine oil spill data is a vital issue. In order to achieve better accuracy and efficiency in monitoring marine oil spills, existing studies propose continuously monitoring and detecting the diffusion of oil spills by combining numerical models with remote-sensing data. Liu et al. [24] proposed a method for predicting oil slick trajectory by combining oil slick satellite data with the Lagrange orbit model. Zodiatis et al. [25] proposed a MEDSLIK oil spill model to predict the diffusion of oil by analyzing remote-sensing images [26]. Later, the MEDSLIK oil spill model was improved in the MEDSLIK-II Lagrangian model, which combined SAR and optical image data to simulate the oil slick diffusion and transformation processes [27]. Based on remote-sensing data for marine oil spill monitoring and driven by dynamic remote-sensing data, Yan et al. [28] used a back propagation neural network to find oil spills. This method has certain limitations; the identification of an oil spill's location depends on experience or related accurate news reports in the initial simulation of the source. Furthermore, this method cannot adjust and model an actual oil spill for extended time intervals. Chen et al. [29] propose an oil leak detection algorithm based on cross-entropy, which combined remote-sensing data with an ocean oil spill model and ECOM.

Compared to other oil leak numerical simulation models, the ECOM is comprehensive and efficient. The ECOM considers diverse influence factors for oil particles to build dynamical equations that guarantee the simulation's accuracy. In addition, the ECOM is not demanding regarding the input data: even without high spatial and temporal resolution in the data, the model can still effectively simulate an oil spill. The ECOM is also a relatively efficient tool compared to other widely utilized numerical models, such as FVCOM. Thus, in this research, the ECOM was chosen to simulate the oil spill process.

The ECOM is a relatively mature 3-D hydrodynamic model suitable for shallow seas developed from the marine hydrodynamic POM [30], [31]. It has requirements for initial conditions, open boundaries, and grid settings. Its application involves complete thermodynamic equations. This article uses the 3-D oil-spreading module in the ECOM to simulate the trajectory of oil spills, including various processes, such as diffusion, retention, evaporation, and emulsification. The relevant parameters of the model design include oil type, oil density, overflow location, release depth, number of oil particles, tidal composition, wind field data, etc.

In the ECOM, the oil film is composed of a large number of oil particles. The oil particles leak into seawater at a specific rate at the point of incidence of the oil spill and then spread by advection. If \vec{V_{t}} represents the movement speed of the oil drop in time t, then \begin{equation*} \vec{V_{t}}= \vec{V_{a}} + \vec{V_{s}} \tag{1} \end{equation*}

where \vec{V_{a}} represents the advection of the particles, which is their drift speed, and \vec{V_{s}} represents the spreading speed of the oil particles. Under the effects of advection and of spreading driven by wind, currents, tidal waves, and molecular movement, an oil film area can change rapidly on the ocean surface. Although the wind, currents, and tidal waves all influence the drift speed of the oil particles, the impact of each factor can differ significantly depending on the simulation terrain. For different situations, the input parameters of the ECOM can be customized according to the target region.

At each time interval \Delta t, the oil particle displacement \Delta S is calculated by integrating \vec{V_{t}} along time t. To guarantee the accuracy of the ECOM oil leak simulation, a subtime interval \tau t_{k} is utilized to calculate the displacement \Delta S of the oil particles. The displacement \Delta S of oil particles in the time range of \Delta t is as follows: \begin{align*} \Delta S &= \sum V_{t,k}\tau t_{k} \\ \sum \tau t_{k} &= \Delta t \tag{2} \end{align*}

where V_{t,k} is the speed vector of the oil particles in the interval of \tau t_{k}. \tau t_{k} must satisfy the following: \begin{equation*} \tau t_{k} \leq \left[ \frac{u_{k}}{\Delta x} + \frac{v_{k}}{\Delta y}\right]^{-1} \tag{3} \end{equation*}
where u_{k} and v_{k} represent the velocity components in the X and Y directions of the moving speed of the oil particles.
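
The sub-stepping in (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ECOM implementation: \Delta x and \Delta y are taken to be grid spacings, absolute values are applied to the velocity components so that the bound stays positive, and `velocity_at` is a hypothetical callback that returns the combined velocity of (1).

```python
def max_substep(u, v, dx, dy):
    """Upper bound on the sub-time interval from Eq. (3).

    Taking abs() of the velocity components is an added assumption.
    """
    rate = abs(u) / dx + abs(v) / dy
    return float("inf") if rate == 0.0 else 1.0 / rate


def displacement(velocity_at, dt, dx, dy):
    """Accumulate Eq. (2): the sum of V_{t,k} * tau_k, with the tau_k summing to dt.

    velocity_at(t) is a hypothetical callback returning (u, v) at time t.
    """
    t, sx, sy, steps = 0.0, 0.0, 0.0, 0
    while dt - t > 1e-12:
        u, v = velocity_at(t)
        # Never step past the end of the interval dt.
        tau = min(max_substep(u, v, dx, dy), dt - t)
        sx += u * tau
        sy += v * tau
        t += tau
        steps += 1
    return sx, sy, steps


# Illustrative values: constant drift of (0.5, 0.25) m/s, 100 m grid, 600 s interval.
sx, sy, steps = displacement(lambda t: (0.5, 0.25), dt=600.0, dx=100.0, dy=100.0)
```

With these illustrative values, the bound in (3) allows sub-intervals of about 133 s, so the 600 s interval is covered in five sub-steps and the particle moves roughly 300 m in X and 150 m in Y.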

Oil particles diffuse after the convection step in each time interval. Diffusion is also an essential part of the early migration of oil particles. Due to evaporation, emulsification, etc., the mass of the oil particles gradually decreases. When oil particles reach the coast, they are either adsorbed on the coast or partially reenter the water, depending on the coastal conditions. Once all the calculations are finished, the convection, diffusion, evaporation, and emulsification processes of all the oil particles in a time interval have been completed. Only the temperature, wind, and flow field conditions need to be updated for the next time interval, and the entire calculation process is repeated. For more details, please see [31].

In this article, taking advantage of the digital twin framework, the proposed method uses a numerical simulation method, the ECOM, for oil leak accident simulation and an intelligent method for leak source location searching. To exploit the inner connection between different simulation results, an intelligent method, the DQN, is used to lead the agent to approach the oil leak source iteratively. However, several problems prevent the current method from detecting leak sources accurately and efficiently, as follows.

  1. The uncertainty in the ECOM needs to be evaluated. Even though the ECOM has been applied to simulate several practical incidents, it still has shortcomings that can influence the quality of the simulation. The coarse resolution of the input parameters introduces epistemic uncertainties. For example, the ECOM assumes that the strength and direction of the wind are the same for the whole research region. This assumption does not hold in reality. These uncertainties can be amplified by the numerical simulation algorithm and mislead the leak source detection.

  2. Training the DQN requires iterative network tuning. The DQN can tackle more complicated problems than the traditional Q-learning method by using a network to extract the underlying pattern between states and actions. The ECOM is used to evaluate and guide the variables in the DQN toward convergence in each iteration. Because the numerical simulation is extremely time-consuming, accelerating the training procedure is a significant issue that needs to be tackled.

  3. The oil leak source search region is extremely large. The primary principle of the DQN method is to steer the agent toward the leak source in the region under the guidance of the Q-network. The agent searches for the leak source location through a series of actions, which guide the agent to walk in the search region. The trajectory of the agent connects the initial location to the target location. The length of the trajectory depends on the start location of the agent. A long search trajectory takes more execution time than a short one. Randomly initializing the start location of the agent makes the efficiency of the oil leak location search method volatile. Thus, how to steadily improve the efficiency of the DQN method is a significant problem that needs to be tackled.

To tackle the first issue, the Monte Carlo method is proposed to estimate the uncertainties of the input parameters, such as the historical wind field, through iterative sampling. The Monte Carlo sampling method generates more potential input parameter instances and brings more possible simulation results into consideration. To accelerate the training procedure of the DQN method, a pretraining problem is designed to reproduce the oil leak source detection problem. Before an oil leak accident, the network can be trained in advance and, when the accident happens, the trained knowledge can be transferred to tackle the leak source detection issue. For the third issue, the lookup table technique is designed to simulate different oil leak scenarios in the target region. The efficient lookup table query procedure can significantly reduce the search area. Thus, the Monte Carlo technique, the DQTN, and the lookup table technique are proposed to tackle both the accuracy and the efficiency issues in oil leak source detection.
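
The idea of pretraining and then transferring the learned values can be illustrated with a deliberately simplified Python sketch. It is not the DQTN itself: a tabular Q-learning update stands in for the deep Q-network, the grid, rewards, and goal positions are hypothetical, and the reward here comes from reaching a known cell, whereas in the proposed method the evaluation relies on ECOM simulations. Because moves are deterministic, a learning rate of 1 is assumed.

```python
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four unit moves on the grid


def step(state, action, size):
    """Move one cell; moves that would leave the grid keep the agent in place."""
    x, y = state[0] + action[0], state[1] + action[1]
    return (x, y) if 0 <= x < size and 0 <= y < size else state


def train(q, goal, size, sweeps=60, gamma=0.9):
    """Apply the Q-learning update (learning rate 1) to every state-action pair."""
    for _ in range(sweeps):
        for x in range(size):
            for y in range(size):
                if (x, y) == goal:
                    continue  # terminal state: no outgoing updates
                for a in range(len(ACTIONS)):
                    nxt = step((x, y), ACTIONS[a], size)
                    reward = 10.0 if nxt == goal else -1.0  # hypothetical rewards
                    future = 0.0 if nxt == goal else max(q[nxt])
                    q[(x, y)][a] = reward + gamma * future
    return q


def greedy_path(q, start, goal, size, limit=50):
    """Follow the highest-valued action from start until the goal or the step limit."""
    path, state = [start], start
    while state != goal and len(path) <= limit:
        best = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state = step(state, ACTIONS[best], size)
        path.append(state)
    return path


SIZE = 4
q_table = {(x, y): [0.0] * len(ACTIONS) for x in range(SIZE) for y in range(SIZE)}
train(q_table, goal=(3, 0), size=SIZE)  # pretraining task (stand-in for the maze case)
train(q_table, goal=(3, 3), size=SIZE)  # target task starts from the pretrained table
path = greedy_path(q_table, (0, 0), (3, 3), SIZE)
```

In this toy setting the second call to `train` reuses the table produced by the first, which is the only sense in which knowledge is transferred here; the resulting greedy path reaches the new goal in six moves.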

SECTION III.

Data and Proposed Method

The Monte Carlo-driven OLD method utilizes the ECOM oil leak simulation module and the DQTN method to optimize the oil leak source location. The Monte Carlo statistical technique is a stochastic method that is utilized to determine the uncertainties in the simulation. In the following, the study area of this research is first introduced in Section III-A. The entire workflow is then introduced in Section III-B. The four primary parts, which are the oil leak location extraction and ECOM simulation setup, the Monte Carlo oil leak evaluation, the oil leak lookup table design, and the DQTN optimization method, are described in Sections III-C to III-F, respectively.

A. Study Area and Remote-Sensing Data

A severe oil spill accident occurred in the Penglai region of the Bohai Sea on June 4, 2011, causing more than 5500 square kilometers (accounting for 7% of the total area of the Bohai Sea) to be affected by oil spill pollution. The accidental oil leak had a terrible impact on the surrounding aquatic industry in an area roughly defined by lat 37^{\circ }–41^{\circ }N, long 117.5^{\circ }–122.5^{\circ }E. Oil well 19-3, in which the oil spill occurred during this incident, is located in the 10/05 area in the southern part of the Bohai Sea. The oil spill region was located at approximately lat 38^{\circ }22^{\prime }N, long 120^{\circ }01^{\prime }E.

The accidental oil spill was monitored by the environmental satellite (Envisat), an Earth observation mission of the European Space Agency. Envisat was launched on March 1, 2002, and continues to orbit the Earth [32]. Envisat carries the ASAR instrument, which works in the C-band. The wide swath (WS) mode of ASAR, with a 5.6-cm wavelength, is specially designed for offshore oil spill detection. It provides a more extensive monitoring range and radiation accuracy for oil film extraction [33].

B. Monte Carlo-Based DQTN Offshore Oil Leak Detection Method

The overall workflow of the proposed method is shown in Fig. 2. In the proposed oil leak digital twin framework, data collected in the physical space are utilized to describe the oil leak accident in the virtual space. The collected data can be classified into two types. The data collected from the Bohai Sea environment include the shorelines, water depth, tide, and climate data. The oil leak accident-related data include the leak start time, leak duration, oil film area, and leak region. The numerical analysis in the virtual space infers the oil leak source location and provides a reference leak source location for pollution prevention after an accident.

Fig. 2. Flow of the proposed Monte Carlo-driven DQTN-based oil leak detection method.

In the cyber space, to guarantee the high efficiency of the proposed method, a lookup table is designed to simulate different oil leak incidents in the Bohai Sea comprehensively. The oil film on the marine surface is first captured by remote sensing and sent to the database. After a series of preprocessing steps, including geometric correction, coordinate transformation, etc., the support vector machine (SVM) method extracts the oil leak region from the remote-sensing data. Note that the geographic coordinate system is transformed into WGS84, and geometric precision correction and an enhanced Lee filter are applied in the ASAR data processing [28], [34], [35]. The oil film is then discretized into a large number of points based on the remote-sensing data. Based on the extracted oil film, the oil leak source search area can be narrowed to a small area by the lookup table. Before tackling the practical oil leak data, a pretraining environment is designed based on the oil leak accident to train the DQN and enhance the search performance. The large number of parameters defined in the Q-learning network are trained in the pretraining environment. Then, the trained DQN is transferred for leak source detection. For the oil leak accident, the search starts at a random point as a candidate oil leak source. Subsequent locations are selected by the trained DQTN method. For each location, the leaking scenario is evaluated. Based on these evaluation results, the DQTN method guides the candidate oil leak sources to converge to the source location. The oil leak evaluation components include the ECOM numerical simulation method and the statistical Monte Carlo technique. The ECOM simulates the accidental oil leak in the ocean based on the current candidate leak source. The Monte Carlo technique is utilized to handle the uncertainties of the ECOM. 
After the iterative search, the oil leak source location is determined by the proposed method and can support pollution prevention measures.
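
One way to picture the lookup table query is the following Python sketch. The table layout is an assumption made purely for illustration: each hypothetical candidate source cell is mapped to the centroid of a presimulated oil film, and the query returns the cells whose simulated centroid lies closest to the observed film. The coordinates are invented and are not taken from the experiment.

```python
import math

# Hypothetical table: candidate source cell -> simulated oil-film centroid (lon, lat).
LOOKUP = {
    (0, 0): (120.00, 38.30),
    (0, 1): (120.10, 38.35),
    (1, 0): (119.95, 38.40),
    (1, 1): (120.05, 38.45),
}


def centroid(points):
    """Mean position of the discretized oil-film points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest_sources(observed_points, table, k=2):
    """Return the k source cells whose simulated centroid is closest to the observed film."""
    cx, cy = centroid(observed_points)
    ranked = sorted(
        table,
        key=lambda cell: math.hypot(table[cell][0] - cx, table[cell][1] - cy),
    )
    return ranked[:k]


# Invented oil-film points standing in for the discretized SVM output.
observed = [(120.08, 38.34), (120.11, 38.37), (120.12, 38.33)]
candidates = nearest_sources(observed, LOOKUP)
```

The returned cells would then bound the region in which the search agent is started, which is how a table query can shorten the search trajectory.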

C. Marine Oil Leak Detection and Numerical Simulation Model Setup

In this work, to comprehensively evaluate an oil spill, historical marine data are collected and fed into the proposed method. ASAR data provided by Envisat is utilized to extract the oil film on the marine surface. Historical marine environmental data are fed into the numerical oil spill simulation model. The details of the data preprocessing are presented in this section.

As shown in Fig. 3, the petroleum leaked, rapidly diffused, and accumulated in the offshore region on June 11, 2011. The dense oil film has a strong pattern with much lower gray values than the marine water region, as shown in Fig. 3(b). Oil film extraction can therefore be abstracted as separating two classes, the oil region and the marine water, which transforms the target problem into a binary classification task. The supervised SVM method, which is an effective classifier for complex and noisy remote-sensing data [36], is utilized to tackle the oil leak extraction problem. Typical pixels of oil film and water background in the remote-sensing data are tagged manually. The tagged pixels are called regions of interest (ROIs). The SVM is trained on a small number of ROIs and determines the optimal hyperplane. The hyperplane, which summarizes the differing patterns of the two classes, is utilized to classify the remaining untagged pixels. Note that a kernel, the radial basis function, is selected to map the pixels into a higher dimension for classification convenience. The radial basis function is formulated as \begin{equation*} R\lbrace X_{1}, X_{2}\rbrace = e^{-\frac{||X_{1} - X_{2}||^{2}}{\gamma ^{2}}} \tag{4} \end{equation*}

where X_{1} and X_{2} are the feature vectors of two pixels, and \gamma is a hyperparameter [37]. Similar to the K-nearest neighbor algorithm, the radial basis function has an advantage in decreasing the space complexity problem [38]. After classification, the remaining ROIs are utilized to compute the accuracy and the kappa coefficient to verify the effectiveness of the classification method.
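
As a small illustration, the kernel in (4) and the two verification metrics can be written out in Python. The label lists below are invented for demonstration and are not the ROIs of the experiment; the kappa coefficient is assumed to be Cohen's kappa, which is the usual choice for classification maps. Note that (4) places \gamma squared in the denominator, so library implementations that parameterize the kernel differently would need a converted value.

```python
import math


def rbf_kernel(x1, x2, gamma):
    """Eq. (4): exp(-||x1 - x2||^2 / gamma^2) for two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / gamma ** 2)


def accuracy_and_kappa(truth, predicted):
    """Overall accuracy and Cohen's kappa for paired label sequences."""
    n = len(truth)
    labels = sorted(set(truth) | set(predicted))
    # Observed agreement is the overall accuracy.
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    # Agreement expected by chance from the marginal label frequencies.
    expected = sum((truth.count(c) / n) * (predicted.count(c) / n) for c in labels)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented held-out ROI labels: 1 = oil film, 0 = water.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, kappa = accuracy_and_kappa(truth, predicted)
```

For these invented labels the accuracy is 0.8 while kappa is about 0.58, which shows why kappa is reported alongside accuracy: it discounts the agreement that two labelings would reach by chance.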

Fig. 3. Remote sensing captured by Envisat in the Bohai Sea on June 11, 2011. (a) Spilled oil accumulated on the marine surface in the shape of an irregular curve. (b) Three-dimensional remote sensing, whose z-label depends on the gray value of each pixel.

In addition to the remote-sensing data preprocessing and classification, the historical marine data are also vital, as they must be fed into the ECOM marine numerical simulation model. Based on the principles described in Section II, the primary theoretical solution is to set up a dynamic analysis of the simulation region. The research region is discretized into a number of grids, and the historical marine data are distributed over them. Interpolation is utilized to resolve the resolution difference between the historical statistical data and the grids. Historical marine data, including temperature, salinity, wind field, and current, are fed into the ECOM. In addition, to simulate the oil leak process, the oil leak source coordinates and the amount of leaked petroleum must be fed into the proposed model. After the total simulated time length and the simulated unit time interval are set, the ECOM can start to simulate the wave movements and oil particle trajectories in the marine environment. Note that the numerical simulation is started two days before the leak start time. The force is initialized on the open boundary of the grids. The two-day “warming up” simulation transmits momentum and kinetic energy to the grid, which covers a vast area. The “warming up” procedure ensures that the dynamical state of the whole grid is similar to the practical situation at the oil leak start time. This procedure can improve the accuracy of the oil particle trajectory estimation.

D. Monte Carlo Technique-Based Oil Leak Evaluation Method

The Monte Carlo technique is a statistical method for tackling the uncertainty in numerical simulation modeling. Marine numerical simulation models attempt to reproduce the historical status of the ocean. Because of the broad simulation region and the relatively unreliable long-term buoy sensors employed, historical ocean data with high resolution and proper environment parameters are limited. Incorrect input parameters significantly influence a simulation's effectiveness. From a statistical aspect, the Monte Carlo technique decreases the uncertainty in numerical simulation problems by massively sampling the input parameters. In this section, the Monte Carlo technique for the ECOM input parameters is introduced in detail.

The definition of uncertainty in a numerical simulation model is the primary obstacle to reducing the influence of uncertainty. Because of the limitations of the data acquisition sensors, deviations exist in the sensor data. Sampling is the primary methodology of the proposed Monte Carlo technique, which decreases the uncertainty by iteratively sampling and validating. Note that the sampling principle depends on the particulars of the issue. For instance, if the uncertainty is introduced by the measurement error of an input parameter, samples have to be generated according to the precision of the data acquisition equipment. P^{m} represents the input parameters collected by the buoy sensors. Due to uncertainties, P^{m} is not accurate and misleads the ECOM simulation results; \Delta D represents the uncertainty factors. The target is to reproduce the oil spreading state \phi. Based on P^{m} and \Delta D, the unbiased estimate of the oil spill state \hat{\phi } can be calculated as \begin{equation*} \hat{\phi }= E[\Lambda (P^{m}, \Delta D)] \tag{5} \end{equation*}

where \Lambda () represents the numerical simulation of the oil particles' moving trajectories by the ECOM. The measurement parameters P^{m} and the distribution of the uncertainty factors \Delta D are fully considered and fed into the ECOM to calculate the expectation as an estimate of the accurate result.

Assume that the uncertainty follows a normal distribution with standard deviation \sigma. To estimate the unknown accurate value, (P^{m}_{i}, \Delta D_{i}), which represents the measurement of input parameter i and its error, is formulated as a probability density over the candidate value x: \begin{equation*} (P^{m}_{i}, \Delta D_{i}) = \frac{1}{\sqrt{2\pi }\sigma }\exp\left(-\frac{\left(x-P^{m}_{i}\right)^{2}}{2\sigma ^{2}}\right). \tag{6} \end{equation*}

Note that, among the ECOM input parameters, the wind force, wind direction, salinity, and temperature can be considered in the Monte Carlo technique.

In this article, for the ECOM oil leak simulation, the uncertainty is predominantly introduced by the input parameters. The margin of error determines the accuracy of the sensor measurement data. For an adequate sensor, the error fluctuates within tolerance, and most measurement errors are slight. The measurement data therefore remain significant references, even though measurement error exists. The normal distribution is well suited to this error estimation: through the standard deviation, most generated data are close to the measurement data P^{m}_{i}. The Monte Carlo technique generates a series of input parameters (P^{m}_{i}, \Delta D_{i}) from a normal distribution, which quantifies the uncertainty while preserving the original measurement data.
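The sampling idea behind (5) and (6) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameter values, the standard deviations, and in particular `toy_simulator` (a hypothetical stand-in for the ECOM that maps wind speed to a drift distance) are assumptions made only for illustration.

```python
import random
import statistics


def sample_inputs(measured, sigma, n_samples, rng):
    """Draw perturbed copies of the measured parameters.

    Each parameter i is drawn from a normal distribution centered on its
    measurement with standard deviation sigma[i], following (6).
    """
    return [
        {name: rng.gauss(value, sigma[name]) for name, value in measured.items()}
        for _ in range(n_samples)
    ]


def toy_simulator(params):
    """Hypothetical stand-in for the ECOM: a drift distance driven by wind."""
    return 0.03 * params["wind_speed"] * 3.6


def monte_carlo_estimate(measured, sigma, n_samples=8, seed=0):
    """Approximate the expectation in (5) by averaging simulator outputs."""
    rng = random.Random(seed)
    samples = sample_inputs(measured, sigma, n_samples, rng)
    return statistics.mean(toy_simulator(p) for p in samples)


# Assumed measurements and sensor precisions (illustrative values only).
measured = {"wind_speed": 6.0, "temperature": 20.0, "salinity": 35.0}
sigma = {"wind_speed": 0.5, "temperature": 0.2, "salinity": 0.1}
estimate = monte_carlo_estimate(measured, sigma, n_samples=1000)
print(estimate, toy_simulator(measured))
```

Because the samples are centered on the measurements, the averaged output stays close to the output for the raw measurement while also reflecting the spread allowed by the sensor precision.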

E. Oil Leak Detection Lookup Table Design

The look-up table is a high-dimensional matrix that serves as an effective reference for similar problems. For a complex problem, the influencing factors are diverse. These factors determine the dimension of the look-up table and serve as the index for retrieving values from it. The advantage of the look-up table is that it greatly decreases the execution time when tackling a new issue of the same type. The look-up table is widely utilized in the computer-aided design domain [39]–[41]. Because the oil leak detection target places significant demands on efficiency, the look-up table technique is an effective method that prepares reference instances in advance.

The setup and query procedure of the oil leak look-up table is shown in Fig. 4. In a practical oil leak accident, a diversity of parameters can influence the leak trajectory, and the look-up table method can comprehensively take these factors into consideration. Note that geographic information, such as the coastline and bathymetry, is entirely different for different regions and is not considered in this research. For the specific research domain, the look-up table sets up the grid, as introduced in Section III-D. Then, a large number of instances are simulated offline. The simulation results, which are the oil particles' coordinates in each time slot, are stored separately. According to the influence factors, which include the wind field W, temperature T, salinity S, leak location L, and leak start time \tau, the results are stored as a tree structure in the look-up table. At this point, the construction of the look-up table is complete. When a practical accident happens, the environmental factors are collected by the sensors. By querying the look-up table, the simulation results of similar instances for different leak locations are obtained. These simulation results are evaluated against the oil film in the remote-sensing data. The oil leak location with the highest evaluation accuracy is selected to reduce the search region for the DQTN method.
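One possible way to hold such a tree-structured table is a nested dictionary keyed by the influence factors in a fixed order. The sketch below is only an illustration under assumed names (`wind_id`, `start_step`, `particles`) and made-up instance values; the article does not describe the actual storage layout.

```python
def build_lookup_table(instances):
    """Store offline simulation results in a tree keyed by W, T, S, L, tau."""
    table = {}
    for inst in instances:
        node = table
        # Walk (and create) one tree level per influence factor.
        for key in ("wind_id", "temperature", "salinity", "location"):
            node = node.setdefault(inst[key], {})
        # The leaf level is indexed by the leak start time.
        node[inst["start_step"]] = inst["particles"]
    return table


def query(table, wind_id, temperature, salinity, location, start_step):
    """Exact-match retrieval of the stored oil particle coordinates."""
    return table[wind_id][temperature][salinity][location][start_step]


# Two hypothetical offline instances that differ only in the leak location.
instances = [
    {"wind_id": "w00", "temperature": 20.0, "salinity": 35.0,
     "location": (80, 30), "start_step": 481,
     "particles": [(81, 31), (82, 31)]},
    {"wind_id": "w00", "temperature": 20.0, "salinity": 35.0,
     "location": (85, 35), "start_step": 481,
     "particles": [(86, 36)]},
]
table = build_lookup_table(instances)
print(query(table, "w00", 20.0, 35.0, (85, 35), 481))
```

Keeping the leak location near the leaves makes it cheap to enumerate every stored location for one set of environmental factors, which is the access pattern the query procedure needs.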

Fig. 4. - Workflow of the oil leak look-up table setup and query.

For the query procedure of the oil leak look-up table, the main target is to narrow the leak source search area. Except for the leak location, the other influence factors form the index address used to query the leak area in the target time slot. The principle of the fuzzy query is formulated as \begin{align*} &\Gamma _{n} = \omega _{t} * |T_{a} - T_{n}| + \omega _{s} * |S_{a} - S_{n}| + \omega _{\tau } * |\tau _{a} - \tau _{n}| \\ &+\omega _{w}*\left|\!\sqrt{(W_{a\tau }\!-\!W_{n\tau })^{2}+\cdots \!+\!(W_{a(\tau +m*\nu)}\!-\!W_{n(\tau +m*\nu)})^{2}}\!\right| \tag{7} \end{align*}

where \Gamma _{n} represents the deviation between the target state and the group of index values n in the look-up table. T_{n}, S_{n}, \tau _{n}, and W_{n} represent the temperature, salinity, leak start time, and wind field in group n. T_{a}, S_{a}, \tau _{a}, and W_{a} represent the query state for temperature, salinity, leak start time, and wind field. \omega _{t}, \omega _{s}, \omega _{\tau }, and \omega _{w} represent the weight of each influence factor. In particular, the Euclidean distance (the root of the summed squared differences) is utilized to measure the similarity between the target wind field and the example wind field over the time intervals \nu, and \tau + m \times \nu marks the end of the time slot of the oil leak accident. The formula attempts to select the most similar sample for reference. The group that minimizes \Gamma is the optimal state in the look-up table. The location selection is formulated as \begin{align*} &\mathop{\rm{argmax}}\limits _{j \in N^{j}} {\text{Acc}}_{j} \\ &\text{s.t.} \\ &\ \ \ \ \ \ \ \ \hat{T}, \hat{S}, \hat{W}, \hat{\tau } = \mathop{\rm{argmin}}\limits _{i \in N^{i}} \Gamma (C_{i}, C_{a})\\ &\ \ \ \ \ \ \ \ C_{a} = \lbrace T_{a}, S_{a}, \tau _{a}, W_{a}\rbrace \\ &\ \ \ \ \ \ \ \ C_{i} = \lbrace T_{i}, S_{i}, \tau _{i}, W_{i}\rbrace \\ &\ \ \ \ \ \ \ \ R_{j} = \Lambda (\hat{T}, \hat{S}, \hat{W}, \hat{\tau }, L_{j}),\ j \in N^{j} \\ &\ \ \ \ \ \ \ \ \text{Acc}_{j} = F_{mc}(R_{j}, RS_{a}),\ j \in N^{j}\tag{8} \end{align*}
where C_{i} and C_{a} represent the leak instances in the look-up table and the practical leak accident, respectively. F_{mc}() represents the evaluation function, which is introduced in Section III-D. \Lambda () represents the numerical simulation by the ECOM. RS_{a} represents the oil film extracted from the remote-sensing data. N^{i} and N^{j} represent the set of environmental factor groups and the set of leak locations, respectively. \text{Acc}_{j} represents the accuracy of the simulation result. The environmental factors T, S, W, and \tau are queried by the fuzzy query function \Gamma in (7), and \hat{T}, \hat{S}, \hat{W}, and \hat{\tau } are the output of the query function. Then, for each leak location L_{j} in the look-up table, the simulation result R_{j} can be queried. The leak locations are checked by the evaluation function one after another, and the location \hat{L} that maximizes the accuracy \text{Acc}_{j} is selected as the optimal location. The oil leak source search area is thereby narrowed to the point \hat{L} and its adjacent region.
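The two-stage selection in (7) and (8) can be sketched in a few lines of Python. This is an illustrative reading only: the dictionary layout, the unit weights, and `overlap_accuracy` (a simple stand-in for F_{mc}) are assumptions, and the toy entries below are invented.

```python
import math


def deviation(query, entry, weights):
    """Weighted deviation between the query state and one table group, as in (7)."""
    wind_gap = math.sqrt(
        sum((wa - wn) ** 2 for wa, wn in zip(query["W"], entry["W"]))
    )
    return (weights["t"] * abs(query["T"] - entry["T"])
            + weights["s"] * abs(query["S"] - entry["S"])
            + weights["tau"] * abs(query["tau"] - entry["tau"])
            + weights["w"] * wind_gap)


def select_location(query, entries, weights, accuracy_of):
    """Pick the closest group, then the leak location with the best accuracy, as in (8)."""
    best_env = min(entries, key=lambda e: deviation(query, e, weights))
    results = best_env["results"]
    return max(results, key=lambda loc: accuracy_of(results[loc]))


# Oil film cells extracted from the remote-sensing image (toy values).
observed = {(93, 52), (94, 52), (94, 53)}


def overlap_accuracy(simulated):
    """Stand-in evaluation: share of observed cells covered by the simulation."""
    return len(observed & simulated) / len(observed)


entries = [
    {"T": 20.0, "S": 35.0, "tau": 0, "W": [5.0, 6.0],
     "results": {(90, 50): {(93, 52), (94, 52)}, (80, 30): {(81, 31)}}},
    {"T": 10.0, "S": 30.0, "tau": 5, "W": [12.0, 1.0],
     "results": {(85, 45): {(86, 46)}}},
]
query = {"T": 19.0, "S": 35.0, "tau": 0, "W": [5.5, 6.0]}
weights = {"t": 1.0, "s": 1.0, "tau": 1.0, "w": 1.0}
best = select_location(query, entries, weights, overlap_accuracy)
print(best)
```

In this toy case the first group is closest to the query, and among its stored leak locations the one whose simulated film overlaps the observed cells is returned as the center of the narrowed search region.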

F. DQTN Oil Leak Detection Method

Reinforcement learning is a self-learning framework that assists an intelligent agent in tackling an established issue under a clear system of rewards and penalties. Intelligent agents iteratively attempt to achieve the goals under constraints. After extensive attempts, the agents learn to tackle diverse issues effectively [42]. The DQN is a reinforcement learning method that utilizes deep learning as the decision component in the Q-learning method. The DQN has been applied to various problems in different fields in recent years, such as path-scheduling problems [43], [44], traffic flow control [45], [46], and infrastructure distribution planning [47], [48], and has achieved good results. The advantage of the DQN algorithm is that it only needs the starting state, ending goal, and problem rules (that is, the solution set space and the reward and punishment mechanism) to be defined; it can then perform repeated calculations spontaneously under the logic of reinforcement learning to reach the expected end goal [42]. Because of its automatic optimization capability, it can reach a solution even if the solution set space is large.

For the oil leak source detection problem, efficiency is the primary obstacle to applying the DQN technique. The DQN technique is utilized as a decision-making tool: it advises how to adjust the location of the agent so that it approaches the oil leak source, and it builds a reward and punishment strategy to evaluate the simulation result. Although the possible solution set is large, the target is clear, so the DQN algorithm can be applied to the current problem. However, the DQN technique requires sufficient iterations in the training process to optimize the neural network parameters. Until then, the DQN algorithm misleads the agent's direction, delaying the detection of the oil leak source. Therefore, the DQTN is proposed in this section to tackle the oil leak source detection issue.

The procedure of the DQTN oil leak detection method is shown in Fig. 5. A DQN is composed of an evaluation net and a target net. A pretraining environment instance is designed based on the oil leak source detection issue. Oil leak source detection is essentially a search problem that is similar to the maze problem. The DQN is pretrained on a large number of maze instances, and the target search experience is transferred into the evaluation net to tackle the practical oil leak accident. The potential leak source coordinates and the corresponding information at state S_{t} in step t are fed into the evaluation net of the DQN to obtain the current action a_{t}. The state is then converted into the next state S_{t+1}, which is evaluated by the Monte Carlo-based oil leak evaluation method. The reward R_{t} of the chosen action a_{t} at state S_{t} is calculated to guide the DQN in making further decisions in the following steps.

Fig. 5. - Procedure of the DQTN oil leak detection method from step t to step t + 1.

The transfer learning technique is utilized to enhance the efficiency of the proposed method. Transfer learning is a technology that applies the knowledge learned in previous tasks to improve the solution of novel tasks [49]. Dynamic maze instances are designed to approximate the oil leak accident issue and serve as the source tasks in transfer learning. Neither practical oil leak accidents nor numerical simulation methods are required in the maze instances. The search ability of the agent is trained to guarantee the performance in the target domain, which is the oil leak source detection task. Though the source (maze) and target (source detection) domains are similar, the tasks of these two domains are different because the target domain requires numerical simulation to reproduce the accident. Furthermore, since the rewards for different states and actions are available in both domains, inductive transfer learning is utilized to transfer the searching knowledge. In each maze instance, the locations of the agent and the target are randomly initialized. The input of the DQN includes the position of the agent, the decision space (that is, the range that can be moved at each step), the solution set space (all the movable ranges), and the definition of the reward and punishment rules (in different positions). The output of the DQN is the score that the agent gains in each iteration. The DQN faces different maze instances in each iteration to strengthen its ability to handle the complex practical oil leak source detection problem.

The overview architecture of the DQN algorithm is shown in Fig. 6. The DQN algorithm consists of two identical fully connected neural networks and a replay memory pool. The DQN algorithm iteratively enhances its ability to manipulate the agent to make good action decisions by continually interacting with the environment. The evaluation network makes decisions under different circumstances in the environment. The architecture of the network is shown in Fig. 6(b). The environment evaluates the value of the action as a reward and feeds the updated state back to the DQN algorithm. The replay memory pool, in Fig. 6(c), stores states, actions, rewards, and updated states sequentially for network optimization with the target network. Memories are loaded from the memory pool, and the states and next states are fed into the two networks, respectively, to calculate their outputs, the Q values. The Bellman equation evaluates the temporal difference of the Q values. The differences are utilized to optimize the decision-making strategy of the evaluation network.
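A replay memory pool of this kind is commonly implemented as a bounded queue of (state, action, reward, next state) tuples with uniform random sampling. The following Python sketch assumes a fixed capacity and uniform sampling, neither of which is specified in the article; the class and method names are hypothetical.

```python
import random
from collections import deque


class ReplayMemory:
    """Bounded pool of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # Once the pool is full, the oldest memory is discarded automatically.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        """Append one transition in the order it was experienced."""
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Return up to batch_size transitions chosen uniformly at random."""
        pool = list(self.buffer)
        return rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=3)
for step in range(5):
    # Toy transitions: the agent moves one cell to the right at each step.
    memory.push((step, 0), 3, -1.0, (step + 1, 0))
batch = memory.sample(2, random.Random(0))
print(len(memory), batch)
```

Sampling at random rather than replaying transitions in order reduces the correlation between consecutive updates, which is the usual motivation for a replay pool.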

Fig. 6. - Overview architecture of DQN. (a) Workflow of DQN. (b) Structure of the neural network. (c) Structure of replay memory.

Applying the DQN algorithm to oil leak detection, the details of the Monte Carlo-based DQN OLD method are presented in Algorithm 1. The DQN algorithm starts by initializing the memory unit D, the evaluation network Q_{a}, and the target network Q_{e}. The candidate source location and the initial normalized distance map of the agent are initialized as state S_{0}. Note that the default distance to the target location is set to 1 for every location, which guarantees the fairness of every action. The action space is initialized as A = \lbrace a_{1}, a_{2}, a_{3},{\ldots }, a_{n}\rbrace, which defines the direction and step length. The goal location is totally unknown.

Considering the actual situation, the oil spill point is unknown, and the movement between the initially predicted oil spill point and the actual oil spill point is unknown. Therefore, both the state S and the state transition action a should have random attributes. For a state S_{t} at time t, the action a_{t} is decided, with a certain probability, by the Q values computed by the evaluation network Q_{a}. Then, the next candidate source location S_{t+1} is generated based on action a_{t}.
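A common way to give the action choice such a random attribute is an epsilon-greedy rule: with probability epsilon a random action is taken, otherwise the action with the highest Q value. The article does not name the rule it uses, so the sketch below, including the function name and the value of epsilon, is an assumption for illustration.

```python
import random


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over the Q values of the current state."""
    if rng.random() < epsilon:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index of the largest Q value.
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)
q_values = [0.1, 0.7, 0.3, 0.2]  # toy Q values for up, down, left, right
greedy = choose_action(q_values, epsilon=0.0, rng=rng)
explored = [choose_action(q_values, epsilon=1.0, rng=rng) for _ in range(20)]
print(greedy, explored)
```

With epsilon set to zero the rule reduces to the pure argmax used in step 6 of Algorithm 1.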

The reward mechanisms are different for the dynamic maze instances and the practical oil leak case. For every maze instance, a reward corresponding to a specific state S_{t} and action a_{t} is calculated. The agent receives a negative reward if it is in a position that has already been passed or is out of bounds. The agent receives a positive reward if it is at the target location. In all other situations, neutral feedback is returned to the agent. The practical oil leak case uses the Monte Carlo-based ECOM method to simulate the spread of the oil spill with the predicted point S_{t+1} as the starting point. The predicted oil spill at the specified time is then compared with the actual remote-sensing image, and the difference establishes the DQN reward and punishment mechanism R_{t}. Note that, except for detecting the target oil leak source, all actions could lead to a negative reward, whose value is determined by the distance. The state S_{t}, action a_{t}, corresponding reward R_{t}, and next state S_{t+1} are saved into the replay memory pool.
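The maze reward rules described above can be written as a small function. The reward magnitudes (+1, -1, 0), the square maze, and the function name are assumptions chosen for illustration; the article only states the sign of each case.

```python
def maze_reward(position, goal, visited, size):
    """Reward for landing on `position` in a size x size maze.

    Negative when out of bounds or on an already visited cell, positive at
    the goal, and neutral otherwise. The magnitudes are illustrative.
    """
    x, y = position
    if not (0 <= x < size and 0 <= y < size):
        return -1.0
    if position == goal:
        return 1.0
    if position in visited:
        return -1.0
    return 0.0


visited = {(0, 0), (1, 0)}
print(maze_reward((2, 0), (4, 4), visited, 5))   # fresh cell
print(maze_reward((1, 0), (4, 4), visited, 5))   # already passed
print(maze_reward((5, 0), (4, 4), visited, 5))   # out of bounds
print(maze_reward((4, 4), (4, 4), visited, 5))   # target reached
```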

The DQN is optimized by the cases in the memories. The parameters of the target network are loaded from the evaluation network. Groups of memories are randomly selected to evaluate the Q values for the current state S_{i} and the next state S_{i+1}, respectively. The expectation K_{i} based on the current decision-making knowledge for the next state S_{i+1} can be expressed as \begin{equation*} K_{i} = R_{i} + \gamma \times \max _{a} Q_{e}\left(S_{i+1}\right)|_{a} \tag{9} \end{equation*}

where \gamma represents the discount factor, which decreases the influence of the next-state results. The temporal difference loss is calculated as follows: \begin{equation*} L = \left(Q_{a}(S_{i})|_{a_{i}} - K_{i}\right)^{2}. \tag{10} \end{equation*}

The loss is then backpropagated to update the parameters of the evaluation network Q_{a}. The network is updated iteratively until it is well trained. With the updated network and an adequate reward and punishment mechanism R, the candidate oil leak source location steadily approaches the actual oil leak source point.
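As a worked example of (9) and (10), the target and the loss for a single memory can be computed as follows. The sketch takes the maximum Q value of the next state as the bootstrap term, as in standard Q-learning, and the Q values, reward, and discount factor are toy numbers, not values from the experiment.

```python
def td_target(reward, next_q_values, gamma):
    """Target K_i: reward plus the discounted best Q value of the next state."""
    return reward + gamma * max(next_q_values)


def td_loss(q_values, action, target):
    """Squared temporal difference for the action that was actually taken."""
    return (q_values[action] - target) ** 2


# Toy memory (S_i, a_i, R_i, S_{i+1}) with four actions.
q_current = [0.0, -0.3, 0.1, 0.2]   # evaluation network output for S_i
q_next = [0.2, 0.5, 0.1, 0.0]       # target network output for S_{i+1}
target = td_target(reward=-1.0, next_q_values=q_next, gamma=0.9)
loss = td_loss(q_current, action=1, target=target)
print(target, loss)
```

Here the target is -1.0 + 0.9 x 0.5 = -0.55, so the squared difference for the taken action is (-0.3 + 0.55)^2 = 0.0625, up to floating-point rounding.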

Algorithm 1: Monte Carlo-Based DQN Offshore Oil Leak Detection Method.

Require: Initial marine environment parameters around the oil leak area.
1: Initialize the memory unit D of the DQN.
2: Initialize the evaluation network Q_{a} for decision making.
3: Initialize the target network Q_{e} to store the historical decision-making knowledge.
4: Initialize the start point S_{0} for the oil leak source location.
5: Set t = 0.
6: Select the action by a_{t} = \text{argmax}(Q_{a}(S_{t})).
7: Update the candidate oil leak source location from S_{t} to S_{t+1} based on action a_{t}.
8: Simulate and evaluate by the Monte Carlo-based ECOM evaluation method to get the accuracy \text{Acc}_{t}.
9: Calculate the reward R_{t} from the accuracy \text{Acc}_{t}.
10: Update the memory unit D \leftarrow D + (S_{t}, a_{t}, R_{t}, S_{t+1}).
11: Randomly select memories D_{i} = (S_{i}, a_{i}, R_{i}, S_{i+1}) \in D.
12: Load parameters from the evaluation network Q_{a} to the target network Q_{e}.
13: Calculate the historical decision-making knowledge value K_{i} by (9).
14: Calculate the loss L between the historical knowledge and the current knowledge by (10).
15: Utilize the root mean squared propagation method to decrease the loss L.
16: Update Q_{a} by the current Q_{e} every N iterations.
17: Set t = t + 1.
18: Check the stop criterion according to the accuracy \text{Acc}_{t}. If the stop criterion has not been satisfied, repeat steps 6 to 17.
Ensure: The computed oil leak source location.
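The control flow of Algorithm 1 can be exercised end to end with a heavily simplified sketch. It is not the authors' method: a Q table replaces the two neural networks (so steps 12 and 16 have no counterpart), `toy_accuracy` is a hypothetical stand-in for the Monte Carlo-based ECOM evaluation, exploration is added to the action choice, and all hyperparameters are assumed.

```python
import random

ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # up, down, left, right


def toy_accuracy(location, true_source):
    """Hypothetical stand-in for the ECOM evaluation: decays with distance."""
    dist = abs(location[0] - true_source[0]) + abs(location[1] - true_source[1])
    return max(0.0, 1.0 - 0.1 * dist)


def search(start, true_source, bounds, epsilon=0.2, gamma=0.9, lr=0.5,
           stop_acc=0.95, max_steps=2000, seed=0):
    """Move a candidate source until the accuracy passes the stop criterion."""
    rng = random.Random(seed)
    (x_min, x_max), (y_min, y_max) = bounds
    q = {}        # Q table standing in for the evaluation network
    memory = []   # replay memory D
    state = start
    for step in range(max_steps):
        q_s = q.setdefault(state, [0.0] * len(ACTIONS))
        # Step 6: select an action (greedy with some exploration).
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=q_s.__getitem__)
        # Step 7: move the candidate, clipped to the search region.
        nxt = (min(max(state[0] + ACTIONS[a][0], x_min), x_max),
               min(max(state[1] + ACTIONS[a][1], y_min), y_max))
        # Steps 8 and 9: evaluate the candidate and derive a reward.
        acc = toy_accuracy(nxt, true_source)
        reward = acc - 1.0
        # Step 10: store the transition.
        memory.append((state, a, reward, nxt))
        # Steps 11 to 15: replay a few memories and reduce the TD error.
        for s_i, a_i, r_i, s_n in rng.sample(memory, min(8, len(memory))):
            target = r_i + gamma * max(q.setdefault(s_n, [0.0] * len(ACTIONS)))
            q_i = q.setdefault(s_i, [0.0] * len(ACTIONS))
            q_i[a_i] += lr * (target - q_i[a_i])
        state = nxt
        # Step 18: stop once the accuracy is high enough.
        if acc >= stop_acc:
            return state, acc, step + 1
    return state, toy_accuracy(state, true_source), max_steps


final, accuracy, steps = search((95, 50), (93, 52), ((93, 97), (48, 52)))
print(final, accuracy, steps)
```

In the real method each accuracy evaluation is an expensive numerical simulation, which is why reducing the number of search steps matters far more than in this toy setting.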

SECTION IV.

Experiments

Based on the major oil spill in Penglai, Bohai, in 2011, this experiment verified the accuracy and efficiency of the Monte Carlo-based DQTN OLD method. This section introduces the experiment in detail, including the related data, the basic information of the platform on which the experiment was conducted, and the design of the experiments. Subsequently, the experimental results are analyzed and illustrated.

A. Experimental Setup

The experiment was conducted with the Python 3.5 and C++ programming languages. The final algorithm was tested on a computer with an eight-core, 16-thread CPU. The base frequency of the processor is 3.6 GHz, and the max turbo frequency can be up to 5 GHz. Based on a practical case, the Bohai Sea oil leak incident, this experiment utilized the proposed DQTN-based oil leak detection algorithm to search for the location of the oil leak source and to verify its efficiency and accuracy.

According to the Bohai Sea oil spill accident in 2011, the ECOM was utilized to simulate the same oil spill process. The area of the Bohai Sea defined by longitude 117.5^{\circ } to 122.5^{\circ }E and latitude 37^{\circ } to 41^{\circ }N was divided into a 180 \times 144 grid matrix containing 25 920 units. Each grid cell is a square with a side length of 0.0277^{\circ }. The search region is a rectangle with a length of 20 grids (0.556^{\circ }) and a width of 30 grids (0.833^{\circ }). Its top-left corner is located at [80, 30] in the whole grid matrix. The oil leak source search area therefore spans grids No. 80 to 100 of the 180 longitude grids and grids No. 30 to 60 of the 144 latitude grids.
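The grid figures quoted above follow directly from the stated extent and grid counts, as the short Python check below shows. The variable names are illustrative; only the numbers come from the text.

```python
# Simulation domain and grid counts as stated in the text.
LON_MIN, LON_MAX, N_LON = 117.5, 122.5, 180
LAT_MIN, LAT_MAX, N_LAT = 37.0, 41.0, 144

cell_lon = (LON_MAX - LON_MIN) / N_LON   # side length in degrees of longitude
cell_lat = (LAT_MAX - LAT_MIN) / N_LAT   # side length in degrees of latitude
n_cells = N_LON * N_LAT                  # total number of grid units

# Search region of 20 x 30 grids.
search_length = 20 * cell_lon
search_width = 30 * cell_lat

print(cell_lon, cell_lat, n_cells)
print(round(search_length, 3), round(search_width, 3))
```

Both side lengths equal 1/36 of a degree, which is about 0.0278 and is quoted in the text as 0.0277.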

The temperature during the accident was 20^{\circ }\text{C}, the salinity of the seawater was 35, and the wind field data were obtained from the Quickscat/NCEP dataset, which provides wind field data every six hours. In this experiment, the density of the oil was set at 0.8\text{g/cm}^{3}. The simulation starts on June 2, 2011 and ends on June 20, 2011. The simulation time interval is 360 s. Because the oil leak accident happened on June 4, 2011, the simulator starts releasing 100 oil particles every 30 steps after a grid warm-up of 481 = (4 - 2) * 24 * 3600 / 360 + 1 steps. Groups of oil particles are released continuously until step 2160 = (11 - 2) * 24 * 3600 / 360, which corresponds to the acquisition time of the remote-sensing data, June 11, 2011. Using the established ECOM, the oil film on the sea surface was obtained for different oil leakage points.
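The two step counts can be reproduced from the dates and the 360 s time interval. The sketch below assumes that each date refers to midnight, which is what the arithmetic in the text implies; the helper name is illustrative.

```python
from datetime import datetime

DT_SECONDS = 360  # simulation time interval stated in the text

start = datetime(2011, 6, 2)    # simulation start
leak = datetime(2011, 6, 4)     # oil leak accident
image = datetime(2011, 6, 11)   # remote-sensing acquisition


def steps_between(begin, end):
    """Number of whole simulation steps between two instants."""
    return int((end - begin).total_seconds()) // DT_SECONDS


warmup_steps = steps_between(start, leak) + 1   # (4 - 2) * 24 * 3600 / 360 + 1
image_step = steps_between(start, image)        # (11 - 2) * 24 * 3600 / 360
print(warmup_steps, image_step)
```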

For the look-up table, the oil leak source location and wind field data are taken into consideration. For the oil leak source location, 20 locations are chosen in the look-up table. 80, 85, 90, and 95 are selected in the x direction, and 30, 35, 40, 45, 50, and 55 are selected in the y direction. For the wind data, since north winds seldom arise over the Bohai Sea, the direction is randomly generated among the other three directions. According to the scale of wind force, the wind force is set from 0 to 15\,\text{m/s}. For each oil leak source location, we randomly generate 16 wind field samples in the oil leak look-up table. Other factors that are required by the ECOM simulator are set empirically. Note that, limited by the computing resources, the influence factors considered in this research are finally restricted to the leak location and the wind field. With more execution time and better computing power, the look-up table can be made more comprehensive with more factors, such as the temperature and the salinity. When the oil leak status is queried in the look-up table, the source location takes priority, followed by the wind field. The oil leak status is evaluated by comparing it with the remote-sensing data, and the accuracy is calculated. The oil leak location in the look-up table that has the highest accuracy is selected as the center of the search region for the DQTN.

Based on the architecture shown in Fig. 6, the structure of the DQN is listed in Table II. States of size 2 \times 5 \times 5, which are reshaped into vectors of length 50, are initialized for the DQN offline training. The location matrix and the distance map form the two bands of the state. The actions of the agent are one step up, one step down, one step left, and one step right. The distance map is initialized with 1 and, as the agent moves in the maze, is gradually filled with the distance between the agent and the goal. The sizes of the three fully connected layers are 50 \times 25, 25 \times 16, and 16 \times 4. Rectified linear units (ReLU) are appended after each of the first two layers. While training the DQN, the locations of the agent and the goal are randomly generated in each interval. The offline training includes 5 \times 10^{5} intervals. The learning rate is set to 0.005.
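The layer sizes above can be illustrated with a plain Python forward pass. This is a sketch under assumptions: the weight initialization range, the presence of bias terms, and the way the agent position is encoded in the location band are not given in the text, and Table II is not reproduced here.

```python
import random

LAYER_SIZES = [(50, 25), (25, 16), (16, 4)]  # sizes stated in the text


def init_network(seed=0):
    """Create small random weights and zero biases for the three layers."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)],
         [0.0] * n_out)
        for n_in, n_out in LAYER_SIZES
    ]


def forward(network, state):
    """Fully connected forward pass with ReLU after the first two layers."""
    x = list(state)
    for k, (weights, bias) in enumerate(network):
        x = [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
             for j in range(len(bias))]
        if k < len(network) - 1:
            x = [max(0.0, v) for v in x]
    return x


def make_state(agent, size=5):
    """Flatten the location band and the distance band into 50 values."""
    location = [[0.0] * size for _ in range(size)]
    location[agent[1]][agent[0]] = 1.0              # mark the agent cell
    distance = [[1.0] * size for _ in range(size)]  # default distance is 1
    return [v for band in (location, distance) for row in band for v in row]


network = init_network()
state = make_state((2, 3))
q_values = forward(network, state)  # one Q value per action
print(len(state), q_values)
```

Under these assumptions the network has 1759 trainable parameters, which is small enough to evaluate at every search step at negligible cost compared with one ECOM simulation.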

TABLE I Result for Six Different Cases
TABLE II Layers of DQN and Output Dimensions of Every Layer

For the oil leak detection process, the DQTN loads the pretrained model to help the agent make action decisions in every state. Eight samples are generated by the Monte Carlo sampling evaluation method. In the simulation process, when the model reached the time corresponding to the actual remote-sensing detection data, the simulation was stopped, and the current position of each oil particle in the simulation data was recorded to form a simulated oil film area. The current accuracy was the ratio of the oil samples extracted from the actual remote-sensing data that fall within the simulated oil film area. For example, suppose that at iteration t, 100 oil samples are extracted from the remote-sensing data. For the candidate oil spill point L, the ECOM simulates the oil film area at the corresponding time. If only 90 of the 100 oil samples fall inside the simulated oil film area, the accuracy at that moment is 90%. Note that eight threads are utilized to speed up the proposed method. The algorithm stops when the accuracy is over 90%.
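The accuracy measure in this example can be written directly as a ratio. The sketch below represents the observed oil samples and the simulated oil film as grid cells, which is an assumed representation; the function name is illustrative.

```python
def film_accuracy(observed_samples, simulated_film):
    """Share of observed oil samples that lie inside the simulated oil film."""
    if not observed_samples:
        raise ValueError("at least one observed oil sample is required")
    hits = sum(1 for cell in observed_samples if cell in simulated_film)
    return hits / len(observed_samples)


# Reproduce the example from the text: 90 of 100 observed samples are covered.
observed = [(x, 0) for x in range(100)]
simulated = {(x, 0) for x in range(90)}
accuracy = film_accuracy(observed, simulated)
print(accuracy)  # 0.9
```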

B. Experimental Results

Leak source detection tasks are critical for instant hazard prevention decision making [50]. Tackling oil leak source detection problems requires multidisciplinary knowledge, including the numerical model, the remote-sensing technique, and the intelligent algorithm [51]–[53]. However, the targets, datasets, and methodologies of these state-of-the-art methods differ greatly from ours, since the leak accidents and leak monitoring techniques are different. Under these circumstances, these algorithms are not suitable for tackling the current issue. Thus, this article compares the proposed Monte Carlo-based DQTN offshore oil leak detection (MC-DQTN-OLD) method with two commonly used methods, particle swarm optimization (PSO) and the greedy method, which are also applied to the oil leak accident in the Bohai Sea. To verify the effectiveness of transfer learning in the proposed method, a further experiment compares the DQTN method with the DQN method. We make the following observations.

  1. The MC-DQTN-OLD method can effectively detect the oil leak source without prior knowledge. Fig. 7 shows a source detection instance for the proposed method. By querying the oil leak look-up table, the search region is reduced to 93 to 97 in the x direction and 48 to 52 in the y direction. The simulated oil pixels gradually approach the oil film in the remote-sensing data. After four steps, the accuracy of the candidate oil leak source [93, 52] is up to 97.7%, which satisfies the stop criterion. Note that the grid shown as the background in Fig. 7 is the one utilized in the ECOM method for oil leak simulation.

  2. The effectiveness of the proposed MC-DQTN-OLD method has been demonstrated by evaluating different instances with different start locations. In Table I, six instances are established to test the robustness of the method. Seeds for the built-in random function of Python and for the random function of the external package NumPy are set correspondingly. Because two different factors require random initialization, the start search location and the Monte Carlo wind field, two different random functions are utilized to control them separately. The transfer learning is tested in this experiment. Over these six instances, the average accuracy of the DQN is 16.23%, and its average execution time is 477.9 min. The average accuracy of the DQTN is 97.54%, and its average execution time is 184.5 min. The DQN finds the leak source only once in the six cases, while the DQTN finds the leak source location in every case. The DQTN method searches for the oil leak source location by iteratively adjusting the agent's location. If the initial location is far away from the destination, the execution time is affected. Since the detected location is [93, 52] or [94, 52] in every instance, the accuracies of the six instances are close.

  3. The comparison algorithms, the greedy method and the PSO method, cannot tackle the oil leak source detection issue. The greedy method moves the agent by distance: the “greedy agent” moves in the direction of decreasing distance and can move up to 25 steps. The PSO method is a population-based optimization algorithm, whose population includes six individuals in this research. Note that the stop criteria are the same for these three algorithms. As shown in Fig. 8, the greedy method cannot find a proper oil leak location, and its accuracy is 0%. The accuracy of the PSO method is 77.23%. Compared with the PSO method, the proposed MC-DQTN-OLD method achieves up to (97.54\% {-} 77.23\%) / 77.23\% = 26.30\% improvement in accuracy.

  4. The proposed MC-DQTN-OLD method has a significant advantage in efficiency over the greedy method and the PSO method. As shown in Fig. 9, the execution times of the greedy method and the PSO method are 1229.68 and 903.95 min, respectively. Compared with the greedy method, the proposed MC-DQTN-OLD method achieves up to (1229.68 {-} 184.5) / 184.5=566.50\% improvement in efficiency. Compared with the PSO method, the proposed MC-DQTN-OLD method achieves up to (903.95 - 184.5) / 184.5=389.95\% improvement in efficiency. The proposed method significantly improves the efficiency of oil leak source detection while guaranteeing accuracy.
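The relative improvements quoted in observations 3 and 4 can be recomputed from the reported averages. The helper names in this Python sketch are illustrative, and the inputs are the rounded values printed in the text, so the last digit can differ slightly from the reported percentages.

```python
def accuracy_gain(ours, baseline):
    """Relative accuracy improvement over a baseline, in percent."""
    return (ours - baseline) / baseline * 100.0


def efficiency_gain(t_baseline, t_ours):
    """Relative runtime saving expressed against our runtime, in percent."""
    return (t_baseline - t_ours) / t_ours * 100.0


gain_vs_pso = accuracy_gain(97.54, 77.23)
speed_vs_greedy = efficiency_gain(1229.68, 184.5)
speed_vs_pso = efficiency_gain(903.95, 184.5)
print(round(gain_vs_pso, 2), round(speed_vs_greedy, 2), round(speed_vs_pso, 2))
```

Note that the efficiency figure divides by the runtime of the proposed method, so a value of several hundred percent means that the baseline takes several times longer, not that the proposed method takes negative time.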

Fig. 7. - Oil leak source detection instance of the proposed MC-DQTN-OLD method.

Fig. 8. - Accuracy comparison of the greedy method, the PSO method and the MC-DQTN-OLD method.

Fig. 9. - Runtime comparison of the greedy method, the PSO method and the MC-DQTN-OLD method.

SECTION V.

Conclusion

A marine oil spill is a pollution incident that seriously threatens the marine ecological environment, so it is essential to take timely measures to prevent further damage. At this stage, research is focused on extracting the oil film area from marine remote-sensing images to achieve monitoring effects. However, if the source of an offshore oil spill is not located in time to reduce the total amount of crude oil leaked from the source, the scale of marine oil spill pollution cannot be effectively controlled. Therefore, in this study, based on remote-sensing images, the DQTN algorithm was used to study the problem of locating oil spill points in marine oil spills. The main conclusions are as follows.

  1. This research used the digital twin architecture to tackle the oil leak monitoring and source detection issue. The DQTN algorithm iteratively moves the candidate location of the oil spill point, a marine oil spill model simulates the spill, and the simulation results are evaluated. This procedure is iterated until the stop criterion is satisfied, and the resulting point is taken as the location of the real oil leakage point. The DQTN algorithm makes full use of the results obtained from each oil leakage simulation in order to make the next decision. It can effectively use fewer resources in a more extensive solution set space to obtain the location of the oil spill point. In addition, the look-up table technique is utilized to improve the efficiency, and the Monte Carlo method is applied to tackle the uncertainty in the ECOM simulation.

  2. A major oil spill in the Penglai region of the Bohai Sea on June 2, 2011, caused huge damage to the marine environment and economic losses. Based on historical wind field data, a numerical simulation ECOM for marine oil leakage was constructed. Based on the remote-sensing imagery of June 11, the DQTN algorithm was used to search for the oil leak source at sea. After testing with different initial oil leak locations, the average accuracy was 97.54%, the highest accuracy was 98.97%, and the average time was 184.5 min. The effectiveness of the proposed method was confirmed.

This study was based on remote-sensing images, used the proposed MC-DQTN-OLD method, and evaluated the results of the oil spill simulation with the marine oil spill ECOM in an iterative search to locate the spill point of the oil spill event. An accurate estimate of the oil spill point can, to a certain extent, help staff quickly find the actual oil leakage location, reduce the environmental pollution from further oil leakage, and reduce the difficulty of environmental recovery after the disaster.
