Introduction
The Maritime Safety Committee (MSC) of the International Maritime Organization (IMO) defines a Maritime Autonomous Surface Ship (MASS) as “a ship which, to a varying degree, can operate independently of human interaction” [1]. Although a vessel performs several operations simultaneously, in this article we focus primarily on the problem of sensing for autonomy in navigation and situational awareness functions.
Automation of vessel navigation is aimed at increasing safety and efficiency. These manifest differently in the operational phases of a ship. Before departure, a ship route is planned. In this planning, automated route optimization is important with respect to weather conditions, especially in ice covered waters [2], [3]. While steaming, this route plan is altered if safety (or efficiency) can be increased. Autonomy may play a critical role here.
Autonomous systems consist of perception and control elements. On a ship, the perception elements include the ship positioning, RADAR, and other sensors that scan the environment, while control elements include, for example, the propulsion and steering systems. Control systems for ship maneuvering are well advanced, so that even the most difficult propulsion needs can be satisfied with so-called azimuth thrusters. Each thruster incorporates an (often electric) engine and a propeller in an underslung pod [4]. These azimuth thrusters can be rotated without restriction through a full 360 degrees in azimuth, enabling even the largest vessels to enter narrow harbors quickly and safely. Moreover, when Global Navigation Satellite System (GNSS) positioning is integrated with the control system in a so-called Dynamic Positioning (DP) system, a vessel can counteract the environmental forces acting on it in order to maintain its position and heading as close as possible to its working position (without anchor), or it can stay on course and steady rather than be carried away by the fluctuating winds and waves [5]. In contrast to these rather sophisticated control systems, integrated perception systems for the maritime environment are still inadequately developed for autonomous operations. There is a need to complement the well-developed RADAR and GNSS techniques with other perception sensors and multi-sensor fusion through Artificial Intelligence (AI).
The key benefits of multi-sensor perception systems are increased availability and integrity through complementary sensing, that is, targets that cannot be detected with one sensor may be detectable with another sensor, and redundancy, that is, an observation can be cross-validated from different sources. While multi-sensor perception systems are well-known in the context of autonomous cars [6], mobile mapping [7], airborne and Unmanned Aerial Vehicle (UAV) based remote sensing [8], and robotics [9], the maritime context has received less attention. This is because research in maritime perception systems is hindered by multiple factors. Maritime weather is harsh, especially in the polar regions. The harsh weather makes the maritime environment less attractive for initial sensor research, as the sensor systems need to be developed beyond the first experimental phases in order to be weather-proof. Performing research experiments on board a ship may become prohibitively expensive if it interferes with normal ship operations. Ship systems are protected by proprietary interfaces, which means that accessing the data requires sub-contracting, and such data is seldom inexpensive. The traditional approach to cyber-security of ships, and in many cases still the norm for safety and security conscious ship owners, is based on the principle that the ship systems are kept isolated from the Internet and ship/company intranets. Even when the proprietary interface can be accessed, the data transfer is usually serial and unidirectional. Maritime safety protocols and company Safety Management Systems (SMS), as encouraged by the IMO [10], also set boundaries for experimental ship instrumentation, as the functionality of the built-in systems must be guaranteed. However, these factors have not deterred the research community from initiating explorations in this field.
A white paper from the Advanced Autonomous Waterborne Applications (AAWA) initiative led by Rolls Royce [11] lists a range of perception sensors with strengths and weaknesses. It also depicts a futuristic vision of the manifestation of autonomous ships.
In recent years, artificial intelligence algorithms have achieved considerable success in both academia and industry. It is therefore natural to exploit AI techniques in autonomous ship navigation and sensor fusion problems as well. There are numerous studies and applications of AI on different maritime sensor data [11]–[13], but few focus on high-level situational awareness. In this article, we approach the problem from the viewpoint of fusing sensor data by means of AI methods to provide the required situational awareness and sensor integrity monitoring.
We set out to review the relevant background research, equipment, and methods regarding the perception systems for autonomous ships. This includes reviewing the suitable sensors and artificial intelligence techniques. As there is yet little research done in this field, our approach is to cover also those works that appear to be outside of the main area of interest but that introduce methods that are likely to be useful in constructing multi-sensor perception systems for maritime environments.
The paper is structured as follows. First, we present the state-of-the-art in this technology domain and a review of the regulations relevant for autonomous vessels. Second, we review the Key Performance Indicators (KPIs) for autonomous vessels and translate these into operational requirements. Third, we review the sensor technologies that are relevant with respect to these indices. Fourth, as the sensors put out data in several different formats, we review the artificial intelligence techniques that have been successfully applied in fusing multi-modal data. Lastly, we conclude the paper with recommendations about future work.
Background
Maritime transport is experiencing a similar evolution towards autonomous systems as road, rail, and air transport. The business drivers are largely the same, that is, autonomous and intelligent systems are foreseen to reduce the risk of human (operational) errors and their associated cost, and to enable new types of robotic operations which can, when properly implemented already in the design phase of the system and vehicle, also reduce building and operating costs.
A. Research and Projects
Autonomous vessel research and projects have been considerably concentrated in Northern Europe, more specifically in Norway and Finland. Lately, Asian countries have also entered this race, and we see activities emerging in China, Japan, Korea, and Singapore. The main activities are usually concentrated around a larger group of companies and research bodies, which is a clear indication of the willingness to cooperate in this industry. The best-known endeavours include the MUNIN project [14], the AAWA project [11], the Finnish OneSea Ecosystem with its Jaakonmeri test area, the Norwegian Yara Birkeland, SIMAROS, AUTOSEA [15] and ROMAS projects, the Chinese Unmanned Cargo Ship Development Alliance (UCSDA) and test areas, Japanese NYK’s plans for remote operation, and Singaporean endeavors related to autonomous maritime operations.
Other notable activities include test areas and joint industry projects in Belgium such as De Vlaamse Waterweg’s Smart Shipping initiative, the Dutch Joint Industry Project Autonomous Shipping and Smart Shipping Challenge (SMASH) in the Netherlands, and the Danish ShippingLab. We also note the Smart Ships Coalition Marine Autonomy Research Site (MARS) in the Great Lakes in Canada/USA, the UK Maritime Autonomous Systems Regulatory Working Group (MASRWG) and its UK Code of Practice for autonomous ships [16], as well as governmental work by the IMO on the topic of MASS and what needs to happen in the regulatory space [17]. Other relevant work is published by the European GNSS Agency (GSA) regarding User Needs and Requirements related to GNSS [18], white papers and studies by Futurenautics [19], and the Comité Maritime International (CMI) [20] on the topic of maritime law.
B. Situational Awareness Concepts
Various international projects have presented and outlined concepts for autonomous vessel navigation and situational awareness.
The MUNIN project [14] looked widely at the whole concept of an autonomous cargo ship and also presented a high-level software architecture for autonomous vessel control. The main principle included an Autonomous Ship Controller (ASC) and a Shore Control Centre (SCC) and involved the idea of an Advanced Sensor System (ASS), which feeds data to an ASC and an Autonomous Navigation System (ANS), as shown in Figure 1. The concept also included an Autonomous Engine Monitoring and Control System (AEMC).
Figure 1. MUNIN project: context and module diagram for autonomous ship control (reproduced from [14]).
The AAWA-project [11], one of the main up-to-date reference studies, outlined an ANS structure based on four modules, and this general structure is likely to be found in future autonomous vessel systems. The ANS was envisioned to consist of a Situational Awareness (SA) module, a Collision Avoidance (CA) module, a Route Planning (RP) module and a Ship State Definition (SSD) module. The ANS was further thought to be linked to a DP system, which in turn controls the ship’s propulsion and steering, as shown in Figure 2.
The Norwegian AUTOSEA-project [15] (Sensor fusion and collision avoidance for autonomous surface vehicles) has also presented a system concept based on a sensor fusion module consisting of imaging and navigation sensors, which together with AIS and external chart material produce target tracking data for a separate collision avoidance module linked to control system(s), as shown in Figure 3.
C. Regulations, Standards, and Practices
Looking at how autonomous vessels and autonomous vessel technology are envisioned to work, we can conclude that, from a purely technical perspective, a large number of performance standards and regulations already exist today concerning various types of sensor systems on board conventional seagoing ships. On the other hand, there are very few standards and regulations specifically targeted at autonomous ships, and the main actors in this arena, concerning new regulations and practices, seem to be the classification societies and the industry itself.
The main actors and entities for regulations, standards and guidelines are:
IMO – International Maritime Organisation
IACS – International Association of Classification Societies
The various classification societies themselves
ISO – International Organization for Standardization
IEC – International Electrotechnical Commission
ITU – International Telecommunication Union
IALA – International Association of Marine Aids to Navigation and Lighthouse Authorities
CIRM – International Association for Marine Electronics Companies
GSA – European GNSS Agency
Of the actors listed above, we note that the IMO is already working on its MASS strategy [17], where one of the first steps has been a comprehensive scoping exercise into existing regulations. Of the classification societies, the American Bureau of Shipping (ABS) is active in cyber security, while Bureau Veritas (BV), China Classification Society (CCS), Det Norske Veritas (DNV GL), Lloyd’s Register (LR), and ClassNK have all published guidance related to autonomous vessels and systems. ISO, ITU, IALA, and CIRM are not specifically focusing on autonomous vessels, but many of their existing technical standards are applicable as such to these new vessel types and their associated systems. GSA has quite recently published a comprehensive study [18] on the Position-Navigation-Timing (PNT) and GNSS user needs and requirements, which also factors in autonomous vessels. This study of existing regulations and standards feeds into establishing operational requirements for autonomous vessels, as described in Section III.
Requirements for Situational Awareness in Autonomous Vessels
The human navigator on board the vessel will trust technical devices such as differential GNSS (DGNSS), Electronic Chart Display and Information System (ECDIS), and RADAR to a varying degree, while older and extensively experienced captains in particular will have the least trust in any single device on board [3]. The captains tend to continuously cross-reference the ship’s location based on RADAR targets, the ECDIS screen (which derives its location from the DGNSS), and visual observation of the view outside the ship. This means that similar functionality should also be implemented on autonomous vessels, that is, the system itself should primarily distrust any single data source and prioritise cross-referencing and verification of data from multiple sensors.
Therefore, Situational Awareness (SA) means the autonomous vessel will be able to recognize the presence of certain objects along its intended maritime route using one or more of the installed sensors. It will also identify and classify these objects, and possibly perform ranging and validation by ensuring the results tally within a redundant subset of the sensors.
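The validation step described above, in which results must tally within a redundant subset of the sensors, can be illustrated with a minimal sketch. The function name `cross_validate`, the local east/north frame, the 10 m tolerance, and the example values are all assumptions made for illustration and are not taken from any of the reviewed systems.

```python
from itertools import combinations
from math import hypot


def cross_validate(estimates, tolerance_m, min_agreeing=2):
    """Find the largest group of sensors whose target positions agree.

    estimates: dict mapping a sensor name to an (east_m, north_m) estimate
    of the same target in a shared local frame (hypothetical convention).
    Two estimates agree when they lie within tolerance_m of each other.
    Returns (sensor_names, mean_position), or ([], None) when fewer than
    min_agreeing sensors agree, i.e. no single source is trusted alone.
    """
    names = sorted(estimates)
    for size in range(len(names), min_agreeing - 1, -1):
        for group in combinations(names, size):
            consistent = all(
                hypot(estimates[a][0] - estimates[b][0],
                      estimates[a][1] - estimates[b][1]) <= tolerance_m
                for a, b in combinations(group, 2)
            )
            if consistent:
                east = sum(estimates[n][0] for n in group) / size
                north = sum(estimates[n][1] for n in group) / size
                return list(group), (east, north)
    return [], None


if __name__ == "__main__":
    # Illustrative values only: the AIS report disagrees with the other two.
    observations = {
        "radar": (1000.0, 2000.0),
        "camera": (1004.0, 1997.0),
        "ais": (1150.0, 2100.0),
    }
    print(cross_validate(observations, tolerance_m=10.0))
```

A production system would weight each sensor by its error model rather than use a fixed tolerance; the exhaustive subset search is only practical here because the number of sensors is small.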
We therefore study requirements from three different viewpoints, namely requirements for vessel PNT, requirements for sensing, and requirements for the AI and Machine Learning (ML) software.
A. Requirements for Vessel PNT
Requirements for positioning have been derived by studying existing IMO requirements, DP requirements, and the GSA study of user needs. The proposed consolidated positioning values for port approach and coastal area operations are presented in Table I. The first line of each requirement shows the current IMO requirement level, while the second line indicates the proposed and achievable target value for autonomous vessels as well as what this proposed value is based on. For example, regarding fix intervals, the IMO requirement is one fix each second, 1 s (IMO), while we claim that an achievable and acceptable level for an autonomous vessel, to be tested with Proof-of-Concept (PoC) equipment, is two fixes each second, that is, a fix interval of 0.5 s.
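To make the difference between the two fix intervals concrete, the short sketch below computes how far a vessel travels between consecutive position fixes. The 20 kn speed is an illustrative assumption and is not a value from Table I.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one nautical mile per hour, in m/s


def distance_between_fixes_m(speed_kn: float, fix_interval_s: float) -> float:
    """Distance travelled between two consecutive position fixes."""
    return speed_kn * KNOT_IN_M_PER_S * fix_interval_s


if __name__ == "__main__":
    assumed_speed_kn = 20.0  # assumption for illustration only
    for interval_s in (1.0, 0.5):  # IMO level and proposed level
        travelled = distance_between_fixes_m(assumed_speed_kn, interval_s)
        print(f"{interval_s} s fix interval: {travelled:.2f} m between fixes")
```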
B. Requirements for Sensing
Requirements for sensing are somewhat more difficult to define, since these are largely unregulated. However, based on data from past studies (mainly MUNIN) and IMO sources, we present proposed consolidated values for target detection range in Table II. In addition to detection range, there are also other requirements which mainly relate to practical needs, such as being able to detect targets simultaneously with more than one sensor and the need to calculate trajectories and categorise targets with sufficient accuracy. The AI function of an autonomous vessel will also need a working output interface for sharing target data with downstream components.
C. Requirements for AI/ML Software
The AI software and ML functions are used to fuse data from different sensors, process this data, provide target detection and classification, analyse the data with regard to the operational requirements and perform an output of the results. The AI-function has therefore the following key requirements. It should have the ability to:
work offline from the Internet, once properly trained locally or in the cloud,
compute SA results in online mode within reasonable time (less than 60 s),
both detect and classify targets with reasonable accuracy (true and false positive, true and false negative),
analyse input sensor and its own performance and result quality (integrity monitoring of sensor assembly),
work with Commercial-Off-The-Shelf (COTS) hardware as well as custom equipment,
output data in a suitable format.
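The accuracy requirement in the list above is expressed through true and false positives and negatives. As a hedged illustration, the following sketch derives the usual summary figures from such counts. The function name and the example counts are hypothetical, and the source does not prescribe which metrics should be reported.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Summarise detector quality from confusion-matrix counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    Ratios with an empty denominator are reported as 0.0.
    """
    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "precision": ratio(tp, tp + fp),         # share of detections that are real
        "recall": ratio(tp, tp + fn),            # share of real targets detected
        "false_alarm_rate": ratio(fp, fp + tn),  # share of non-targets flagged
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in detection_metrics(tp=90, fp=10, tn=880, fn=20).items():
        print(f"{name}: {value:.3f}")
```

For collision avoidance, recall (missed targets) and false alarm rate usually matter more than overall accuracy, because open water dominates the negatives.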
Review of Sensors for Autonomous Vessels
A. Proposed Architecture for the Experimental Sensor Assembly
A comparison between the different maritime situational awareness sensors is shown in Table III. The original version of this table is available in [11]. The experimental sensor assembly proposed here includes 4 sensor families: visual (cameras), remote sensing (RADAR and LiDAR), audio (microphones), and localization (satellite navigation and Inertial Navigation System (INS)). In addition, AIS broadcasts can be integrated along with maritime data from other external databases. The most prominent AI techniques are related to image and sound processing, namely the detection and identification of objects or features embedded inside data snapshots generated by the different sensors.
In theory, there are a number of other potential sensors which can be included here, for example, depth-sensor, 3-dimensional Sound Navigation and Ranging (SONAR), Radio Direction Finder (RDF), Automatic Dependent Surveillance – Broadcast (ADS-B), and visibility meter. Some of these sensors already exist on board many ships. However, due to limitations of space and time we do not include them in this study.
A potential deployment strategy for the sensors is shown in Figure 4. This strategy assumes that the autonomous vessel is a ship of size greater than 12 m, although it can be easily adapted to other vessel types as well. GNSS/INS provides the absolute position for the ship. For sensor fusion, the detection zone can be divided into two regimes: (1) long range - from about 1 NM and above, where AIS, RADAR, and stereo cameras (in good weather conditions) are relevant, and (2) close range - below 1 NM, where LiDAR, cameras, and microphones are applicable. Cameras can provide overlapping functionality between the two regimes. The overall processing strategy in our suggested deployment is: (1) an object is detected at long range with conventional ship RADAR and can be matched against camera and AIS observations in terms of position, velocity, and heading, as stereo cameras enable the calculation of distances. (2) On close range, LiDAR is used in conjunction with AIS, cameras, and sound sensors. The data integration plan for the proposed sensor assembly is depicted in Figure 5.
Figure 4. Potential deployment strategy and spatial range of operation for the on-board sensors of an autonomous vessel.
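The two detection regimes of the suggested deployment can be written down as a simple lookup. This is a minimal sketch: the function name is hypothetical, and treating exactly 1 NM as long range is an assumption, since the text only gives an approximate boundary.

```python
LONG_RANGE_THRESHOLD_NM = 1.0  # approximate regime boundary from the text


def relevant_sensors(target_range_nm: float, good_weather: bool = True) -> list:
    """Return the sensors considered relevant for a target at a given range."""
    if target_range_nm >= LONG_RANGE_THRESHOLD_NM:
        sensors = ["AIS", "RADAR"]
        if good_weather:
            # Stereo cameras contribute at long range only in good weather.
            sensors.append("stereo camera")
        return sensors
    # Close range: LiDAR used in conjunction with AIS, cameras, and microphones.
    return ["LiDAR", "AIS", "camera", "microphone"]


if __name__ == "__main__":
    print(relevant_sensors(5.0))
    print(relevant_sensors(5.0, good_weather=False))
    print(relevant_sensors(0.3))
```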
B. Sensors for Precise Absolute Positioning
Global Navigation Satellite Systems (GNSS) have become the primary source of position and timing information for the vessel bridge. GNSS currently offers users a number of constellations (American GPS, European Galileo, Russian GLONASS, and Chinese BeiDou) each transmitting more than one signal on separate frequencies. Additionally, augmentation systems and sophisticated techniques for error mitigation aim to increase the quality, availability, and integrity of the Position-Velocity-Time (PVT) solution [21]. The COTS GNSS receivers available today vary significantly in their capabilities depending on the target market segment.
Table IV lists some example receivers relevant to the maritime domain. The design of maritime-oriented GNSS receivers enables their use also in harsh weather conditions. Usually, the receiver and antenna are assembled in one rugged enclosure to be mounted in a place with a good sky view. Some receivers support two external antennas for heading determination [22], [23]. However, a common technique to determine heading, roll, and pitch is to use an Inertial Measurement Unit (IMU) integrated with the GNSS receiver [24]. Such integration also improves the position accuracy under dynamic conditions. The standards used for interfacing the PVT data to a computer or a display unit are typically NMEA 0183 and its successor NMEA 2000 [25] developed by the National Marine Electronics Association. At the moment of this writing most of the maritime receivers support both of these standards.
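As an illustration of the NMEA 0183 interface mentioned above, the sketch below validates and parses a GGA position sentence. It covers only the ASCII-based NMEA 0183 format (NMEA 2000 is a binary, CAN-based protocol and is not handled). The helper names are hypothetical, and the example sentence is a synthetic one built in the code rather than receiver output.

```python
def nmea_checksum(body: str) -> int:
    """XOR of all characters between '$' and '*', as defined by NMEA 0183."""
    checksum = 0
    for char in body:
        checksum ^= ord(char)
    return checksum


def build_sentence(body: str) -> str:
    """Wrap a sentence body with '$', '*', and its two-digit hex checksum."""
    return f"${body}*{nmea_checksum(body):02X}"


def to_decimal_degrees(value: str, hemisphere: str) -> float:
    """Convert ddmm.mmm (latitude) or dddmm.mmm (longitude) to degrees."""
    dot = value.index(".")
    degrees = int(value[:dot - 2])
    minutes = float(value[dot - 2:])
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> dict:
    """Parse a GGA sentence after verifying its checksum."""
    if not sentence.startswith("$") or "*" not in sentence:
        raise ValueError("not an NMEA 0183 sentence")
    body, _, given = sentence[1:].partition("*")
    if int(given.strip()[:2], 16) != nmea_checksum(body):
        raise ValueError("checksum mismatch")
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    return {
        "utc": fields[1],
        "lat_deg": to_decimal_degrees(fields[2], fields[3]),
        "lon_deg": to_decimal_degrees(fields[4], fields[5]),
        "fix_quality": int(fields[6]),
        "satellites": int(fields[7]),
    }


if __name__ == "__main__":
    example = build_sentence(
        "GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,"
    )
    print(parse_gga(example))
```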
From Table IV it can be observed that more and more maritime receivers are starting to provide support for multiple constellations, multiple frequencies, and precise positioning techniques. We believe that these features are necessary, given that operational requirements for positioning accuracy, availability, and robustness to interference are expected to become more stringent in future autonomous vessels. For example, using Real-Time Kinematic (RTK) is likely to be useful in port approaches where the vessel proximity to the land-based GNSS base station is within the recommended limits.
Support for multiple constellations significantly increases the number of potentially visible satellites and, consequently, the availability of navigation information. The ability of a receiver to track multiple signals from the same satellite is essential for improving positioning accuracy by compensating for the frequency-dependent ionospheric delay. Additional signals have other advantages as well, provided by the unique design of their signal structure. The most significant benefit is increased resistance to interference. Furthermore, support for RTK enables the receiver to provide centimeter-level accuracy by utilizing corrections generated by national, regional, or international Continuously Operating Reference Station (CORS) networks, which may be publicly owned or commercial.
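The compensation of the frequency-dependent ionospheric delay can be illustrated with the standard first-order ionosphere-free combination of two pseudoranges. The sketch below uses the GPS L1 and L2 carrier frequencies. The simulated range and electron content are illustrative assumptions, and the first-order delay model 40.3 TEC / f^2 neglects higher-order terms.

```python
F_L1_HZ = 1575.42e6  # GPS L1 carrier frequency
F_L2_HZ = 1227.60e6  # GPS L2 carrier frequency


def iono_delay_m(tec_el_per_m2: float, freq_hz: float) -> float:
    """First-order ionospheric group delay in metres."""
    return 40.3 * tec_el_per_m2 / freq_hz ** 2


def ionosphere_free_range_m(p1_m: float, p2_m: float,
                            f1_hz: float = F_L1_HZ,
                            f2_hz: float = F_L2_HZ) -> float:
    """Combine two pseudoranges so that the first-order delay cancels."""
    return (f1_hz ** 2 * p1_m - f2_hz ** 2 * p2_m) / (f1_hz ** 2 - f2_hz ** 2)


if __name__ == "__main__":
    true_range_m = 22_000_000.0  # assumed geometric range
    tec = 50e16                  # assumed slant electron content (50 TECU)
    p1 = true_range_m + iono_delay_m(tec, F_L1_HZ)
    p2 = true_range_m + iono_delay_m(tec, F_L2_HZ)
    print(f"L1 delay: {p1 - true_range_m:.2f} m")
    print(f"L2 delay: {p2 - true_range_m:.2f} m")
    residual = ionosphere_free_range_m(p1, p2) - true_range_m
    print(f"Residual after combination: {residual:.6f} m")
```

The combination removes the first-order delay at the cost of amplified measurement noise, which is one reason receivers may prefer other correction strategies over short baselines.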
C. RADAR and LiDAR
Radio Detection and Ranging (RADAR) and Light Detection and Ranging (LiDAR) measure ranges using radio frequencies and visible or infrared light, respectively. Ranging devices have an emitter that transmits signals and a receiver that measures the Time-of-Flight (ToF) delay and arrival direction of pulses reflected from target surfaces. The intensity of the signal that is reflected from a given target depends on the target characteristics, such as reflectivity and size, that is, its cross section.
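The ToF principle shared by both sensor types reduces to halving the round-trip path. A minimal sketch follows, assuming propagation at the vacuum speed of light; the function name and the example delay are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_tof_m(round_trip_delay_s: float) -> float:
    """Target range from a measured round-trip Time-of-Flight delay."""
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2.0


if __name__ == "__main__":
    delay_s = 12.36e-6  # round trip of roughly one nautical mile
    print(f"{range_from_tof_m(delay_s):.1f} m")
```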
A significant difference between RADARs and LiDARs is in the spatial dispersion of the signal. RADARs use relatively wide beam width antennas making it very difficult to distinguish small structural details of the target. Modern LiDARs, on the other hand, are almost exclusively based on lasers and consequently have very narrow and well collimated beams. Hence, LiDAR can construct a more detailed model of the target, even from a distance. The downside of LiDARs is that they are very susceptible to weather phenomena, for example, precipitation. In contrast, radio waves penetrate clouds, smoke, and fog better than visual wavelengths and therefore, RADARs are the obvious choice for the main long range remote sensing system on-board ships.
International convention mandates the use of X-band RADAR on ships of 300 gross tonnage and above. A second, typically S-band RADAR is required on ships with 3000 gross tonnage or more. Table V lists the basic differences between S- and X-band RADARs [26].
The typical minimum operating range of a marine RADAR depends on the vertical beam width of the RADAR antenna and hence on the height difference between the RADAR and the target. For larger vessels, in which the RADARs are usually placed at the top of the ship, the minimum range can be several hundred meters. The maximum detection range for a given target depends on the receiver sensitivity, and either on the transmitter power for continuous wave RADAR or the emitted pulse energy for pulse RADAR. It is common to have the RADAR horizon beyond the visual horizon due to refraction of radio waves in the atmosphere. The range can therefore easily exceed 10 NM for typical RADAR and surface target heights.
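The effect of refraction on the RADAR horizon can be estimated with the common effective Earth radius approximation. In the sketch below, the factor 4/3 for standard atmospheric refraction and the antenna and target heights are assumptions chosen for illustration; actual propagation conditions vary.

```python
from math import sqrt

EARTH_RADIUS_M = 6_371_000.0
METERS_PER_NM = 1852.0


def horizon_range_nm(antenna_height_m: float, target_height_m: float,
                     refraction_factor: float = 4.0 / 3.0) -> float:
    """Line-of-sight range between an antenna and a surface target.

    refraction_factor scales the Earth radius: 1.0 gives the purely
    geometric horizon, 4/3 is a common assumption for radio waves.
    """
    effective_radius_m = refraction_factor * EARTH_RADIUS_M
    antenna_horizon_m = sqrt(2.0 * effective_radius_m * antenna_height_m)
    target_horizon_m = sqrt(2.0 * effective_radius_m * target_height_m)
    return (antenna_horizon_m + target_horizon_m) / METERS_PER_NM


if __name__ == "__main__":
    # Assumed heights: antenna 25 m above sea level, target 10 m high.
    print(f"RADAR horizon:     {horizon_range_nm(25.0, 10.0):.1f} NM")
    print(f"Geometric horizon: {horizon_range_nm(25.0, 10.0, 1.0):.1f} NM")
```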
The implementation and technology of RADARs used in commercial and civilian vessels have remained relatively unchanged for several decades and can be considered mature and reliable. The most significant recent trend lies in the increasing adoption of fully solid state transmitter designs. These systems overcome reliability and controllability issues in traditional magnetron-based designs, allowing novel and more agile signal processing methods, for example, pulse compression. With pulse compression, range resolution and target detection can be improved. In addition, Doppler measurements can be performed from a single echo pulse. Furthermore, solid state transmitters are considered more stable with less internal noise, allowing increased sensitivity. New state-of-the-art technologies in other RADAR applications include phased array antennas, enabling electronic scanning and beam steering. While such systems and technology certainly provide better performance, it is unclear whether the impact on navigational safety would offset the significantly higher cost and complexity of these systems.
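The range resolution benefit of pulse compression follows from the textbook relations: an unmodulated pulse resolves targets separated by half the pulse length, whereas a compressed pulse resolves them according to the transmitted bandwidth. The pulse width and bandwidth used below are illustrative assumptions, not parameters of any specific marine RADAR.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def resolution_unmodulated_m(pulse_width_s: float) -> float:
    """Range resolution of a simple unmodulated pulse."""
    return SPEED_OF_LIGHT_M_PER_S * pulse_width_s / 2.0


def resolution_compressed_m(bandwidth_hz: float) -> float:
    """Range resolution after pulse compression of a modulated pulse."""
    return SPEED_OF_LIGHT_M_PER_S / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    pulse_width_s = 1e-6  # assumed 1 microsecond pulse
    bandwidth_hz = 20e6   # assumed 20 MHz modulation bandwidth
    print(f"Unmodulated: {resolution_unmodulated_m(pulse_width_s):.1f} m")
    print(f"Compressed:  {resolution_compressed_m(bandwidth_hz):.2f} m")
```

Because resolution is decoupled from pulse length, a solid state transmitter can emit long, low-power pulses for detection range while retaining fine range resolution.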
Since the class of vessels for autonomous navigation considered in this study will most probably have at least one mandatory RADAR, it seems reasonable to utilize these existing RADARs to the largest possible extent. Consequently, as the existing RADARs fulfill the mandatory requirements and are among the pivotal equipment relied on for safe navigation, we expect the currently available systems to be sufficient for autonomous systems. We will therefore skip a detailed discussion on suitable RADAR selection for autonomous vessels in this article and discuss LiDARs instead.
There are reasons why LiDARs are not currently used in maritime detection. For instance, their use is limited by the constraint that the laser power usually cannot be increased due to eye-safety issues. Lower cost commercial LiDARs are typically geared towards automotive applications, where the range requirements are below 300 m and the main design goals are size and cost. For these LiDARs, the typical operational range is from 0.1 m to 200 m for targets with 80% reflectivity. For darker targets, the range decreases rapidly. Since the range of these devices is limited, they typically have rather low angular resolution. This also limits the operational range, as smaller targets can pass undetected when they are sufficiently far away. Therefore, these LiDARs can be considered inadequate for larger vessels, but could suffice for smaller slow moving vessels.
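Both limitations can be quantified with a rough sketch. It assumes an extended diffuse target, for which the returned power scales with reflectivity divided by range squared, so that the maximum range scales with the square root of reflectivity. Real devices deviate from this simple model. The 10% reflectivity and the 0.2 degree angular resolution are illustrative assumptions.

```python
from math import radians, sqrt, tan


def scaled_max_range_m(reference_range_m: float,
                       reference_reflectivity: float,
                       reflectivity: float) -> float:
    """Scale a rated maximum range to a target of different reflectivity.

    Assumes returned power proportional to reflectivity / range**2
    (extended diffuse target) and a fixed detection threshold.
    """
    return reference_range_m * sqrt(reflectivity / reference_reflectivity)


def point_spacing_m(range_m: float, angular_resolution_deg: float) -> float:
    """Distance between adjacent LiDAR beams at a given range."""
    return 2.0 * range_m * tan(radians(angular_resolution_deg) / 2.0)


if __name__ == "__main__":
    # Rated 200 m at 80% reflectivity (from the text), dark 10% target assumed.
    print(f"Range at 10% reflectivity: {scaled_max_range_m(200.0, 0.8, 0.1):.1f} m")
    # Assumed 0.2 degree resolution: targets narrower than the beam spacing
    # may fall between adjacent beams.
    print(f"Beam spacing at 200 m: {point_spacing_m(200.0, 0.2):.2f} m")
```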
Longer range LiDARs are manufactured for e.g. geological survey purposes, and these instruments can achieve measuring distances of several kilometers. This increase in the range is typically achieved with larger and more efficient collection optics. Unfortunately, this optics size-to-distance relationship is exponential, making optical improvements increasingly expensive. Therefore, LiDAR research has focused on improving photosensor sensitivity and read-out electronics noise margins. Currently, the research on LiDAR optics and electronics is shifting to single-photon techniques [27]. The follow-up efforts (e.g. [28]) may well lead to significant advances also for maritime detection, even if advances are not achieved within the more traditional pulsed- and continuous-wave techniques.
When considering LiDARs for maritime environment, in addition to the measuring range and angular resolution, of particular interest is the scanning pattern or the horizontal and vertical Field-of-View (FOV). The first commercial LiDARs utilized rotating optics which allowed 360 degree view horizontally, while vertical FOV was limited by the number of discrete laser transmitter-receiver pairs on the unit and their angular separation. Due to the cost of implementing freely rotating optics, more recent LiDAR designs have adopted different scanning techniques which generally have more limited FOVs. Novel LiDAR technologies, such as flash LiDARs and optical beam steering, are inherently limited in FOV. Consequently, several units need to be employed if 360 degree FOV is required. For smaller vessels, 360 degree scanning patterns might be suitable as the LiDAR can be placed at the top of the ship without significant blind spots around the ship. For larger vessels, the blind spots can become significant, and hence, several discrete LiDARs would have to be employed around the ship for full coverage.
In addition, if LiDARs with a 360 degree view are used, a significant amount of the scanning time is lost due to the ship blocking the view. Furthermore, the corresponding sections would likely need to be removed in real-time from the point cloud data due to bandwidth issues. Therefore, it seems reasonable to utilize multiple discrete LiDARs with limited FOVs dispersed around the vessel. This also allows finer control over the resolution in pivotal directions, e.g. the front of the ship.
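Removing the sections of a 360 degree scan that are blocked by the ship's own structure amounts to filtering points by azimuth. The sketch below is one possible way to do this. The sensor frame (x towards the bow, azimuth measured from +x towards +y) and the blocked sector are hypothetical and would come from the actual installation geometry.

```python
from math import atan2, degrees


def in_sector(azimuth_deg: float, start_deg: float, end_deg: float) -> bool:
    """True if azimuth lies between start and end, handling wrap-around at 360."""
    span = (end_deg - start_deg) % 360.0
    return (azimuth_deg - start_deg) % 360.0 <= span


def drop_blocked_points(points, blocked_sectors):
    """Keep only the points whose azimuth is outside all blocked sectors.

    points: iterable of (x, y, z) tuples in the sensor frame.
    blocked_sectors: list of (start_deg, end_deg) azimuth intervals.
    """
    kept = []
    for x, y, z in points:
        azimuth_deg = degrees(atan2(y, x)) % 360.0
        if not any(in_sector(azimuth_deg, s, e) for s, e in blocked_sectors):
            kept.append((x, y, z))
    return kept


if __name__ == "__main__":
    cloud = [(10, 0, 0), (-10, 0, 0), (0, 10, 0), (-10, -1, 0)]
    # Assumed: the superstructure blocks the sector from 150 to 210 degrees.
    print(drop_blocked_points(cloud, [(150.0, 210.0)]))
```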
When considering LiDAR selection for autonomous vessels, it is obvious that the low cost segment of the currently available commercial LiDARs is inadequate for most use-cases. On the other hand, the survey level LiDARs are prohibitively expensive and mostly not designed for harsh conditions, for example, constant motion and extreme weather. Most importantly, the long range survey LiDARs tend to have lasers exceeding the eye-safety limits, making their use questionable. For larger vessels, a combination of low cost and higher end units could be considered. It is also likely that advances in LiDAR technology will improve the range and resolution in the near future, enabling a cost-performance class suitable for maritime use. Testing LiDAR suitability and performance in a maritime environment could be done using low cost sensors and then scaled up when the technology advances. We have listed in Table VI commercially available mobile laser scanning LiDAR units that could be utilized for preliminary evaluation purposes.
D. Visual Sensors
By visual sensors, we mean all sensors which capture at least a two-dimensional image, one similar to a human eye. These include RGB (Red-Green-Blue), monochrome, and infrared digital cameras. Digital cameras can be used for positioning, ranging, and object detection and classification, all of which are essential tasks for an autonomous system. However, we note that in open sea conditions, there are no landmarks to perform camera-based positioning.
Measuring from images, that is, photogrammetry, has long been used in surveying and various industries [29]. For situational awareness, the optical system needs to be designed to have a suitable effective range, which is determined through the concept of (ground) sampling distance, that is, the distance between two adjacent pixels measured on the surface of the target. For example, with a sampling distance of 0.5 m at 1 km range, a ship with a beam of 30 m would appear on the image with a front 60 pixels wide. This sets limits for precision in ranging and in object classification.
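The sampling distance example can be reproduced with the pinhole camera relation, in which the sampling distance equals range times pixel pitch divided by focal length. The pixel pitch and focal length below are hypothetical values chosen so that the result matches the 0.5 m at 1 km used in the text.

```python
def sampling_distance_m(range_m: float, pixel_pitch_m: float,
                        focal_length_m: float) -> float:
    """Footprint of one pixel on a target at the given range (pinhole model)."""
    return range_m * pixel_pitch_m / focal_length_m


def pixels_across(target_width_m: float, sampling_dist_m: float) -> float:
    """Number of pixels covered by a target of the given width."""
    return target_width_m / sampling_dist_m


if __name__ == "__main__":
    # Assumed optics: 5 micrometre pixel pitch and 10 mm focal length.
    gsd = sampling_distance_m(1000.0, 5e-6, 10e-3)
    print(f"Sampling distance at 1 km: {gsd:.2f} m")
    print(f"Pixels across a 30 m beam: {pixels_across(30.0, gsd):.0f}")
```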
Ranging with cameras is usually done with a stereo camera setup, where the two cameras used and the target form a triangle. Otherwise, monocular distance estimation methods exploit the camera movement, such as in [30], but these methods can only estimate the distance to static targets. More recently, methods which do not require camera movement have started to emerge, such as [31], [32].
1) Ranging With Stereo Cameras:
A stereo camera consists of two (monocular) cameras. These cameras need to take images at the same time (time synchronization) and the relative position of the two cameras with respect to one another must be known. This setup allows ranging by triangulation. In [33], object detection in a maritime environment is achieved at a range of 500 m. However, it is likely that much longer distances are accessible with suitable optics and methodology.
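The range estimation error $\Delta Z$ of a stereo camera can be approximated as \begin{equation*}\Delta Z = \frac {Z^{2}}{f B} \Delta D,\end{equation*} where $Z$ is the range to the target, $f$ is the focal length, $B$ is the baseline between the two cameras, and $\Delta D$ is the disparity error.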
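The quadratic growth of the range error with distance is easy to see numerically. The sketch below evaluates the error expression above, with the focal length and disparity error expressed in pixels. The focal length, baseline, and disparity error are illustrative assumptions and do not describe the setup of [33].

```python
def stereo_range_error_m(range_m: float, focal_length_px: float,
                         baseline_m: float, disparity_error_px: float) -> float:
    """Evaluate Delta Z = Z**2 / (f * B) * Delta D for a stereo rig."""
    return range_m ** 2 / (focal_length_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    # Assumed rig: 4000 px focal length, 2 m baseline, 0.25 px disparity error.
    for target_range_m in (250.0, 500.0, 1000.0):
        error_m = stereo_range_error_m(target_range_m, 4000.0, 2.0, 0.25)
        print(f"Z = {target_range_m:6.0f} m -> range error {error_m:.2f} m")
```

Doubling the range quadruples the error, whereas doubling the baseline or the focal length only halves it, which is why long range stereo ranging on a ship favours wide baselines.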
Limitations include the robustness (and possibly the frequency) of the calibration of the stereo rig. Although stereo vision is typically done using higher resolution cameras, that is, RGB or monochrome cameras, the same techniques directly apply also for infrared cameras [35].
2) Digital Cameras for Maritime Use:
A short summary of the comparison between different camera types is shown in Table VII. There are so many COTS systems that we do not list them here but rather discuss their different types.
Monochrome and color cameras have their own strengths and weaknesses, and the choice depends on the application. To obtain color images, cameras employ a color filter array and infrared cut filter. This will both reduce the number of photons and limit the wavelengths that can reach the sensor. This makes monochrome cameras better in low light situations and in those situations where color information is not needed. Hence, in ideal maritime conditions, color cameras are better suited for detection and classification of objects, but in non-ideal conditions monochrome cameras are likely to be better for that. In fact, the two camera types can be seen as complementary to each other, and many sensor setups utilize both of these. For example, the autonomous car sensor setups of [36] and [37] utilize both monochrome and color cameras. Limitations of RGB and monochrome cameras include that they are reliant on good visibility, and therefore they are heavily affected by rain, fog, darkness, and other phenomena affecting visibility. However, it should be noted that during the night most surface vessels use lights, and these can be seen on images.
Infrared (IR) cameras sense thermal radiation. This makes them an interesting choice in maritime conditions because most objects of interest, such as ships and humans, have a temperature very different from that of the water. Therefore, these objects will generally be clearly visible in the images, including at night. However, IR cameras have a number of disadvantages as well. Currently they are much more expensive and have considerably poorer resolution than cameras operating in visible light, which limits the precision of ranging and of object detection and classification. In spite of this, IR cameras have already seen use in maritime conditions, for example in [38]. If the technology improves further and makes IR cameras more accessible, then they could become a core sensor in autonomous vessels.
Fusing cameras with RADAR and AIS, such as in [39], [40], appears as an attractive goal. The cameras need a protective casing or they need to be placed inside the ship, for example on the bridge. Especially in large ships, many windows have hydrophobic coating, which allows cameras to have good visibility even from inside the bridge. However, most infrared wavelengths do not penetrate window glass, and therefore infrared cameras should be placed outside and be rugged.
E. Audio Sensors
Microphones, and especially microphone arrays, have the potential to provide valuable information for context awareness in maritime applications. Different types of vessels could be detected, classified, localized and tracked by analyzing the sounds that they produce, such as those from motors (e.g., propulsion, ventilation, cranes) and whistles. Also, the automatic detection of certain events (e.g., fault detection, an object/person falling into the water) might be possible by analyzing sounds. In some cases, the sound analysis can produce new relevant information that enhances the situational picture, such as the position of small boats not equipped with AIS. In others, the information retrieved might be similar to that obtained from other sensors, such as AIS messages. In those cases, the redundancy can be used for cross-validation and to trigger different levels of alarm. This potential means that microphone arrays deserve further study in relation to maritime context awareness applications.
1) General Considerations for Maritime Applications:
Regarding context awareness in maritime applications, and from a general, high level functional perspective, there are several criteria that one has to consider when selecting a suitable microphone to be used on its own or as part of a microphone array. First, the microphone has to be placed outdoors. Thus, it obviously has to withstand harsh weather conditions and other stresses associated with long term outdoor maritime settings, such as wind, rain, snow, temperature variations, exposure to sun, and salt water. Second, its general performance must be well defined, stable and predictable over time. This is especially critical in microphone arrays, where the performance of all the array elements shall match. Third, it shall be optimized to cope with and minimize the effects of the different noise sources that one can expect from the context, such as the wind, sea, rain and noises induced by the ship itself.
We have not found microphones explicitly designed for maritime applications, nor scientific literature regarding maritime experimental settings from which one can take recommendations. There are, however, commercially available measurement microphones designed for long term outdoor use, and thus we consider these as the most suitable ones for maritime applications. Also, the outcomes of different relevant disciplines such as acoustic sound localization, acoustic environmental classification and/or outdoor acoustic signal acquisition, can provide valuable lessons directly applicable in a maritime context.
Outdoor acoustic signal acquisition is a rather mature field, and its evolution has brought more stable microphones in harsh weather conditions over time, as well as better wind and water protection systems such as grids, foams and fur. However, these protections do not completely remove the effect of environmental factors, and some of them can still greatly affect the performance of the sound registration and analysis. Particularly important is the wind-induced noise produced by the interaction of the wind with the elements surrounding the microphone and the microphone itself, including the windscreen, as it can prevent the extraction of useful contextual information [41], [42]. Clearly, the microphones need to have state-of-the-art windshields, but these cannot fully compensate for the wind effects. Fortunately, different signal processing techniques can be used to isolate and mitigate them [43], [44], [41], [45]. Techniques using microphone arrays are especially effective [44], [46].
Other questions pertaining to microphone placement also need to be considered. Higher positions (e.g., on masts on the uppermost deck) will typically reduce the vessel's self-produced noise (including sea-induced noise) and reflections, and will increase the visibility angle. On the other hand, they will increase the exposure to wind. A trade-off therefore has to be made, for which a preliminary field study might be useful.
2) Microphone Selection:
The task of a microphone is to convert the oscillations of the air pressure arriving at the sensor into electrical signals. There are different physical principles that can be exploited for this conversion, and depending on which one is exploited, we can find carbon, magnetic/dynamic, condenser, piezoelectric and optical microphones, each with its own characteristics that make it more suitable for particular applications. The most common ones are the condenser and dynamic types, in that order. Table VIII presents the main parameters that describe the performance of a microphone. Measurement microphones differ from ordinary microphones in that they are optimized for one or several of these parameters, depending on the application, and for stability over time. Measurement microphones for environmental monitoring are generally, if not invariably, of the condenser type.
There are different configuration aspects that need to be decided when choosing a condenser measurement microphone based on the particularities of the application and signal that is to be captured.
The diaphragm diameter mostly affects the frequency response. Diaphragms of 1/2” have good general purpose characteristics.
The microphones themselves affect the sound that they are capturing. Three types of microphone are optimized to minimize this effect in different measurement conditions: free field, pressure, and random incidence. Free field ones are optimized for sounds coming mostly from one direction.
The capacitor of the diaphragm needs polarization. There are externally polarized microphones (these require a 200 V power line and usually lead to more expensive setups) and prepolarized ones (these use simpler cables, e.g., coaxial, and typically lead to lower cost setups).
The microphone’s sensitivity is a good indicator of its health, and thus tracking its sensitivity is the best way to assess the microphone’s stability. Measurement microphones, especially those for outdoor use, require periodic checking and calibration, for example, on-site every three months and every 18 months in a certified laboratory, depending on the application and working conditions.
The most typical use of outdoor measurement microphones is noise level monitoring in urban scenarios and airports. The goals are typically to understand the underlying phenomena, assess the effectiveness of actions against noise and/or measure the noise pressure or intensity so that complaints can be verified with an accuracy that allows for law enforcement. Other uses with requirements closer to context awareness in maritime scenarios are those targeting source localization and context recognition using microphone arrays. As an example of a commercially available solution, Rion’s Aircraft Noise Monitoring System [47] uses a four element microphone array mounted on a single mast to detect and estimate the direction of arrival of the aircraft’s sound. It also identifies the aircraft from its transponder data, and associates the noise measurements with specific airplanes. It can also classify the activity of the aircraft (e.g., landing, taxiing, take off, and engine testing) based on the analysis of the sound. Other close examples of available solutions are real-time acoustic gunshot detectors. These estimate the position of the origin of a shot by analyzing the shock-waves and muzzle blasts produced by the projectile and received by an array of sensors/microphones. These can be based on microphone arrays mounted on a single mast [48], [49] as well as distributed across large distances [50].
Finally, Table IX presents examples of outdoor measurement microphones that can be found in the market together with some of their main characteristics.
F. AIS Receivers
Automatic Identification System (AIS) is a Very High Frequency (VHF) system used for broadcasting the location of vessels, fixed and floating Aids to Navigation (AtoN), or other obstacles in the sea such as oil platforms and wind farms [51]. A vessel equipped with an AIS receiver will be able to locate these objects in its vicinity irrespective of the visibility conditions or if the nearby vessel is approaching from non-line-of-sight in inland or archipelago waterways [52]. In addition to the location, AIS messages may also contain the vessel’s dynamic information, static information, and voyage related information. Most commercial vessels will broadcast their own information via AIS messages. Such vessels therefore carry a transponder (also called transceiver) capable of both transmission and reception of AIS messages.
Two VHF channels are used for the communication, called AIS1, or channel 87B (161.975 MHz) and AIS2, or channel 88B (162.025 MHz) [53]. Therefore, AIS receivers can be either single- or dual-channel. The benefit of dual-channel is that it will display more information, complete messages, and more frequently updated information than a single channel receiver [54]. A basic AIS receiver assembly includes the receiver module, a VHF antenna with cables, a laptop with software to record and interpret the message streams, and a separate power source if the assembly is not powered via the laptop.
COTS AIS devices come in a wide variety of makes and models, which can be compared based on the supported frequency channels (single- or dual-), spatial range (in Nautical Miles (NM)), format of the output data (e.g., NMEA 0183), connection (USB or other), power supply type, and cost. Some example state-of-the-art commercial AIS receivers and transponders are compared in Table X.
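Receivers that output NMEA 0183 deliver AIS messages as comma-separated `!AIVDM` sentences terminated by a two-digit hexadecimal checksum. The following minimal sketch checks that checksum and splits the sentence into its fields before any payload decoding is attempted. The function names are ours, the sample payload string is only illustrative, and decoding the 6-bit payload itself is deliberately left out.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between the leading '!'/'$' and the '*'."""
    return format(reduce(lambda acc, ch: acc ^ ord(ch), body, 0), "02X")


def is_valid_sentence(sentence: str) -> bool:
    """Return True if the sentence framing and checksum are consistent."""
    sentence = sentence.strip()
    if len(sentence) < 4 or sentence[0] not in "!$" or "*" not in sentence:
        return False
    body, _, received = sentence[1:].rpartition("*")
    return nmea_checksum(body) == received.upper()


def split_aivdm(sentence: str) -> dict:
    """Split an AIVDM sentence into its comma-separated fields."""
    body = sentence.strip()[1:].rpartition("*")[0]
    fields = body.split(",")
    return {
        "tag": fields[0],                 # "AIVDM" for received messages
        "fragment_count": int(fields[1]),
        "fragment_number": int(fields[2]),
        "channel": fields[4],             # "A" (AIS1) or "B" (AIS2)
        "payload": fields[5],             # 6-bit armoured data, not decoded here
        "fill_bits": int(fields[6]),
    }


# Illustrative sentence body; the checksum is computed rather than hard-coded.
body = "AIVDM,1,1,,A,13u?etPv2;0n:dDPwUM1U1Cb069D,0"
sentence = f"!{body}*{nmea_checksum(body)}"

print(is_valid_sentence(sentence))
print(split_aivdm(sentence)["channel"])
```

Rejecting sentences with a bad checksum at this stage keeps corrupted VHF receptions out of the downstream fusion logic.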
G. Public Maritime Datasets
A-priori recorded and archived data accessible via the internet is another source of situational awareness information to future autonomous vessels. This data can also be used for training the machine learning algorithms and as supplementary information to verify the correctness and completeness of real-time situational awareness provided by the on-board sensor assembly - in effect monitoring the reliability of the deployed sensor system. For instance, the position of a navigation aid detected along the vessel path can be cross-checked by referring to the public register of aids to navigation, if one is available from the local maritime authority. Table XI presents some examples of public maritime datasets relevant to the Baltic Sea.
Review of AI Techniques for Autonomous Vessels
The primary focus of this section is to provide an overview of the state of the art in artificial intelligence methods that are relevant for autonomous vessels. Such methods are characterised by their application area (e.g., traditional or autonomous navigation, navigation phase), by their data requirements, by computational complexity (e.g., online or offline), and by robustness. We, on the other hand, are interested in problems and challenges in maritime scenarios, for example, situational abnormality detection, vessel classification, and localization. Hence, we attempt to map the operational requirements of Section III onto (well-defined) problems that may be solved with AI methods.
AI is a broad concept. John McCarthy referred to it as “the science and engineering of making intelligent machines, especially intelligent computer programs” [63]. The term AI in this context mainly refers to machine learning methodology that is used for regression or classification problems [64], [65]. Popular examples are Deep Learning (DL) and Gaussian Processes (GPs) [66], [67]. They have already been used successfully in the maritime domain, for example, in detecting and classifying ships from images (see, e.g., Table XII) and analyzing the navigational behavior of observed ships (see, e.g., Table XIII). These tables are discussed further in Sections V-C and V-E.
A. Key Requirements of AI for Maritime Problems
As stated previously, safety is the key in autonomous maritime systems, and thus the algorithms need to be robust in diverse operational situations. We understand the robustness of AI as the generalisation capability of AI methods, which is gradually becoming more important in this field [64]. The Probably Approximately Correct (PAC) framework states that better generalisation performance of a given model can be achieved with a larger training dataset [68]. However, this requires the algorithms to be able to process large-scale data, which might be problematic in many cases [65], [69], [70]. A dataset is considered large-scale when the dimensionality of each data record or the number of records is large. For example, many medical images (e.g., retinopathy and histology) are considered high-dimensional data, as they usually contain millions of pixels. Deep learning methods [66] are known to be good at dealing with very large amounts of data if enough computational resources are provided, but they still behave poorly for high-dimensional data, because increasing the dimension leads to a huge growth in the number of trainable parameters. On the other hand, Gaussian Processes (GPs) [67] are not usually limited by the dimension of the data, if the covariance function is suitable. However, GPs have problems dealing with large amounts of data, as the computational complexity scales cubically with the number of data records [67]. Therefore, in practice, the state space formulation of GPs is often used to obtain linear computational complexity on temporal data [71]–[73].
In maritime scenarios, the type of data can vary a lot, which requires us to use different AI methods for different types of data. For example, AIS or GNSS data are typically low-dimensional, with observations typically accumulating at a rate of 1 Hz. With time, these data may become large. In this case, deep learning models, especially recurrent neural networks [74], might have better capabilities for the analysis of a ship’s trajectory. For maritime audio signals, the number of dimensions is large. For example, a 1 s audio clip sampled at 44.1 kHz has 44100 dimensions. Instead of the raw waveform, we could use, for example, the Short Time Fourier Transform (STFT) to transform the audio signal into the spectral domain and use the spectro-temporal representation (image) as data features [75], [76]. RGB or RGB-depth (RGB-D) images from camera(s) are challenging in both data dimension and quantity; however, deep learning, especially deep convolutional neural networks [77], has been successfully employed for this type of data.
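As a minimal sketch of the spectro-temporal representation mentioned above, the following code computes an STFT magnitude image with NumPy only. The frame length (1024 samples), hop size (512 samples), Hann window, and the synthetic 1 kHz test tone are our assumptions, chosen for illustration; a real system would tune them to the sounds of interest.

```python
import numpy as np


def stft_magnitude(signal: np.ndarray, n_fft: int = 1024, hop: int = 512) -> np.ndarray:
    """Magnitude STFT: rows are time frames, columns are frequency bins."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1))


fs = 44100                                # sampling rate in Hz
t = np.arange(fs) / fs                    # one second of samples
tone = np.sin(2.0 * np.pi * 1000.0 * t)   # synthetic 1 kHz tone (placeholder input)

spectrogram = stft_magnitude(tone)
peak_bin = int(np.argmax(spectrogram.mean(axis=0)))
print(tone.shape, spectrogram.shape)
print("peak frequency (Hz):", peak_bin * fs / 1024)
```

The resulting two-dimensional array can be treated as an image, which is what allows image-oriented models such as convolutional networks to be applied to audio.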
For deep learning methods, the predictive distribution is not easy to obtain, because complicated hierarchical neural network functions are involved in the integral needed to obtain the posterior. However, in many cases, Gaussian process methods give a closed-form solution for the predictive distribution [67].
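The closed-form predictive distribution of a GP, together with its cubic cost, can be illustrated with a small sketch of standard GP regression. The squared-exponential kernel, its hyperparameters, the noise variance, and the toy sine data are our assumptions for illustration only.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray,
               lengthscale: float = 1.0, variance: float = 1.0) -> np.ndarray:
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)


def gp_predict(x_train, y_train, x_test, noise_var: float = 1e-2):
    """Closed-form GP predictive mean and variance of the latent function."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    # The Cholesky factorisation of the n x n matrix is the O(n^3) step.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, np.diag(cov)


# Toy data: noise-free samples of a sine curve.
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_test = np.array([2.5, 50.0])            # one point near the data, one far away

mean, var = gp_predict(x_train, y_train, x_test)
print("predictive mean:", mean)
print("predictive variance:", var)
```

The predictive variance is small near the training data and returns to the prior variance far away from it, which is the kind of uncertainty information that is hard to obtain from a standard deep network.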
Another requirement for machine learning algorithms is the capability of online and offline learning. As shown in Figure 6, the computation might need to be done in real-time or offline depending on different situations. For example, while learning to classify the objects can be done offline, when we localise the maritime objects from camera or microphones the estimation has to be done in real-time. The online and offline characteristics define the way of learning from data and making prediction of a machine learning algorithm [78].
The system architecture for batch, online, and robust sensor processing. The batch training phase is typically done outside the vessel (e.g. in a computation cloud). The online learning and sensor fusion is done inside the vessel.
Online learning methods learn from sensor data on the fly; examples are Bayesian state-estimation methods [79] and online probabilistic machine learning methods [65], [72]. These methods are especially suited for learning dynamics and prediction models. In contrast, offline learning, by which we mean batch supervised machine learning methods, requires predefined training data. Such methods learn from the entire training dataset at once, and do not utilize any new data for refining the learnt model. Deep learning [66] and probabilistic machine learning [65] are typical offline machine learning methods. These offline methods are especially applicable to the identification and classification tasks in image and sound analysis.
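A Kalman filter is one common instance of Bayesian state estimation that processes measurements one at a time. The sketch below tracks position and velocity along a single axis from position fixes arriving at 1 Hz. The constant-velocity model, the noise parameters, and the simulated data are our assumptions and are not taken from [79].

```python
import numpy as np


def kalman_step(x, P, z, dt=1.0, q=0.01, r=25.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x: state [position, velocity], P: 2x2 state covariance,
    z: scalar position measurement, q: process noise intensity,
    r: measurement noise variance.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])
    Q = q * np.array([[dt ** 3 / 3.0, dt ** 2 / 2.0],
                      [dt ** 2 / 2.0, dt]])
    H = np.array([[1.0, 0.0]])

    # Prediction with the motion model.
    x = F @ x
    P = F @ P @ F.T + Q

    # Update with the new measurement.
    S = H @ P @ H.T + r                   # innovation variance, shape (1, 1)
    K = P @ H.T / S                       # Kalman gain, shape (2, 1)
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P


def run_filter(measurements, dt=1.0):
    """Process the measurements sequentially, as they would arrive on board."""
    x = np.array([measurements[0], 0.0])
    P = np.diag([100.0, 100.0])
    for z in measurements[1:]:
        x, P = kalman_step(x, P, z, dt)
    return x, P


# Simulated target moving at 5 m/s, observed at 1 Hz with 5 m noise (std).
rng = np.random.default_rng(0)
true_positions = 5.0 * np.arange(60)
noisy = true_positions + rng.normal(0.0, 5.0, size=true_positions.shape)

state, cov = run_filter(noisy)
print("estimated position and velocity:", state)
```

Each new measurement refines the estimate without revisiting earlier data, which is the property that distinguishes online from batch methods.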
Based on the above requirements, the selected machine learning algorithms need to take into account the following aspects: scale of data, uncertainty of prediction, and need for online learning.
B. AI for Maritime Self-Situational Awareness
The main task in achieving total situational awareness of a vessel is to present a safety and abnormality level analysis. We mainly refer to the abnormality level as the uncertainty of object identification and localisation. As shown in Figure 7, such information can be delivered by utilizing AI methods and fusing the data from maritime sensors, such as AIS/GNSS, images, and audio signals listed earlier in this article. If, for example, the attributes from detection and classification match the metadata from AIS messages, we may consider positive situational awareness to have been achieved. This forms a basis for the next stage, which is to track the identified neighbouring objects and compute the probability of collision using predicted trajectories.
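The matching of detections against AIS metadata can be sketched as a simple gated nearest-neighbour association. Everything in the sketch is a hypothetical simplification: the data classes, the local east/north coordinates in metres, the 200 m gate, and the requirement that the class label equal the AIS ship type are our assumptions, and the MMSI numbers are made up.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AisReport:
    mmsi: int
    east_m: float
    north_m: float
    ship_type: str


@dataclass
class Detection:
    east_m: float
    north_m: float
    label: str


def match_detection(det: Detection, reports: List[AisReport],
                    max_dist_m: float = 200.0) -> Optional[AisReport]:
    """Return the closest AIS report with a matching type inside the gate."""
    best, best_dist = None, max_dist_m
    for rep in reports:
        dist = math.hypot(det.east_m - rep.east_m, det.north_m - rep.north_m)
        if dist <= best_dist and rep.ship_type == det.label:
            best, best_dist = rep, dist
    return best


reports = [
    AisReport(mmsi=111111111, east_m=1000.0, north_m=2000.0, ship_type="cargo"),
    AisReport(mmsi=222222222, east_m=-500.0, north_m=800.0, ship_type="passenger"),
]

confirmed = match_detection(Detection(1050.0, 1980.0, "cargo"), reports)
unmatched = match_detection(Detection(3000.0, 3000.0, "cargo"), reports)
print(confirmed.mmsi if confirmed else None, unmatched)
```

A detection without a matching AIS report, such as a small boat without a transponder, would then be flagged with a higher abnormality level.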
Experiments on vision-based ship recognition conducted on board MS Megastar. The rectangular box gives the location and label of each detection. The number in parentheses is the confidence of the detection. Detection in these images is challenging, as they suffer from low brightness and poor exposure.
The application areas of AI methods in maritime navigation and vessel situational awareness are identified as object identification, localization, and trajectory analysis. We will also focus on the state of the art review for these tasks, and especially on the use of deep learning and Gaussian processes in them. The viability of such sub-task division and focusing on deep learning and Gaussian processes for the total maritime situational awareness systems is demonstrated by an industrial study in 2017 [12], where they propose to use deep learning and different sensor fusion to give vessel situational awareness. Other related demonstrative studies can also be found in, for example, [11], [13].
C. Maritime Object Detection and Classification
The term “objects” here refers to anything that manifests on the maritime landscape and is distinguishable from the background, for example, ships, sea birds, and motor boats. The aim is to detect or classify the objects in the sensor range of a vessel. Such tasks are usually performed using one or multiple cameras due to the advances of image processing techniques and implementation simplicity. Audio data can also be applied here for this purpose. Interestingly, we observed that it is actually very rarely studied for maritime applications. We argue that, due to the large proportion of background noise from the environment (e.g., rain, wind, sea waves, bird sounds, humans shouting, and own engine), using sound for the detection and classification of maritime objects would require additional signal conditioning. The main challenge of using image data for object detection is that the dimension of data is quite large.
Using classical CNNs directly is non-trivial, because classical CNNs are designed for image classification, not for detecting and classifying multiple objects in an image. In practice, a small sliding window is applied repeatedly to search the whole area of a large image. The drawback is that this requires a vast amount of computation, and the size of the sliding window must be determined beforehand. This is not always possible in the maritime scenario, because the size of an object in an image varies with its distance. The Region Proposal Network (RPN) [80] is a potential candidate for solving this problem. As shown in Figure 9, the main difference between a classical CNN and an RPN is the region proposal layer, which gives the potential areas where objects appear. This significantly reduces the search area for objects in an image. It is particularly useful when objects (such as a small boat) cover only a very small area in terms of pixels. Recent related studies on maritime object detection and classification using different sensors are listed in Table XII.
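The cost of exhaustive sliding-window search can be made concrete by counting how many windows a classifier would have to evaluate. The image size (1920 x 1080), window sizes, and stride rule in this sketch are hypothetical values chosen for illustration.

```python
def sliding_window_count(image_w: int, image_h: int, window: int, stride: int) -> int:
    """Number of square window positions that fit inside the image."""
    if window > image_w or window > image_h:
        return 0
    n_x = (image_w - window) // stride + 1
    n_y = (image_h - window) // stride + 1
    return n_x * n_y


# Hypothetical setup: one full-HD frame searched at several window sizes,
# because the apparent size of a vessel depends on its distance.
IMAGE_W, IMAGE_H = 1920, 1080
total = 0
for window in (32, 64, 128, 256):
    stride = window // 4
    count = sliding_window_count(IMAGE_W, IMAGE_H, window, stride)
    total += count
    print(f"window {window:3d} px, stride {stride:2d} px -> {count} evaluations")
print("total classifier evaluations per frame:", total)
```

A region proposal stage replaces this exhaustive enumeration with a much smaller set of candidate boxes of varying size.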
A general framework for ship detection using RPNs. Ship image is taken from Helsinki–Tallinn line by authors.
In addition, to show the effectiveness of a CNN-based vision detector for ships, we conducted an experiment on MS Megastar operating from Tallinn to Helsinki on January 20th, 2020. We obtained hundreds of ship images in resolution
D. Maritime Object Localisation by Sound
The benefit of using audio data for localisation is that the sensing can be omnidirectional with a proper microphone array setup, and the localisation algorithms are feasible once proper care is taken to eliminate background noise. One of the current state-of-the-art binaural sound localisation frameworks is based on the Head-Related Transfer Function (HRTF) [105], [106]. The idea is to reproduce how human ears receive and perceive sound, and to treat the human way of localising sound as a transfer function.
Instead of relying on hand-crafted features for Sound Source Localisation (SSL), a deep neural network can perform the task in an end-to-end way. The key questions are how to formulate the SSL problem as a neural network regression or classification problem and how to design or choose a suitable architecture. To our knowledge, the earliest work is from 1996 by [107], where the authors fed the differences in intensity and phase between the inputs to a three-layer neural network. Further studies [108]–[110] binarize the data labels rather than using real-valued data, as shown in Figure 10. Several studies have been conducted on wind noise reduction [109]–[117]. However, for a maritime scenario it is still an open problem. Sound in the maritime environment also has a tendency to travel with the wind, which may disturb the localisation of objects from sound data. Fortunately, information about the strength and direction of the wind might be available on board (e.g., from weather forecasts and wind sensors), and it can be included in the end-to-end training as a-priori knowledge to mitigate the wind problems.
Sound source localisation using neural networks. The idea is to encode the sound-source location (label) in few-hots binarized vector, where “1” represents sound manifestation.
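The binarized label encoding described in the caption can be sketched as follows. The discretisation of the azimuth into 36 bins of 10 degrees and the decoding threshold of 0.5 are our assumptions for illustration; the cited studies may use different resolutions.

```python
import numpy as np


def encode_directions(azimuths_deg, n_bins: int = 36) -> np.ndarray:
    """Encode the azimuths of active sources as a few-hot binary vector."""
    label = np.zeros(n_bins, dtype=np.int8)
    bin_width = 360.0 / n_bins
    for azimuth in azimuths_deg:
        index = int((azimuth % 360.0) // bin_width)
        label[min(index, n_bins - 1)] = 1   # guard against rounding at 360
    return label


def decode_directions(scores: np.ndarray, threshold: float = 0.5) -> list:
    """Turn per-bin network outputs back into bin-centre azimuths (degrees)."""
    bin_width = 360.0 / len(scores)
    return [float((i + 0.5) * bin_width) for i in np.flatnonzero(scores >= threshold)]


# Two simultaneous sources, e.g. one ahead to starboard and one astern.
label = encode_directions([5.0, 185.0])
print(label)
print(decode_directions(label.astype(float)))
```

With this encoding, localisation becomes a multi-label classification problem, so several simultaneous sources can be represented in a single target vector.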
E. Maritime Trajectory Analysis
Ship trajectories are part of the measured data used in situational awareness systems. Two of the most common sources of trajectory information are the ship’s AIS transponder and the ship’s own GNSS receiver. Through the GNSS receiver, the position of the vessel can be obtained. This position is then communicated to other ships via the AIS transponder, which in turn receives data containing the other ships’ positions. AIS systems are widely installed on ships for safety purposes, and, for other purposes, these data have also been recorded by ground stations into numerous publicly accessible datasets [118]. Recent studies on maritime trajectory analysis that use these data are listed in Table XIII.
Conclusion
This study of the state of the art in situational awareness for autonomous vessels is divided into three core areas: a consolidation of operational requirements from background literature, a review of relevant and available commercial sensors, and a review of applicable Artificial Intelligence (AI) algorithms for the sensor fusion.
A review of the relevant existing regulations and standards for autonomous vessels, as well as of global efforts in this domain, reveals that such activities involve several key industry players, national and international agencies, and regulators working in collaboration. Although IMO is a key driver towards standardizing operational requirements in this domain, there exist a few other references which recommend additional requirements at the hardware, software, and performance levels. This article consolidates the essential requirements. In general, we conclude that positioning requirements for autonomous vessels are in line with the requirements for dynamic positioning of ships. This means that the ship shall always have two independent position reference sources to ensure reliability and integrity; have a positioning error of less than 3 m; and be able to provide sensor system integrity information.
The paper considers a hardware sensor assembly designed around four sensor families: visual sensors, audio microphones, positioning receiver, and RADAR/LiDAR. This set of sensors was chosen in view of simplicity and brevity of the text, although this in no way restricts the use of additional sensors (some of which are listed in the paper). The paper describes state of the art in these sensors including recommendations on the most relevant requirements for each of them. The sensor assembly should provide an autonomous vessel with holistic situational awareness. This includes at minimum information about its own position and of objects and other vessels in its path.
The primary purpose of AI here is to fuse the sensor data resulting in vessel localization and situational awareness as well as monitoring the integrity of the sensor assembly. For this, deep learning and Gaussian processes are the state of the art. These two methods are ready for industrial deployment in the maritime scenario, and their performance in object classification, regression, and localisation problems is promising. The combination of deep learning and Gaussian processes with the sensor data has the potential to solve the maritime situational awareness tasks in future autonomous vessels.