Google Earth Engine Cloud Computing Platform for Remote Sensing Big Data Applications: A Comprehensive Review


Abstract:

Remote sensing (RS) systems have been collecting massive volumes of datasets for decades, the management and analysis of which are not practical using common software packages and desktop computing resources. In this regard, Google has developed a cloud computing platform, called Google Earth Engine (GEE), to effectively address the challenges of big data analysis. In particular, this platform facilitates processing big geo data over large areas and monitoring the environment for long periods of time. Although this platform was launched in 2010 and has proved its high potential for different applications, it has not been fully investigated and utilized for RS applications until recent years. Therefore, this study aims to comprehensively explore different aspects of the GEE platform, including its datasets, functions, advantages/limitations, and various applications. For this purpose, 450 journal articles published in 150 journals between January 2010 and May 2020 were studied. It was observed that Landsat and Sentinel datasets were extensively utilized by GEE users. Moreover, supervised machine learning algorithms, such as Random Forest, were more widely applied to image classification tasks. GEE has also been employed in a broad range of applications, such as Land Cover/Land Use classification, hydrology, urban planning, natural disasters, climate analyses, and image processing. It was generally observed that the number of GEE publications has significantly increased during the past few years, and it is expected that GEE will be utilized by more users from different fields to resolve their big data processing challenges.
Topic: Cloud Computing in Google Earth Engine for Remote Sensing
Page(s): 5326 - 5350
Date of Publication: 01 September 2020

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.
SECTION I.

Introduction

In recent years, there has been a significant increase in the number of remote sensing (RS) datasets acquired by various spaceborne and airborne sensors with different characteristics (e.g., spectral, spatial, temporal, and radiometric resolutions) [1]. This trend is expected to continue due to the availability of more open-access RS datasets and ongoing advancements in sensor, image processing, and computer vision technologies [2].

Working with petabytes of RS datasets is a challenging task and has its own special requirements. The challenges of big data processing and analysis can be divided into two categories: common and individual facets [3]. The common challenges relate to handling big data and include big data computing, big data collaboration, and big data methodologies. The individual challenges relate to the big data life cycle in different applications, such as appropriate data identification, data deployment, data representation, data fusion, as well as data visualization and interpretation. In order to provide a comprehensive solution that can meet a wide range of current and future challenges and requirements in RS applications, one of the most important steps is to develop a safe, efficient, and advanced cloud computing platform [3], [4].

Cloud computing platforms are efficient ways of storing, accessing, and analyzing datasets on very powerful servers, which virtualize supercomputers for the user. These systems provide infrastructure, platform, storage services, and software packages in a variety of ways for the customers [3], [4]. Several cloud computing platforms have so far been developed. For example, Amazon Web Services (AWS) is a pay-as-you-go platform, where users pay based on the hours that they use the services [2]. AWS has a dedicated cloud Earth Observation (EO) offering called “Earth on AWS” as part of its Public Dataset Program, which includes open data from several satellites such as Landsat-8, Sentinel-1, Sentinel-2, the China–Brazil Earth Resources Satellite program, and National Oceanic and Atmospheric Administration (NOAA) image datasets, as well as global model outputs. AWS also hosts open data supplied by DigitalGlobe with its SpaceNet challenges. Moreover, AWS hosts the largest suite of machine learning services [4]. Azure is another cloud computing platform, hosted by Microsoft. This platform has established the Artificial Intelligence (AI) for Earth initiative to facilitate the use of its AI tools for addressing environmental challenges in four main areas: climate, agriculture, biodiversity, and water. Azure contains only Landsat and Sentinel-2 products for North America since 2013, as well as moderate resolution imaging spectroradiometer (MODIS) imagery. Azure is also a pay-as-you-go platform which provides virtual systems for the users [5].

Google Earth Engine (GEE) is another cloud computing platform, which was launched by Google in 2010. GEE uses Google's computational infrastructure and available open-access RS datasets [6]. GEE is the most popular big geo data processing platform, facilitating the scientific discovery process by providing users with free access to numerous remotely sensed datasets [1], [2]. Users can access GEE via an internet-based Application Programming Interface (API) and a web-based Interactive Development Environment [2], [6]. Additionally, users do not need to have expertise in web programming or HyperText Markup Language to use GEE for different applications [6]. GEE offers automatic parallel processing on a fast computational platform to effectively deal with the challenges of big data processing [6], [7]. For instance, according to Hansen et al. [8], it took only 100 h to process 654 178 Landsat-7 images (about 707 terabytes) within GEE and produce a global map of forests. This was reported as a great achievement because, without GEE, this processing would have taken a million hours to complete. Furthermore, users do not need to download the datasets available within GEE in order to use them, nor install any software to perform the processing tasks existing in GEE. However, GEE users can utilize complementary software packages or process their own private datasets within this platform. This platform also contains various built-in algorithms, such as classification algorithms, to analyze data at a planetary scale and also helps scientists to develop their own algorithms with less effort than before [1], [2], [9].
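The scale of the Hansen et al. [8] example can be made concrete with a short calculation. The sketch below uses only the four figures quoted above; the derived quantities (speedup, throughput, average image size) are illustrative back-of-the-envelope values, and the conversion assumes decimal units (1 TB = 1000 GB), which the source does not specify.

```python
# Figures quoted in the text for the global forest map of Hansen et al. [8].
N_IMAGES = 654_178        # Landsat-7 images processed
DATA_TB = 707             # approximate data volume in terabytes
GEE_HOURS = 100           # reported processing time within GEE
SERIAL_HOURS = 1_000_000  # reported time without GEE

# Derived, illustrative quantities.
speedup = SERIAL_HOURS / GEE_HOURS        # how many times faster
images_per_hour = N_IMAGES / GEE_HOURS    # average throughput in images
tb_per_hour = DATA_TB / GEE_HOURS         # average throughput in terabytes
gb_per_image = DATA_TB * 1000 / N_IMAGES  # assumes 1 TB = 1000 GB

print(f"speedup: {speedup:,.0f}x")
print(f"throughput: {images_per_hour:,.0f} images/h, {tb_per_hour:.2f} TB/h")
print(f"average image size: {gb_per_image:.2f} GB")
```

Under these assumptions, the reported numbers correspond to a speedup of four orders of magnitude and an average image size of roughly 1 GB.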

As discussed, the remarkable capabilities of GEE provide unprecedented opportunities to employ this platform for big data processing and interpretation and, therefore, it is effectively employed in a broad variety of disciplines in all branches of Earth science studies. It is also expected that users will more frequently use this cloud computing service considering the trends of GEE studies in recent years. There are currently four GEE literature review studies, conducted by Gorelick et al. [6], Kumar and Mutanga [9], Mutanga and Kumar [10], and Tamiminia et al. [2], published between 2017 and 2020. Gorelick et al. [6] was the first comprehensive GEE review paper and was written by the main GEE developers. The authors comprehensively discussed different aspects of GEE, including data catalog, system architecture, functions, data distribution models, efficiency, along with several applications and challenges. Kumar and Mutanga [9] also briefly discussed the publication and authorship trends, datasets, study areas, and applications of GEE by reviewing 300 journal papers. Furthermore, Mutanga and Kumar [10] briefly discussed four main applications of GEE. More recently, Tamiminia et al. [2] also discussed various aspects of GEE by reviewing 349 journal papers. The authors provided comprehensive information about the GEE publications based on study areas, number of publications, datasets and products, functions, sensor type and resolutions, classification accuracies, and various applications.

There is still a need for a more comprehensive review to discuss various aspects of the GEE platform. Therefore, in this study, 450 journal articles along with peer-reviewed conference papers were investigated, and the results are presented in eight main sections: Section I provides an introduction to GEE; Section II provides an overview of the GEE platform; Section III presents different datasets included in this platform; Section IV discusses various GEE functions and algorithms; Section V provides comprehensive information about the advantages and limitations of GEE; Section VI analyzes the pattern of GEE publications over one decade; Section VII discusses different applications of GEE; and finally Section VIII provides several case studies, in which GEE was applied to process and analyze big data over large areas and within a long period of time.

SECTION II.

GEE Platform Overview

GEE is mainly composed of the following three platforms:

  1. Earth Engine (EE) Explorer;

  2. EE Code Editor;

  3. EE Timelapse.

The details of each platform are discussed in the following sections.

A. EE Explorer

EE Explorer (see Fig. 1) is a data viewer platform which allows users to access the massive datasets available in the EE Data Catalog. The Data Catalog houses millions of publicly available datasets, including a complete series of Landsat-4, -5, -7, and -8, MODIS, Sentinel-1, -2, -3, and -5P imagery, as well as several atmospheric, meteorological, and vector datasets, which will be further discussed in Section III. The Data Catalog receives approximately 4000 new datasets every day [11].

Fig. 1. Earth Engine Explorer platform. (a) Workspace. (b) Data Catalog.

As illustrated in Fig. 1, the EE Explorer is composed of the Workspace [see Fig. 1(a)] and the Data Catalog [see Fig. 1(b)]. In the Data Catalog, users can search among massive datasets and import them to the Workspace. In the Workspace, users can manage and visualize datasets. The Workspace also supports quick viewing, zooming, and panning. Additionally, it allows users to set visualization parameters, such as contrast, brightness, and opacity levels. To better inspect any changes over time, users can add multiple layers to the Workspace. Users can display the layers in a three-band RGB or a single-band grayscale/pseudocolor representation [6]. For example, Fig. 1(a) demonstrates a true color composite of a MODIS bidirectional reflectance distribution function (BRDF)-adjusted image.

B. EE Code Editor

While the EE Explorer platform is designed to visualize datasets, the EE Code Editor (see Fig. 2) is designed to process big data using the JavaScript programming language and to develop EE applications. According to Fig. 2, the EE Code Editor is composed of the following elements: Code editor, Map, Layer manager, Geometry tools, and several tabs, including Script, Doc, Assets, Inspector, Console, and Tasks.

Fig. 2. Overview of the Earth Engine Code Editor.

The central panel allows users to write their JavaScript code. GEE processes the written code and presents the results as images in the Map panel or as messages in the Console tab. Similar to the EE Explorer, users can set the visualization parameters via the Layer manager in the Code Editor (see Fig. 2). In the Script tab, numerous example scripts facilitate developing applications. There are more than 800 prebuilt functions (discussed in detail in Section IV) in the EE library, and users can become familiar with them using the Doc tab, which provides API reference documentation [6].

As previously mentioned, GEE includes big open-access datasets. Users, however, are not restricted to using only these datasets. They can upload and manage their own data using the Assets tab. It is also possible to interactively query the map using the Inspector tab. Finally, the Geometry tools allow users to draw geometric features, such as points, lines, and polygons, which can be used in further analyses [6].

C. EE Timelapse

GEE combines petabytes of RS datasets over four decades and produces a global, zoomable, and cloud-free video over space and time in its EE Timelapse platform [6]. The Timelapse platform is an example of the great computational power of the GEE platform. This platform provides the most comprehensive picture of the Earth, revealing how its residents are treating it. For instance, through EE Timelapse, one can easily observe the fast retreat of Mendenhall Glacier in Alaska, the decapitation of West Virginia mountains by the mining industry, forest loss in the Amazon, and the drying of Lake Urmia in Iran over time.

SECTION III.

GEE Datasets

As discussed, GEE contains an immense number of datasets, including raw datasets, preprocessed data, elevation models, and products at global, national, and regional extents. Table IV in the Appendix lists all available datasets within GEE along with a brief description of each. The datasets most frequently utilized by users are discussed in more detail below.

Landsat datasets are valuable resources for temporal analysis. The Landsat collection includes seven multispectral satellites: Landsat 1–3 (1972–1983), Landsat-4 (1982–1993), Landsat-5 (1984–2012), Landsat-7 (1999–present), and Landsat-8 (2013–present). Landsat satellites carry optical sensors, the images of which may be obscured by clouds. Therefore, cloud detection, masking, and removal are essential preprocessing steps in different applications, such as image classification using multitemporal imagery [12]. Additionally, the availability of the multitemporal Landsat datasets has facilitated national and global scale analysis [13]. Landsat-based datasets within GEE have been employed in various applications. For instance, Landsat data available in GEE have been widely utilized in generating Land Cover/Land Use (LCLU) maps (e.g., [14]–[16]). Moreover, urban detection and extraction is an important task in economic investigations due to rapid population growth. Therefore, several studies have utilized Landsat data in urban monitoring [17], [18].

GEE includes datasets acquired by the Sentinel satellites, developed by the European Space Agency (ESA). The Sentinel collection includes Sentinel-1 Synthetic Aperture Radar (SAR) (2014–present), Sentinel-2 multispectral (2015–present), Sentinel-3 Ocean and Land Color (2016–present), and Sentinel-5P Tropospheric Monitoring (2018–present) datasets. Sentinel-1 and Sentinel-2 have been extensively utilized by GEE users for different applications. Their 10 m spatial resolution makes it possible to analyze objects at a finer resolution than Landsat images allow. They can also simplify the training and validation steps in image classification tasks. Mandal et al. [19] applied Sentinel-1 SAR data to map rice and monitor its temporal changes. Additionally, Traganos et al. [20] estimated satellite-derived bathymetry (SDB) of three regions in the Aegean Sea using Sentinel-2 time-series analysis.

GEE includes MODIS images. MODIS has great potential for near-real-time (NRT) mapping of the ground surface at national and global scales. MODIS acquires images in 36 spectral bands, the spatial resolutions of which vary from 250 m to 1 km. MODIS time series are available in the GEE Data Catalog from 2000 to present, facilitating temporal analysis over the globe. Campos-Taberner et al. [21] conducted a temporal investigation of MODIS-based indices, including the global Leaf Area Index, Canopy Water Content, Fraction Vegetation Cover, and Fraction of Absorbed Photosynthetically Active Radiation.

SECTION IV.

GEE Functions

GEE provides various functions to perform spectral and spatial operations on either a single image or a batch of images. Different operations within the GEE platform, ranging from simple mathematical operations to advanced image processing and machine learning algorithms, are illustrated in Fig. 3. Various pixel-based spectral operations, which have high potential to be implemented in parallel on cloud architecture, are included in GEE. However, due to parallel implementation issues, GEE supports fewer spatial functions, such as Gaussian and Laplacian filters, edge detection methods (e.g., Sobel, Roberts, and Canny), line detection via the Hough Transform, and morphological operators (e.g., dilation and erosion). Moreover, GEE currently does not support several functions, including frequency domain algorithms (e.g., FFT and Wavelet), hierarchical algorithms (e.g., hierarchical clustering), graph-based methods (e.g., graphcut), geometric descriptors (e.g., Haar, SIFT, SURF), and physical-based models (e.g., radiative transfer models).
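To illustrate why spatial operators differ from pixel-based spectral operations, the following is a minimal pure-Python sketch of Sobel edge detection, one of the spatial functions named above. It is not GEE code: the function name `sobel_magnitude` and the choice to return only the interior ("valid") pixels are assumptions made for illustration. Each output value depends on a 3×3 neighborhood, which is what makes such operators sensitive to tile boundaries in a parallel setting.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for the interior pixels of a 2-D grid.

    `image` is a list of equal-length rows; the result has two fewer rows
    and columns because border pixels lack a full 3x3 neighborhood.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    value = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * value
                    gy += SOBEL_Y[dr][dc] * value
            out_row.append(math.hypot(gx, gy))
        out.append(out_row)
    return out


# A vertical step edge: dark columns on the left, bright on the right.
step_edge = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(step_edge)
```

For the step-edge grid, the response is strong in the two interior columns adjacent to the edge and zero in the uniform region.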

Fig. 3. Overview of different supporting functions within GEE.

Both supervised and unsupervised machine learning algorithms are accessible through the GEE library. For example, the classification and regression tree (CART), support vector machine (SVM), and random forest (RF) classifiers are among the supervised classification algorithms within GEE. Supervised classification methods require labeled samples to train the classifiers, and both sampling and training functions are available in GEE for this purpose. There are also many clustering algorithms in GEE, such as K-means. K-means is a popular clustering method in the data mining area. The algorithm requires users to define the number of clusters (K) and the stopping criteria [22]–[24]. Besides the original K-means, two modified versions of K-means (i.e., Cascade K-means [25] and X-means [26]), in which the number of clusters is estimated automatically, are available in GEE. Cobweb is another clustering algorithm, which handles data instances hierarchically. It constructs a classification tree and manages it through merging and splitting steps [27]. Simple noniterative clustering (SNIC) is another clustering-based segmentation method, which is initiated with randomly or manually determined seeds and generates segments [28]. SNIC is widely utilized by users to perform object-based image classifications (e.g., [29]).
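The two user-supplied inputs of K-means mentioned above, the number of clusters and the stopping criteria, can be seen in a minimal pure-Python sketch of the standard Lloyd iteration. This is not the GEE implementation: the function name `kmeans`, the deterministic initialization from the first K points, and the specific stopping criteria (`max_iter` and `tol`) are assumptions chosen for illustration.

```python
import math


def kmeans(points, k, max_iter=100, tol=1e-6):
    """Cluster `points` (a list of equal-length tuples) into `k` groups.

    Stops when no centroid moves by more than `tol`, or after `max_iter`
    iterations. Returns (centroids, labels).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Illustrative deterministic initialization: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # stopping criterion: centroids have settled
            break
    return centroids, labels


# Two well-separated groups of hypothetical two-band pixel values.
pixels = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
centers, assignment = kmeans(pixels, k=2)
```

Variants such as Cascade K-means and X-means differ mainly in that they also search over K instead of taking it as an input.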

As mentioned before, GEE contains over 40 years of datasets, facilitating temporal and change analyses. For temporal analysis purposes, several functions, such as continuous change detection and classification (CCDC) [30], exponentially weighted moving average change detection (EWMACD) [31], and Landsat-based detection of trends in disturbance and recovery (LandTrendr) [32], are available. CCDC fits harmonic functions to temporal data to detect points with significant variations. EWMACD fits a model to the training data and then evaluates the differences between the model and the observed data points using Shewhart X-bar charts and an exponentially weighted moving average. LandTrendr is specially designed for Landsat data and finds pixel-based spectral changes in temporal analysis. Vegetation analysis is also a popular subject in temporal analysis. Therefore, GEE has several algorithms, such as vegetation change tracker (VCT) [33] and vegetation regeneration and disturbance estimates through time (VERDET) [34], which are specifically developed for this purpose. VCT can automatically analyze Landsat time-series images to generate forest disturbance history. VERDET categorizes forest change into three types: disturbed, stable, and regenerating. The analysis is based on total variation regularization in the spatial and temporal domains [34].
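The control-chart idea behind EWMACD can be sketched in a few lines of pure Python. This is a simplified illustration, not the published algorithm or a GEE function: the "model" here is just the mean of the training period (EWMACD itself fits a model to the training data, as described above), and the function name `ewma_flags`, the smoothing weight `lam`, and the limit multiplier `L` are assumed, illustrative choices. The time-varying control limit is the standard EWMA chart formula.

```python
import math
from statistics import mean, stdev


def ewma_flags(series, n_train, lam=0.3, L=3.0):
    """Flag monitoring-period observations whose EWMA leaves the control limits.

    The first `n_train` values define the stable baseline (mean and standard
    deviation). Residuals of later values are smoothed with weight `lam`, and
    a value is flagged when |EWMA| exceeds L * sigma * sqrt(variance factor).
    """
    baseline = series[:n_train]
    mu = mean(baseline)        # simplified "model": the training mean
    sigma = stdev(baseline)    # spread of the training residuals
    z = 0.0                    # EWMA of residuals, started at zero
    flags = []
    for t, x in enumerate(series[n_train:], start=1):
        z = lam * (x - mu) + (1 - lam) * z
        factor = lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        flags.append(abs(z) > L * sigma * math.sqrt(factor))
    return flags


# Hypothetical NDVI-like series: eight stable values, then a sharp drop.
ndvi_series = [0.80, 0.82, 0.78, 0.81, 0.79, 0.80, 0.82, 0.78,
               0.81, 0.79, 0.40, 0.38, 0.41]
change_flags = ewma_flags(ndvi_series, n_train=8)
```

In this hypothetical series, the first two monitoring values stay within the limits, and every value after the drop is flagged as a change.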

SECTION V.

GEE Advantages and Limitations

GEE is a valuable tool in analyzing geospatial data that provides many capabilities for researchers, especially for the RS community. However, there are also several limitations that users should be aware of. The key advantages and limitations of GEE are summarized in Table I and discussed in more detail in the following section. As illustrated in Table I, the advantages and disadvantages of GEE are investigated within the four categories of cloud infrastructure, API, data, and functions.

TABLE I. Main Advantages of GEE Big Geo Data Processing Platform

A. Advantages

1) Cloud Infrastructure

GEE is a mostly free cloud-based service that removes the need to download and manage data locally [35]. It is built upon the Google cloud computing infrastructure, and computations are automatically handled by Google itself. All operations are automatically performed in bulk and in parallel on the Google CPUs and GPUs [6]. This automation hides the complexities of parallel computing from the user [17].

Since GEE was mainly created and optimized for geospatial data analysis, it can process petabytes of RS data over both large geographical extents and long time periods [17]. Thus, it is a great tool for regional, national, continental, and global-scale applications.

Besides the various datasets already available within GEE, researchers can easily upload and share their own datasets, as well as their scripts and models, through URLs [9]. Other maps and products are generated on-the-fly [28], [29] whenever a user runs the code [36], [37]. Additionally, there is no need to install third-party software packages, such as ENVI and ERDAS, because almost all of the required tools are already available on GEE [38].

GEE stores and analyzes RS imagery based on a pyramiding and tiling concept [39]. Every image ingested into GEE has its pyramid at different pixel resolutions [6]. Furthermore, every tool used in GEE processes images on 256×256 tiles. Thus, different scales of the pyramid are used at various zoom levels. This enables GEE to visualize large areas of processed imagery quickly and efficiently.
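The pyramiding and tiling concept can be illustrated with a short pure-Python sketch. The 256×256 tile size comes from the text above; the rule that each pyramid level halves the width and height of the previous one is an assumption made for this illustration, as is the function name `pyramid_levels`.

```python
import math

TILE = 256  # tile edge length in pixels, as stated in the text


def pyramid_levels(width, height, tile=TILE):
    """List (level, width, height, tiles_x, tiles_y) from full resolution
    down to the first level that fits in a single tile.

    Assumes each level halves the previous one (rounding up).
    """
    levels = []
    level, w, h = 0, width, height
    while True:
        tiles_x = math.ceil(w / tile)
        tiles_y = math.ceil(h / tile)
        levels.append((level, w, h, tiles_x, tiles_y))
        if tiles_x == 1 and tiles_y == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
        level += 1
    return levels


# A hypothetical 8192 x 4096 pixel image.
levels = pyramid_levels(8192, 4096)
for level, w, h, tx, ty in levels:
    print(f"level {level}: {w} x {h} px, {tx * ty} tiles")
```

At a coarse zoom level, a viewer only needs the few tiles of a high pyramid level rather than the hundreds of tiles at full resolution, which is why large areas can be displayed quickly.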

Fast filtering and sorting capabilities are provided within GEE, inherited from Google. This enables users to select their desired data out of millions of images based on various spatial and temporal specifications [40].
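The kind of spatial and temporal selection described above can be sketched with plain Python over image metadata. This is not the GEE API: the `Scene` record, the `select_scenes` function, the bounding-box representation, and the small `CATALOG` are all hypothetical and exist only to show the filtering and sorting logic.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scene:
    """Hypothetical metadata record for one image in a catalog."""
    scene_id: str
    acquired: date
    bbox: tuple          # (min_lon, min_lat, max_lon, max_lat)
    cloud_cover: float   # percent


def intersects(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def select_scenes(scenes, region, start, end, max_cloud=100.0):
    """Keep scenes inside [start, end) that overlap `region` and are not
    cloudier than `max_cloud`, sorted from least to most cloudy."""
    hits = [
        s for s in scenes
        if start <= s.acquired < end
        and intersects(s.bbox, region)
        and s.cloud_cover <= max_cloud
    ]
    return sorted(hits, key=lambda s: s.cloud_cover)


CATALOG = [
    Scene("A", date(2019, 6, 1), (10.0, 40.0, 12.0, 42.0), 5.0),
    Scene("B", date(2019, 7, 15), (11.0, 41.0, 13.0, 43.0), 60.0),
    Scene("C", date(2020, 1, 10), (11.0, 41.0, 13.0, 43.0), 1.0),
    Scene("D", date(2019, 8, 20), (50.0, 10.0, 52.0, 12.0), 0.0),
    Scene("E", date(2019, 9, 3), (11.5, 41.5, 12.5, 42.5), 12.0),
]

selected = select_scenes(
    CATALOG,
    region=(11.8, 41.8, 12.2, 42.2),
    start=date(2019, 1, 1),
    end=date(2020, 1, 1),
    max_cloud=20.0,
)
```

Here scene B is rejected for cloud cover, C for its date, and D for its location, leaving A and E in order of increasing cloud cover.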

2) API

GEE is combined with a powerful web-based programming interface. Users can easily access archived RS data through the JavaScript and Python APIs. The straightforward design of both APIs allows users to focus on the logic of data selection and a programmable workflow. Only a log-in is required to access all GEE power. An online code editor is also available to write scripts, debug them, and see the results immediately after running them.

Both the JavaScript and Python APIs provide access to the same set of EE objects and methods, except for a few methods which are capitalized differently (e.g., .and() versus .And()) [41], [42].

Most of the libraries in GEE are similar to existing open-source components, such as OpenCV and GDAL. Therefore, users need to learn few new concepts.

The Python API provides a programmatic and flexible interface to EE [41], [43]. It allows for automating batch processing tasks, piping EE-processed data to Python packages for postprocessing, and leveraging the power of the command line. Additionally, the Jupyter notebook interface of the Google Colaboratory platform delivers a highly interactive and collaborative experience and, as a hosted service, removes the burden of local system setup and management. In summary, the EE Code Editor is easy to set up and use, while the Python API is more flexible. Combining GEE and the Python API inside a Jupyter notebook provides the advantages of both to users.

Comparing JavaScript with Python based on [43], it can be argued that JavaScript makes it easy to get started and to share scripts, but it cannot share code between scripts. In contrast, Python makes it easy to share code between scripts and is easier to turn into a web application. Moreover, Python offers many plotting options, although these require some assembly and maintenance. Finally, the Code Editor enables users to store, share, and version-control their code in a behind-the-scenes git environment.

3) Datasets

As discussed in Section III, GEE contains a large catalog of RS, geophysical, and meteorological datasets. It contains most of the important time-series datasets in RS, including Landsat, MODIS, and Sentinel. Furthermore, combining different sources of imagery improves the temporal density of datasets and can strengthen fusion algorithms. Moreover, several NRT datasets are uploaded to GEE daily. If a dataset is not in the GEE Data Catalog, it can also be uploaded to the servers. Datasets are also downloadable, so that work can continue on a desktop workstation at any point of the workflow.

GEE stores datasets in their original projection with all original data and metadata. Resolutions are managed directly by the platform. Data are stored at their original resolution, but a pyramid of images is also constructed and stored beside every image, which is used at different zoom levels for the sake of efficiency. As mentioned, users can also easily search for their desired data using the tags provided within the data categorization, which is very well handled in GEE.

Several preprocessing steps have been already applied to the datasets and, thus, users can use corrected data besides raw data. For instance, the orthorectified, atmospherically corrected, and Calibrated Top of Atmosphere Landsat data are easily accessible apart from the raw data [44], [45]. Analysis-ready SAR datasets on GEE represent a significant step forward because SAR preprocessing is relatively complex (especially for regular users). For example, GEE hosts Sentinel-1 GRD data preprocessed with ESA's SNAP software [46].

GEE makes many derivative products available. Multiple popular spectral indices (e.g., NDVI) are already calculated. Since storage is more expensive than computation, most of these derivative products are computed on-the-fly upon users’ request.
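As an illustration of why such indices are cheap to compute on request, the sketch below evaluates the standard NDVI definition, (NIR − Red) / (NIR + Red), pixel by pixel in pure Python. It is not GEE code: the function names `ndvi` and `ndvi_image`, the list-of-rows band representation, and the choice to return `None` where both bands are zero are assumptions made for this example.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel.

    Returns None when both reflectances are zero (index undefined).
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


def ndvi_image(nir_band, red_band):
    """Apply `ndvi` pixel by pixel to two bands given as lists of rows."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical surface reflectance values for a 2 x 2 patch.
nir_band = [[0.50, 0.40], [0.20, 0.00]]
red_band = [[0.10, 0.10], [0.20, 0.00]]
patch = ndvi_image(nir_band, red_band)
```

Because each output pixel depends only on the two input values at the same location, the computation parallelizes trivially, which is consistent with computing the index on request instead of storing it.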

4) Functions

As discussed in Section IV, a large set of functions and algorithms is available within the GEE library for analyzing various datasets. All algorithms are parallel in nature and can automatically handle data management over servers.

Machine learning, image processing, vector processing, geometrical analysis, different visualizations, and multiple specialized algorithms are gathered into the GEE platform and enable users to implement their ideas. The GEE functions usually satisfy the needs of a typical scientific project. Additionally, users can always implement their own algorithms outside GEE and return the results for postprocessing. For instance, TensorFlow is a better option for deep learning tasks that require more complex models, larger training datasets, more input properties, or longer training times [47], [48]. TensorFlow models are developed, trained, and deployed outside EE [49]. For easier interoperability, the EE API provides methods to import/export data in TFRecord format [47]. This facilitates generating training and evaluation datasets in EE and exporting them to a format that can be ingested by a TensorFlow model.

A complete API reference and tutorial with runnable code examples are available for beginner to advanced users (e.g., [47]). The tutorials are detailed and cross referenced to each other to guide users through different applications and important notes. Outputs of these algorithms can be directly embedded in different applications.

B. Limitations

GEE limitations are relatively minor, but it is essential to be familiar with the constraints. Several main limitations of GEE are discussed in the following.

  1. Although data are kept private in the user's account, they are still stored on the servers of a private company, which is not acceptable for many governmental agencies and private companies [50].

  2. GEE-based image analysis is restricted to existing tools within the GEE API. For example, several standard image preprocessing methods (e.g., atmospheric correction techniques) are currently not implemented in GEE. Moreover, developing new tools is not trivial and requires knowledge about all GEE algorithms and their functionality along with performance considerations about cloud-based computing on Google servers.

  3. GEE is limited to selected data mining models for classification and regression. There are only a few classification and regression algorithms, such as CART, RF, and SVM.

  4. Image classification as one of the important applications of RS can be considerably improved by object-based image analysis. However, currently, there is not an efficient and accurate segmentation algorithm within GEE [1].

  5. One of the main approaches to improving classification accuracy is increasing the number of training samples or input features. However, users can employ only a limited number of samples and features within the classification methods [1], [16].

  6. Complex machine/deep learning algorithms which require large training datasets or longer training times cannot be run in GEE due to computational restrictions. Thus, users need to implement these algorithms outside of this environment [48].

  7. When trying to download processed data in the middle of the workflow for further analysis in a third-party software environment, users face a time-consuming process due to huge map size and internet speed limitations.

  8. Complex SAR phase data are not stored in GEE because they are not compatible with the tiling concept of the infrastructure [51]. This limits Polarimetric SAR and Interferometric SAR applications, which rely on the phase information.

SECTION VI.

GEE Pattern of Publications

In this study, 450 journal papers, published between January 2010 and May 2020, were assessed to depict the pattern of GEE publications. Several investigations, including keyword analysis, annual publication numbers, and geographical distribution, are provided in the following sections. Additionally, the top journals and conferences that have published GEE papers are discussed in Section VI-E.

A. Analysis Method

A meta-analysis was performed in Elsevier's Scopus (the largest abstract and citation database of peer-reviewed literature, covering over 5000 publishers) and Web of Science (formerly known as ISI Thomson) to provide a comprehensive picture of the literature trends in studies conducted using GEE. It is worth noting that conference articles and presentations were also reviewed during the course of this study; however, they were not included in the analysis because most of them had a relatively lower academic level or had later been converted to journal papers. Only the top conferences where GEE studies were presented are provided in Table III. The search queries "Google Earth Engine" and "GEE" were run against the titles, abstracts, and keywords of journal articles published from January 2010 to mid-May 2020. The EndNote software was then used to remove duplicate articles, which resulted in 462 peer-reviewed journal articles. Subsequently, 12 papers which discussed unrelated topics (e.g., using GEE for gaming development and analyzing the computational performance of GEE) were discarded. Finally, 450 journal articles were selected for further analyses.

B. Keyword Analysis

Fig. 4 illustrates a word cloud visualization based on the keywords in these GEE studies. The more frequently a term appears among the keywords, the larger it is drawn in the figure. As is clear, Google Earth Engine, Landsat, Remote Sensing, Sentinel-2, Random Forest, Cloud Computing, NDVI, Machine Learning, and Land Cover were the most frequently used keywords, in that order. For example, the Google Earth Engine keyword was utilized in 278 papers. The names of different satellites and machine learning algorithms are also widely used in GEE publications. Landsat, Sentinel-2, MODIS, Landsat-8, and Sentinel-1 are the satellites, and Random Forest is the classification method, that appear most frequently in the keyword lists of GEE journal papers. It was also observed that NDVI, Land Cover, Classification, and Urbanization were among the most used keywords, indicating the popularity of LCLU classification applications. Additionally, the Time Series, Change Detection, Climate Change, Land Cover Change, and Time Series Analysis keywords were frequently utilized in the GEE publications, indicating the importance of the archived open-access remote sensing datasets in change detection studies. Furthermore, multiple journal publications used China, United States, and Africa in the keywords, demonstrating the leadership of the corresponding countries and regions in utilizing GEE in their studies.

Fig. 4. - Word cloud of the keywords from the GEE journal articles.
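
The keyword counting behind a word cloud such as Fig. 4 can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function name, the case-insensitive matching, and the three keyword lists are assumptions made for the example.

```python
from collections import Counter


def keyword_frequencies(keyword_lists):
    """Count in how many papers each keyword appears (case-insensitive)."""
    counts = Counter()
    for keywords in keyword_lists:
        # A set prevents counting the same keyword twice within one paper.
        unique = {kw.strip().lower() for kw in keywords if kw.strip()}
        counts.update(unique)
    return counts


# Hypothetical keyword lists for three papers (not taken from the review).
papers = [
    ["Google Earth Engine", "Landsat", "Random Forest"],
    ["google earth engine", "Sentinel-2", "NDVI"],
    ["Landsat", "Google Earth Engine ", "Time Series"],
]

freq = keyword_frequencies(papers)
top = freq.most_common(2)  # the two most frequent keywords with their counts
```

In a word cloud, the font size of each term would then be scaled by its count in `freq`.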

C. Annual Publication Numbers

The number of publications related to GEE per year is provided in Fig. 5. The first peer-reviewed journal paper was published in 2011 by Keller et al. [52] in PFG - Journal of Photogrammetry, Remote Sensing and Geoinformation Science. This study investigated the automated generation and presentation of historical 4-D city models. As is clear from Fig. 5, a substantial increase in the number of GEE publications was observed from 2017, when Gorelick et al. [6] (GEE developers) comprehensively discussed the utility of this cloud computing platform, particularly for RS applications. Additionally, the growth in the number of GEE publications is accelerating. For instance, 35 journal articles were published within the last 1.5 months of the study period (April 1–May 15, 2020).

Fig. 5. - Number of journal articles, which utilized GEE.

D. Geographical Distribution

The study areas of the peer-reviewed GEE journal papers were investigated to provide a picture of the geographic distribution of GEE studies. Fig. 6 illustrates this distribution after removing ten papers that were not tied to any study area (e.g., literature review papers and papers related to the theoretical aspects and development of the GEE platform). Additionally, 35 studies conducted at continental scales (7 papers covered the whole of Africa and 1 paper the whole of Europe) and at the global scale (27 papers) were not considered in the country-based enumeration. Furthermore, if a study was conducted over several countries, each of them was counted separately. Moreover, studies at sub-country scales (e.g., a small study site, city, or province) were counted toward the corresponding countries. Overall, GEE studies were conducted over 138 countries. The highest numbers of GEE studies were conducted over the United States (97 articles), China (96 articles), Brazil (29 articles), Canada (25 articles), and India (25 articles), followed by Australia (19 articles) and Indonesia (15 articles). On the continental scale, 37.5%, 24%, 18.5%, 9%, 7.5%, 3.5%, and 0% of studies were conducted over Asia, North America, Africa, Europe, South America, Australia, and Antarctica, respectively. Of the studies conducted over Asia, 38% were related to China.

Fig. 6. - Number of GEE studies conducted over each country.

E. Journals and Conferences

Table II lists the top journals in which GEE studies have been published. The 450 journal papers were published in 150 journals, 95 of which published only one GEE paper. Based on the results, Remote Sensing, Remote Sensing of Environment, ISPRS Journal of Photogrammetry and Remote Sensing, and International Journal of Applied Earth Observation and Geoinformation were the top four journals, with 126, 61, 14, and 12 papers, respectively.

TABLE II Top 10 Journals Publishing GEE-Related Articles Along With the Number of Publications Per Journal

Table III lists the conferences at which GEE studies have been most frequently presented. GEE studies are among the most presented research topics at prominent international RS conferences, such as the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives), the International Geoscience and Remote Sensing Symposium (IGARSS), and the Proceedings of SPIE, The International Society for Optical Engineering.

TABLE III GEE Studies Presented at the Top Eight Conferences Along With the Number of Presented Articles

SECTION VII.

GEE Applications

The 450 selected GEE journal articles were studied to identify the main application disciplines. Seven review articles were first removed from the analysis. The remaining 443 journal papers were then divided into 10 categories, as illustrated in Fig. 7 along with several keywords describing each category. Articles that covered more than one application were assigned to the most relevant category after an in-depth review of the paper. It is worth noting that although only journal articles were used to define the main disciplines, other types of publications (e.g., conference papers) were observed to correspond well with the application types considered in this study.

Fig. 7. - GEE applications (LC: Land Cover).

Fig. 8 illustrates the number of journal articles related to each application provided in Fig. 7. The highest number of contributions was in the Vegetation category with 90 papers, followed by 77 papers in Agriculture, 68 papers in Hydrology, 53 papers in Land cover, 40 papers in Urban, 40 papers in Natural disaster, 31 papers in Atmosphere and climate, 17 papers in Image processing, and 14 papers in Pedosphere. Moreover, 13 papers that were not related to any of these application types, or whose topics were too few in number to justify a new category, were assigned to the Others category.

Fig. 8. - Number and percentage of journal papers related to GEE applications, published in each discipline provided in Fig. 7.
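
As a quick consistency check, the per-category counts quoted in the text can be summed and converted to percentages. The sketch below uses only the counts stated above; the rounding to one decimal place is an assumption, and the resulting shares are recomputed here rather than read from Fig. 8.

```python
# Paper counts per application category, as quoted in the text.
counts = {
    "Vegetation": 90,
    "Agriculture": 77,
    "Hydrology": 68,
    "Land cover": 53,
    "Urban": 40,
    "Natural disaster": 40,
    "Atmosphere and climate": 31,
    "Image processing": 17,
    "Pedosphere": 14,
    "Others": 13,
}

total = sum(counts.values())  # should match the 443 non-review papers

# Share of each category in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name:24s} {counts[name]:3d} papers  {share:5.1f}%")
```

The counts add up to the 443 papers that remained after the seven review articles were removed.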

In the following sections, more information about each of the GEE applications is provided along with several case studies.

A. Vegetation

Vegetation (e.g., forest, grassland, rangeland, and shrub) is one of the most vital components of the Earth's biosphere because it serves critical functions for both humans and the environment [53]. Vegetation also plays an important role in many biochemical cycles that directly or indirectly interact with water, soil, and air [54]. Such cycles are important for global vegetation pattern and climate studies and, thus, vegetation is also important for biodiversity conservation and climate change mitigation [55]. Moreover, vegetation is the primary means of converting carbon dioxide to oxygen, enabling aerobic metabolism on the globe [56]. Considering these important services, it is necessary to monitor the current state and dynamics of various vegetation types. GEE leverages cloud computing services for long-term monitoring of vegetation cover. Furthermore, the publicly available RS data within GEE enable researchers to employ this platform for vegetation monitoring at various spatial scales. In particular, the availability of several vegetation indices in GEE allows vegetation studies to be conducted quickly and efficiently. GEE has been widely used for vegetation mapping [57], [58], vegetation dynamics monitoring [59], [60], deforestation [61], [62], vegetation and forest expansion [63], [64], forest health monitoring [65], [66], forest mapping [67], [68], pasture monitoring [49], [69], and rangeland assessment [70], [71]. For instance, the full archive of Landsat imagery was processed within GEE to map the vegetation dynamics from 1988 to 2017 in Queensland, Australia [59]. Field observations were used to evaluate the performance of the proposed algorithm, and an overall accuracy of 82.6% was reported. The study concluded that GEE was suitable for large-scale and long-term vegetation monitoring and reported an approximately 20% decrease in the vegetation cover in the study area. The authors emphasized the high computational efficiency of GEE compared with performing the same analysis using traditional methods. In another study, an algorithm was developed within GEE by employing spectral mixture analysis to detect degradation and deforestation in the Brazilian state of Rondônia [62]. To this end, archived Landsat images from 1990 to 2013 were used. All the required processing steps were performed within GEE to produce annual forest disturbance maps. Landsat data were transformed into spectral endmember fractions, which were used to calculate the Normalized Degradation Fraction Index. The presented method obtained producer's accuracies of 68.1% and 85.3% for the degradation and deforestation maps, respectively.
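
To make the index computation concrete, the sketch below evaluates the Normalized Degradation Fraction Index (NDFI) from endmember fractions. It assumes the definition commonly given in the literature, NDFI = (GV_shade - (NPV + Soil)) / (GV_shade + NPV + Soil) with GV_shade = GV / (1 - Shade) and fractions in the range 0 to 1; the exact formulation and endmembers used in [62] are not described here, and the sample fractions are hypothetical.

```python
def ndfi(gv, npv, soil, shade):
    """NDFI from endmember fractions given in the range 0..1.

    gv: green vegetation, npv: non-photosynthetic vegetation,
    soil: bare soil, shade: shade fraction (assumed definition, see lead-in).
    """
    if shade >= 1.0:
        raise ValueError("shade fraction must be below 1")
    gv_shade = gv / (1.0 - shade)  # shade-normalized green vegetation
    denom = gv_shade + npv + soil
    if denom == 0:
        return 0.0  # no vegetation or soil signal at all
    return (gv_shade - (npv + soil)) / denom


# Hypothetical pixels: a closed-canopy forest and a partly cleared stand.
intact = ndfi(gv=0.60, npv=0.0, soil=0.0, shade=0.40)
degraded = ndfi(gv=0.30, npv=0.15, soil=0.15, shade=0.40)
```

Under this definition, values close to 1 correspond to intact canopy, and the index drops as soil and non-photosynthetic vegetation become visible.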

B. Agriculture

Mapping and monitoring croplands and plantations are essential for food security. Food security is one of the most significant issues of the current era and, thus, the Food and Agriculture Organization (FAO) has set the goal of achieving food security around the globe [72], [73]. Agricultural products not only play a vital role in human life but are also critical from an economic perspective. Therefore, agriculture can be considered a source of livelihood and a contributor to national revenue [74]. Moreover, policy-makers and governments need to monitor agricultural products to support economic growth and national self-sufficiency [72]. RS datasets allow frequent and cost-effective monitoring of croplands and plantations. GEE hosts extensive publicly available RS datasets that can be effectively used for studying the productivity, quality, profitability, and sustainability of agricultural production. Researchers have applied GEE to plantation mapping and monitoring [75], [76], phenology-based classification [77], [78], cropland mapping [79], [80], crop condition monitoring [81], [82], crop yield estimation [83], [84], irrigation mapping [85], [86], and other agricultural studies [87], [88]. For example, seasonal median composites of Sentinel-1 and Sentinel-2 were calculated in GEE to predict the maize yield in Kenya and Tanzania [83]. The RF algorithm produced maize/non-maize maps in Kenya and Tanzania with overall accuracies of 63% and 79%, respectively. Finally, satellite observations along with gridded soil datasets were ingested into a scalable harmonic regression to estimate maize yield. Moreover, multitemporal Landsat-8, Landsat-7, and Sentinel-2 imagery were employed to calculate composite NDVI images for winter cropland mapping in an area of over 200 000 km² [77]. Then, the multitemporal NDVI curve was fed into a CART algorithm to produce a phenology-based map of winter cropland with an overall accuracy of 96.22%. The authors reported that the lack of remote sensing images with high temporal frequency in GEE was one of the limitations of their work and, thus, suggested using Chinese GaoFen satellite data, with a four-day revisit time, for future cropland classifications.
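
The idea of a composite NDVI curve can be illustrated for a single pixel as follows. NDVI is computed as (NIR - Red) / (NIR + Red); the use of a maximum-value composite per period, the period names, and the reflectance values are assumptions for this sketch and are not taken from [77].

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one observation."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def max_value_composite(observations):
    """Highest NDVI among the (nir, red) pairs of one pixel in one period."""
    values = [ndvi(nir, red) for nir, red in observations]
    return max(values) if values else None


# Hypothetical surface reflectances for one pixel, grouped by period.
series = {
    "autumn": [(0.30, 0.10), (0.25, 0.15)],
    "winter": [(0.20, 0.16)],
    "spring": [(0.50, 0.05), (0.45, 0.09)],
}

# One composite value per period forms the multitemporal NDVI curve.
curve = {period: max_value_composite(obs) for period, obs in series.items()}
```

A curve of this kind, one value per period and pixel, is the sort of feature vector that a decision-tree classifier such as CART could take as input.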

C. Hydrology

Water is an essential element for life, whether in liquid form (e.g., lakes, reservoirs, and rivers) or in solid form in the cryosphere (e.g., snow, ice, and glaciers) and, thus, obtaining reliable information about water resources is a necessity. In addition, monitoring inland, coastal, and arctic water resources is beneficial in climate change studies [89]. Moreover, investigating the size and behavior of glaciers along with the amount of snow ablation can provide supporting information about cryosphere–atmosphere interactions and climate change [90], [91]. Furthermore, drought and flood disasters are associated with the dynamics of water resources [92]. Therefore, persistent and precise monitoring of all types of water resources is vital. The publicly available datasets within GEE, along with its high computing performance, allow for accurate monitoring of water resources with adequate temporal and spatial resolutions. Consequently, GEE has been efficiently employed for surface water dynamics monitoring [93], [94], bathymetry [20], [95], shoreline and coastal studies [96], [97], lake and reservoir mapping and monitoring [98], [99], glacier studies [90], [100], snow ablation and snow mapping [92], [101], suspended sediments and river studies [102], [103], and water health assessment [104], [105]. For instance, Nguyen et al. [93] introduced a fully automatic method for water extraction in New Zealand. Landsat-8 images acquired between 2014 and 2018 were processed in GEE to map lakes and reservoirs using an Automatic Water Extraction Index, with an overall accuracy of 85%. In a different study, GEE was used to combine MODIS fractional snow cover with Sentinel-1 wet snow mask data to develop an algorithm for producing monthly wet-dry snow maps [92]. In this study, a period of 2.5 years was analyzed in the Indian Himalayan region, covering around 55 000 km². It is worth noting that the underestimation of the wet snow area was corrected using a DEM. In another study, the blue and green bands of Sentinel-2 were processed to develop an empirical model for satellite-derived bathymetry maps [20]. In this regard, cloud masking, sun glint correction, radiometric calibration, and normalization were performed within GEE at three sites of the Aegean Sea in the Eastern Mediterranean. Finally, based on 9818 reference points, the proposed approach achieved an R² of 0.9 and an RMSE of 1.67 m. The authors argued that the GEE time-out error was the main limitation of their work, because their empirical method required estimating the regression between the image composite values and water depth over a large region and a long period of time.
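
The accuracy measures quoted for the bathymetry model (R² and RMSE of a regression between image values and water depth) can be computed as in the sketch below. The single predictor (a hypothetical blue/green band-ratio value), the ordinary least-squares fit, and the four sample points are assumptions for illustration; they do not reproduce the model in [20].

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical band-ratio values and reference depths in meters.
ratio = [1.00, 1.02, 1.04, 1.06]
depth = [2.0, 4.1, 5.9, 8.0]

slope, intercept = fit_line(ratio, depth)
predicted = [slope * r + intercept for r in ratio]
r2, rmse = r2_and_rmse(depth, predicted)
```

In practice, the fit and the evaluation would use separate calibration and validation points rather than the same samples as in this toy example.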

D. Urban

Urban areas are regions with concentrated populations and human infrastructure, and they usually expand over time in pursuit of better livelihoods. These regions have become the central point of economic, social, cultural, and recreational activities, as well as resource consumption [106], [107]. Therefore, urban areas can be considered the primary place where humans interact with the surrounding environment. The environment and urban areas affect each other mutually: environmental changes can influence human life, while unrestricted urban growth causes severe damage to natural resources and can negatively alter the atmosphere and climate [108], [109]. Conducting urban studies is therefore essential to support sustainable development. In this regard, RS datasets enable the quantification and in-depth analysis of urban dynamics, which are fundamental for devising suitable approaches to urban development and urban planning [110]. GEE supports long-term monitoring of urban conditions to effectively study the urban environment from different aspects. Urban expansion and extent mapping [18], [111], urban morphology and local climate zone monitoring [112], [113], urban 4-D modeling [52], urban green space classification [114], [115], and urban temperature and urban heat island identification [17], [110] are some of the main urban studies conducted within GEE. For instance, Ravanelli et al. [17] studied the long-term monitoring of the Surface Urban Heat Island (SUHI) and its relation to urban land cover changes over six metropolitan areas of the United States. More than 6000 Landsat images acquired between 1992 and 2011 were analyzed using Detrended Rate Matrix analysis to relate land cover change to SUHI. It was reported that GEE was the best solution for their application in terms of time, cost, and computational efficiency. The results revealed a definite increase in SUHI due to urban growth. Moreover, Gong et al. [18] investigated urban expansion dynamics by producing annual global maps of artificial impervious surfaces, which are predominant indicators of human settlement. To this end, the full archive of Landsat satellite data between 1985 and 2018 was processed within GEE. Sentinel-1 SAR data and nighttime images were also used to improve the final results in arid areas. The implementation of the Exclusion–Inclusion algorithm combined with a temporal consistency check within GEE yielded an overall accuracy of over 90% in mapping annual global impervious surfaces.

E. Land Cover

The dominant land cover types of a region determine the terrestrial surface characteristics of that area. Vegetation, water, and soil are the main land cover types spread across the globe. These land cover types form the environmental conditions for the habitats of various flora and fauna [116], [117]. Furthermore, the distribution of land cover defines the physical interaction between the Earth's surface and the surrounding environment. Recognizing the significant impacts of land cover on the environment, investigating its current condition, and monitoring its long-term dynamics are essential for sustainable development, climate change modeling, biodiversity studies, and natural resource monitoring [118], [119]. GEE hosts an enormous number of publicly available RS datasets with various spectral and spatial resolutions, which have been used for land cover mapping [14], [120], land cover dynamics monitoring [121], [122], coastal mapping [123], [124], and wetland classification [125]. For instance, an automatic land cover mapping method was developed within GEE by integrating Landsat imagery and the RF algorithm over the north of China [14]. The reference samples were collected from MODIS land cover products, with the International Geosphere-Biosphere Program scheme in ten classes, by applying pixel and spectral filtering rules. Monthly and percentile features were evaluated separately, and the best result was obtained with the monthly features, achieving over 80% accuracy. In another study, GEE was used to produce a sharpened land cover map over Mato Grosso, Brazil [15]. The proposed algorithm (BULC-U) fused the 300 m GlobCover product with Landsat imagery to produce a 30 m land cover map. In this regard, Landsat images were segmented, and then the ISODATA algorithm was applied to generate an unsupervised map with 20 clusters. Finally, the unsupervised classification result was fused with the GlobCover product. More recently, Ghorbanian et al. [126] produced an improved version of the land cover map of Iran using Sentinel-1/2 imagery within GEE. They also proposed an automatic workflow to update this map every year using migrated samples, without the need to collect additional in situ data.

F. Natural Disaster

Extreme and unexpected phenomena caused by the natural processes of the Earth are called natural disasters. These events bring destruction to the surrounding environment and human life [127]. In-depth research should be carried out to investigate the characteristics and behavior of these phenomena and, consequently, to reduce the amount of damage. The importance of geospatial data for monitoring and damage assessment of natural disasters is undeniable [128]. The long-term and NRT publicly available RS datasets within GEE, along with its high-performance computing, make this cloud-based platform well suited to monitoring, forecasting, prevention, vulnerability, and resilience studies of natural disasters. In particular, GEE has been utilized for drought monitoring [129], [130], flood mapping and flood risk assessment [131], [132], wildfire severity mapping [133], [134], landslide analyses [135], hurricane studies [136], and tsunami studies [137]. For instance, MODIS and meteorological datasets were employed within GEE to study the temporal and spatial variations of drought events in the Potohar Plateau of Punjab, Pakistan, between 2000 and 2015 [129]. In this regard, the standardized precipitation index, standardized precipitation-evapotranspiration index, vegetation condition index, precipitation condition index, soil moisture condition index, and temperature condition index were used for drought monitoring. In addition, 44 dual-polarized Sentinel-1 GRD images were employed within GEE to develop an operational methodology for rapid flood inundation mapping in Bangladesh [131]. Moreover, a potential flood damage map was generated to support efficient decision making. The proposed method obtained an overall accuracy of 96.44% based on 4500 reference samples. Finally, a preflood Landsat-8 image was used to generate a land cover map for further estimation of flood damage to cropland and rural settlements. It was reported that the algorithm developed within GEE could be used for cost-efficient land cover monitoring because open-access Landsat datasets are regularly ingested into GEE. In a different study, very high-resolution oblique images were processed within GEE to detect irregularities in façade and rooftop areas caused by hurricane events [136]. First, a vertical building map was produced from a temporal analysis of predisaster images through an edge-based/knowledge-based approach. Then, pre- and postdisaster images were fused at the data level, followed by spectral-only and geospectral classification using the RF algorithm. The results showed a significant reduction in false-positive errors.

G. Atmosphere and Climate

As a principal component of the Earth system, land interacts with the atmosphere through biophysical and biochemical processes [138]. Constant population growth and human activities result in significant changes in the atmospheric constituents [139]. Climate change and air pollution are two major consequences of these disturbances that directly impact the surrounding environment and human health [140], [141]. Therefore, it is essential to monitor and control air quality and climate conditions to avoid severe outcomes. The availability of climate products accompanied by surface products within GEE makes this platform a valuable tool for climate studies and air quality monitoring. These advantages have created a rising interest in the research community in using GEE for air pollution analyses [142], [143], climate change and monitoring [144], [145], biophysical variable studies [21], [146], evapotranspiration estimation [147], and precipitation mapping [148]. For instance, GEE was employed to map exposed mine waste areas to estimate the corresponding emission of particulate matter to the atmosphere [142]. Four benchmark years (1990, 2000, 2010, and 2018) were studied as part of Canada's Air Pollutant Emission Inventory. Landsat-5, Landsat-8, Sentinel-1, and Sentinel-2 satellite data were used to map exposed mine waste through an RF algorithm. Finally, the authors reported that GEE was an invaluable platform for monitoring long-term emissions from exposed mine waste. Furthermore, GEE was used along with version 1 Tropical Rainfall Measuring Mission (TRMM) precipitation products to study the spatial and temporal patterns of precipitation in the Zambezi River basin [148]. To this end, TRMM data from 1998 to 2017 were processed in GEE to investigate the precipitation trends and magnitudes using Kendall's correlation and Sen's slope estimator, respectively. A “dry gets drier, wet gets wetter” pattern was observed and reported in the study region.
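
The two trend statistics mentioned for the precipitation study can be sketched for a single time series as follows. This is a minimal illustration assuming equally spaced annual values and Kendall's tau without tie correction; the function names and the rainfall totals are hypothetical and are not taken from [148].

```python
from statistics import median


def mann_kendall_s(values):
    """Mann-Kendall S: number of increasing pairs minus decreasing pairs."""
    s = 0
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference
    return s


def kendall_tau(values):
    """Kendall's tau against time (tau-a, no correction for ties)."""
    n = len(values)
    return mann_kendall_s(values) / (n * (n - 1) / 2)


def sens_slope(values):
    """Sen's slope: median of all pairwise slopes, per time step."""
    n = len(values)
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Hypothetical annual precipitation totals (mm) for one grid cell.
rain = [800, 790, 805, 780, 770, 775]

s_stat = mann_kendall_s(rain)
tau = kendall_tau(rain)
slope = sens_slope(rain)  # mm per year
```

The sign of tau indicates the direction of the trend, while Sen's slope gives its magnitude in a way that is less sensitive to outliers than a least-squares fit.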

H. Image Processing

In the current era, almost all EO platforms are equipped with digital sensors and, thus, terabytes of data are generated and stored in digital formats every day. As discussed, GEE hosts an immense number of digital images. RS images are extensively utilized in various applications and for different purposes. Therefore, digital image processing algorithms need to be developed and enhanced to efficiently exploit the potential of these images. Moreover, since the quality of the input data directly affects the final accuracy of a study, image processing is a necessity. Precision, level of automation, reliability, computational complexity, and processing time are the most critical criteria in developing image processing algorithms [149], [150]. Therefore, to ensure high-quality results, it is necessary to develop new image processing algorithms and enhance the existing ones within GEE. In this regard, researchers have employed GEE to develop various efficient and useful image processing algorithms, such as cloud masking [12], [149], data selection and enhancement [13], [150], image-based sensor calibration [151], [152], and training sample migration [153]. For instance, Kong et al. [150] introduced the weighted Whittaker with dynamic parameter (wWHd) denoising method within GEE to reconstruct vegetation phenology based on 500 m MODIS EVI products. A large number of reference samples were used to compare the proposed method with four well-known denoising methods. The results, in terms of RMSE, roughness, and computational efficiency, revealed the superiority of the proposed method. Furthermore, Li et al. [13] developed an algorithm to improve GEE's processing to efficiently acquire large-scale cloud-free Landsat images to support further applications. This method comprises cloud and shadow masking, snow/ice masking, and removal of low-quality pixels by incorporating the quality band. Therefore, this method can efficiently prepare high-quality data for each region of interest. The authors noted that their algorithm was developed within GEE, and that the open-access codes within this platform provided a simple framework with a flexible, user-friendly interface. Finally, Kakooei et al. [154] proposed a global Sentinel-1 foreshortening mask to improve the reliability of SAR-based analysis.
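
Masking with a quality band, as described for the cloud-free Landsat workflow, typically comes down to testing individual bits of a per-pixel quality value. The sketch below shows the principle for one pixel; the bit positions are hypothetical placeholders (the real layout must be taken from the QA band documentation of the product used), and the median composite of the remaining observations is an assumption rather than the method of [13].

```python
from statistics import median

# Hypothetical bit positions, for illustration only. The actual positions
# depend on the product and must be read from its QA band documentation.
CLOUD_BIT = 3
SHADOW_BIT = 4
SNOW_BIT = 5


def is_clear(qa_value, bits=(CLOUD_BIT, SHADOW_BIT, SNOW_BIT)):
    """True if none of the given flag bits is set in the quality value."""
    return all(((qa_value >> bit) & 1) == 0 for bit in bits)


def clear_median(observations):
    """Median reflectance of the clear observations of one pixel.

    observations: list of (reflectance, qa_value) pairs.
    Returns None if every observation is flagged.
    """
    clear = [value for value, qa in observations if is_clear(qa)]
    return median(clear) if clear else None


# Hypothetical time series for one pixel: (reflectance, quality value).
pixel = [
    (0.45, 0b001000),  # cloud flag set, bright outlier
    (0.12, 0),
    (0.10, 0b010000),  # shadow flag set, dark outlier
    (0.14, 0),
    (0.13, 0),
]

composite = clear_median(pixel)
```

Applied to every pixel of a region of interest, this kind of filtering followed by a per-pixel reduction yields a composite that is largely free of clouds, shadows, and snow.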

I. Pedosphere

The Pedosphere is the outermost layer of the Earth, which dynamically interacts with the Biosphere and atmosphere [155]. Monitoring and studying the Pedosphere and its components (e.g., soil, geology, and geomorphology) are prerequisites for sustainable development, especially in the climate modeling context [156]. Soil is the most significant component of the Pedosphere; it has direct impacts on the surrounding environment and is, thus, essential for biodiversity conservation and climate regulation [157]–[160]. The availability of RS datasets in GEE makes it an appealing platform for Pedosphere studies at diverse scales. GEE has been utilized for digital soil mapping [50], [161], geology and mining [162], [163], geomorphology studies [164], soil topography mapping [165], soil moisture derivation [166], and soil carbon and salinity estimation [167], [168]. For example, Ivushkin et al. [168] applied Landsat-5 and Landsat-8 datasets within GEE to produce a global soil salinity map based on the thermal anomaly. They incorporated 15 188 reference points from ISRIC-World Soil Information. Seven soil salinity indicators (sand content, silt content, clay content, pH, bulk density, organic carbon content, and cation exchange capacity) together with the thermal anomaly were fed to the RF algorithm. The final soil salinity maps obtained overall accuracies of 67%–70% for six different points in time. Moreover, GEE capabilities and Landsat imagery were combined to automatically delineate the annual extent of surface coal mining in Central Appalachia between 1985 and 2015 [162]. To this end, the urban areas were masked using publicly available datasets, and the mining zones were identified by low values in NDVI images. The proposed algorithm achieved Kappa coefficients varying from 0.62 to 0.93 for different years.
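
Several of the accuracies in this section are reported as overall accuracy or as the Kappa coefficient. Both can be derived from a confusion matrix, as in the sketch below; the 2 x 2 matrix (for example, mined versus non-mined) is hypothetical and is not taken from [162].

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement from the products of row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(len(matrix))
    ) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical confusion matrix: rows = reference, columns = map.
confusion = [
    [40, 10],
    [5, 45],
]

accuracy = overall_accuracy(confusion)
kappa = cohen_kappa(confusion)
```

Because kappa discounts the agreement expected by chance, it is lower than the overall accuracy for the same matrix (here 0.70 versus 0.85).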

J. Others

In addition to the previously mentioned applications, multiple articles address other, less frequent applications of GEE. Because their number was not sufficient to form separate categories, they were assigned to the Others category. These studies are mainly related to archaeology [169]–[171], 3-D printing [172], wildlife [173], [174], oil platform detection [175], and crashed airplane detection [176]. For example, 300 Landsat-8 images acquired between 2013 and 2018 were processed in GEE to detect a possible crashed airplane in the Cambodian jungle [176]. NDVI, albedo, thermal bands, spectral information, and panchromatic features were used in this study. Moreover, Sentinel-1 SAR data were used to automatically identify and delineate offshore oil platforms [175]. The proposed method was evaluated using 1577 reference samples and obtained an overall accuracy of 96.09% over the Gulf of Mexico. Furthermore, GEE was reported as a suitable platform for processing high-resolution drone imagery for pottery sherd identification [169]. In this regard, texture and gradient features from RGB drone imagery were calculated within GEE and fed into the RF classifier. The developed algorithm was able to identify pottery sherds with accuracies of 32.9% and 76.8% for two separate regions. Moreover, GEE was employed to process drone imagery to estimate the population of wildlife aggregations [173]. To this end, the RF algorithm was applied to map the targets of interest (bird nests), followed by a predictive model to estimate the population. The proposed approach obtained overall accuracies ranging from 86% to 96% over four different water bird colonies. Finally, a web application called TouchTerrain was developed to simplify the printing of 3-D terrain models [172]. After determining the region of interest, the corresponding DEM was obtained through GEE to be used for the final 3-D printing. It was reported that users with any level of expertise could easily use the model within GEE with minimal computing resource requirements.

SECTION VIII.

GEE Large-Scale Case Studies

As discussed, the enormous capabilities of GEE resolve the existing challenges of processing big data over large-scale areas. Therefore, GEE has been recognized as an efficient platform for regional to global LC mapping and monitoring over long periods of time. In this section, ten studies conducted over the globe, continents, and big countries (e.g., the United States, Canada, and China) are discussed in detail.

A. Globe

Long et al. [133] proposed an automatic method for producing global annual burned area maps using all available Landsat images acquired between 2014 and 2015 within the GEE cloud computing platform. A map of the burn degree was first generated using the RF classifier. Then, several logical filters (e.g., NDVI, Normalized Burn Ratio (NBR), and temporal filters) were applied to select candidate seeds of the burned area. Finally, the global annual burned area map of 2015 (GABAM 2015) was produced by employing an iterative seed-growing process. A strong correlation (R² = 0.74) was observed between the spatial distribution of the burned surfaces from GABAM 2015 and the annual 250 m MODIS Vegetation Continuous Fields (VCF) Collection 5.1 (MOD44B) product.
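
The seed-and-grow idea behind such burned area mapping can be sketched on a small grid. In this illustration, NBR is computed as (NIR - SWIR) / (NIR + SWIR), and the growing step is a plain 4-neighbor region growing on a per-pixel burn score; the thresholds, the score grid, and the function names are assumptions for the example and do not reproduce the filters of [133].

```python
from collections import deque


def nbr(nir, swir):
    """Normalized Burn Ratio from NIR and shortwave infrared reflectance."""
    total = nir + swir
    return 0.0 if total == 0 else (nir - swir) / total


def grow_burned_area(score, seed_thr, grow_thr):
    """Region growing: strict threshold for seeds, looser one for growth.

    score: 2-D list of per-pixel burn scores (higher = more likely burned).
    Returns the set of (row, col) cells labeled as burned.
    """
    rows, cols = len(score), len(score[0])
    burned = {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if score[r][c] >= seed_thr
    }
    queue = deque(burned)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in burned and score[nr][nc] >= grow_thr:
                burned.add((nr, nc))
                queue.append((nr, nc))
    return burned


# Hypothetical burn scores, e.g., classifier probabilities.
score = [
    [0.1, 0.2, 0.1, 0.6],
    [0.2, 0.9, 0.6, 0.1],
    [0.1, 0.7, 0.2, 0.1],
]

burned = grow_burned_area(score, seed_thr=0.8, grow_thr=0.5)
```

The isolated cell in the top-right corner exceeds the growth threshold but is not connected to any seed, so it is left out; this is the property that makes seed growing more conservative than a single global threshold.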

Hansen et al. [8] analyzed forest cover changes at the global scale between 2000 and 2012 using Landsat time-series images within GEE. Based on the results, the authors reported the following:

  1. the tropical domain had the highest forest cover change (loss and gain), with an annual deforestation rate of approximately 2101 km²/year;

  2. most forests in the subtropical climate domain were considered as croplands, because the existence of long-lived natural forests in this domain was relatively rare;

  3. the trend of change in temperate forests was almost constant and had a low ratio of loss compared to gain;

  4. fire was the most important cause of deforestation in the boreal domain;

  5. the rate of deforestation in Brazil was higher than in other countries.

In [177], a grid-based Mountain Green Cover Index (MGCI) was implemented to monitor mountain ecosystems at large scales. A novel frequency- and phenology-based technique was applied to generate the global green vegetation cover using all available Landsat-8 images within the GEE platform. Then, the real surface area derived from ASTER GDEM Version 2 was used instead of the planimetric surface area to calculate the MGCI. The results showed that the generated data had a high correlation (R² = 0.9548) with the FAO MGCI baseline data.

In [178], global surface water and its long-term changes were mapped using three decades of Landsat satellite images (three million images) within the GEE platform. The results of this global assessment demonstrated the following:

  1. approximately 90 000 km² of permanent water bodies disappeared, and new water bodies covering 184 000 km² formed between 1984 and 2015;

  2. net permanent water increased in all continental regions except Oceania;

  3. over 70% of global net permanent water loss occurred in the Middle East and Central Asia due to drought and human actions (e.g., river damming).

Finally, the authors argued that the proposed strategy within GEE can be effectively used for water resources management.

Scherler et al. [90] proposed a novel automatic method to map supraglacial debris cover over the globe using multitemporal optical satellite images within GEE. In this study, debris-covered ice surfaces were mapped by thresholding three indices: the red to Shortwave Infrared band ratio, the Normalized Difference Snow Index, and the linear spectral unmixing-derived Fractional Debris Cover. These indices were generated from Landsat-8 and Sentinel-2 optical satellite images over 19 glacier areas worldwide from 2013 to 2015. The results showed that 4.4% (about 26 000 km²) of all glacier areas are covered by debris. Furthermore, an inverse relationship between glacier size and percentage of debris cover was reported, indicating continuous glacier shrinkage due to the debris effects.
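The index-thresholding step can be sketched in simplified form. The example below uses only one of the three indices (NDSI, with its standard definition (Green − SWIR1)/(Green + SWIR1)) and assumes that all input pixels already lie inside a known glacier outline. The threshold value, the function names, and the reflectance values are hypothetical and are not taken from [90].

```python
def ndsi(green, swir1):
    """Normalized Difference Snow Index from green and SWIR1 reflectance."""
    total = green + swir1
    return 0.0 if total == 0 else (green - swir1) / total


def debris_cover_percent(pixels, ndsi_thr=0.4):
    """Percentage of glacier pixels treated as debris-covered.

    pixels: list of (green, swir1) reflectance pairs, all assumed to lie
            inside a glacier outline.
    ndsi_thr: hypothetical threshold; pixels with NDSI below it are treated
              as debris-covered rather than clean snow or ice.
    """
    if not pixels:
        return 0.0
    debris = sum(1 for green, swir1 in pixels if ndsi(green, swir1) < ndsi_thr)
    return 100.0 * debris / len(pixels)


# Toy glacier: two bright snow/ice pixels and two dark debris-like pixels.
pixels = [(0.80, 0.10), (0.75, 0.05), (0.20, 0.25), (0.15, 0.20)]
percent = debris_cover_percent(pixels)
```

Clean snow and ice reflect strongly in the green band and weakly in the SWIR band, so their NDSI is high; pixels inside the glacier outline that fail the threshold are the debris candidates. In this toy example, half of the pixels fall below the threshold.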

B. Continent and Big Countries

Amani et al. [1] produced the first Canadian wetland inventory (CWI) map using 30 000 Landsat-8 scenes and several advanced machine learning algorithms available within GEE. The RF algorithm was applied to classify wetlands across all of Canada. The CWI map was based on five wetland classes defined by the Canadian Wetland Classification System: bog, fen, marsh, swamp, and shallow water. The quantitative and qualitative results showed that the generated CWI map had reasonable accuracy considering the challenges of mapping such an immense country (9.985 million km²).

Li et al. [179] generated an African LCLU map at a 10 m resolution within GEE using multisource RS datasets, including Sentinel-2, Landsat-8, Global Human Settlement Layer, Night Time Light data, Shuttle Radar Topography Mission (SRTM), and MODIS Land Surface Temperature images. The RF algorithm was applied to classify the area into five categories: urban, trees, low plants, bare soil, and water. The results showed that the LCLU map generated by this method performed better than FROM-GLC10 [180] in detecting the urban class and distinguishing trees from low plants in rural areas.

Beresford et al. [181] developed an NRT monitoring framework for conservation of the Key Biodiversity Areas (KBAs) in Africa using the GEE platform. In this study, simple, repeatable techniques were proposed to detect changes in fire rate, tree loss, and nighttime lights between 1992 and 2013. The results showed that fire rate, nighttime lights, and rate of forest loss considerably increased in KBAs and ecoregions. Moreover, the authors argued that the method implemented within GEE has a high potential for monitoring changes over any geographic area using different RS data types, and that it could be effectively utilized by conservation end-users.

Teluguntla et al. [182] developed a precise Landsat-based cropland extent product over Australia and China using machine learning tools in GEE. In this study, cropland maps were produced by applying RF to Landsat-8 images. The RF classifier was trained and validated using ground truth data obtained from different sources, such as field surveys, very high spatial resolution (5 m) imagery, and other auxiliary information. Based on their results, the total cropland areas of Australia and China were estimated as 35.1 and 165.2 million hectares, respectively.

Goldblatt et al. [183] used GEE for temporal analysis of large urban areas in India using multitemporal Landsat-7 and Landsat-8 images. In order to generate high-quality maps of built-up areas, the country was classified into built-up and non-built-up regions using 21 030 training datasets and three types of supervised classification algorithms (i.e., SVM, CART, and RF). It was reported that the proposed GEE approach generated a high-quality map of built-up areas in India and could potentially be employed in other countries.

SECTION IX.

Conclusion

The proliferation of big geo data and recent advances in cloud computing and big data processing services are changing the future of RS. In this regard, GEE is paving the way for researchers, scientists, and developers to easily extract valuable information from big RS datasets without the burdens of traditional data analysis methods. The massive troves of RS datasets available within GEE (e.g., archived Landsat and Sentinel images) help researchers address global challenges and environmental issues, such as global warming, climate change, LCLU classification over large areas, and landscape monitoring over several decades. GEE also contains hundreds of prebuilt functions that can be easily understood and utilized by different users. With a basic knowledge of JavaScript, users can also implement their own algorithms. These advantages allow users to employ this cloud computing platform for various applications related to LCLU, agriculture, hydrology, natural disasters, etc. Despite these advantages, GEE also has several limitations, such as a storage limit of 250 GB per user and limited memory for training machine learning algorithms, which may discourage new users. However, it is undeniable that GEE presents a novel way of processing geospatial data and resolves several big data challenges that RS researchers have faced. Based on the GEE publication trends, it is also clear that this platform is becoming more popular not only among RS researchers but also within any community interested in using EO datasets.

Authors’ contribution

Meisam Amani designed and supervised the study, professionally optimized all sections, acquired funding, and wrote the Abstract, Introduction, and Conclusion; Arsalan Ghorbanian wrote the “GEE Applications” section; Seyed Ali Ahmadi wrote the “GEE Advantages and Limitations” section; Mohammad Kakooei wrote the “GEE Datasets” and “GEE Functions” sections; Armin Moghimi wrote the “GEE Large-scale Case Studies” section; Seyed Mohammad Mirmazloumi gathered the required articles and wrote the “GEE Pattern of Publication” section; Sayyed Hamed Alizadeh Moghaddam wrote the “GEE Platform Overview” section; Sahel Mahdavi professionally optimized all sections; Masoud Ghahremanloo helped with the initial literature review; Saeid Parsian helped gather the required articles; Qiusheng Wu and Brian Brisco professionally optimized all sections. Finally, all authors read and approved the final manuscript.

Appendix

TABLE IV List of Available Datasets Within GEE
