5 Things to Consider about Google Earth Engine

Google Earth Engine (GEE) is no doubt a powerful service, kindly provided for free by Google. However, it is not a one-size-fits-all solution...

by Martin D. Maas, Ph.D

@MartinDMaas

Last updated: 2021-06-28

A free platform for big data in remote sensing, by Google

GEE allows processing massive amounts of remote sensing data directly on Google’s servers, enabling planetary-scale data analysis for free. No wonder its use is rising quickly among researchers.

As appealing as it is in certain cases, Google Earth Engine is not a proper fit for many projects. Importantly, it’s free only for non-commercial, non-production use.

In a few words, Google Earth Engine should be considered as a rapid-prototyping tool for Geospatial applications.

Before getting into the details, let’s go over a brief summary.

Disadvantages of Google Earth Engine

If you fall under any of these use cases (e.g. you are considering moderate-resolution datasets, are interested in regional studies, are looking to be employed by a geospatial analytics company, or might start your own down the road), then you might want to think twice about investing considerable time into learning how to use this closed-source proprietary platform.

What GEE solves in many use cases (those which are not global-scale, high-resolution applications) is more of a Software Engineering problem than an infrastructure problem.

Fortunately, doing your own Software Engineering for Remote Sensing applications is not that difficult anymore.

There will be more posts in this blog about how to set up a Python workflow to get things done in Remote Sensing, in a truly open-source fashion (i.e. without relying on any closed-source proprietary platform).

GEE imposes a restricted programming framework

GEE provides, on one hand, a set of objects that are handled exclusively by the server (i.e. image collections), and on the other hand, client-side variables, which are only handled by the browser.

The parallel programming framework chosen by Google is based on Map and Reduce operations. A ‘map’ operation applies a function to each image in the collection independently, while a ‘reduce’ operation aggregates the results; together with collection filters, they can be roughly interpreted as ‘selecting and transforming’ and ‘aggregating’. For example, to restrict an image collection to a certain area or range of dates we filter it, to compute a per-image quantity we apply a ‘map’ operation, and to summarize the selected data according to various statistics we apply a ‘reduce’ operation.
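To make the model more concrete, here is a minimal sketch in plain Python (it does not call the Earth Engine API) of a ‘map’ step that applies a function to each image independently, followed by a ‘reduce’ step that aggregates the collection. The tiny two-pixel ‘images’, the `ndvi` and `pixelwise_sum` helpers, and the band values are all hypothetical, chosen purely for illustration.

```python
from functools import reduce

# Hypothetical collection: each "image" is a dict of bands,
# and each band is a flat list of pixel values.
collection = [
    {"date": "2021-01-01", "red": [0.2, 0.4], "nir": [0.6, 0.8]},
    {"date": "2021-02-01", "red": [0.1, 0.3], "nir": [0.5, 0.9]},
]

def ndvi(image):
    """Per-image step: compute a normalized difference index for every pixel."""
    values = [(n - r) / (n + r) for r, n in zip(image["red"], image["nir"])]
    return {"date": image["date"], "ndvi": values}

def pixelwise_sum(acc, image):
    """Aggregation step: add one image's pixels to the running total."""
    return [a + v for a, v in zip(acc, image["ndvi"])]

# 'map': the same function is applied to each image independently,
# so the work could be distributed across many machines.
mapped = list(map(ndvi, collection))

# 'reduce': the mapped images are collapsed into one per-pixel mean.
total = reduce(pixelwise_sum, mapped, [0.0, 0.0])
mean_ndvi = [t / len(mapped) for t in total]
```

Because `ndvi` only looks at the image it receives, the order in which images are processed does not matter; that independence is what makes this style of computation easy to parallelize.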

While this model enables massive parallelism on distributed commodity servers, it does so at the expense of a more complex coding style. As noted in this conference paper, the combination of server-side and client-side programming tends to be confusing. For example, simple index iteration is not recommended when working with server-side objects, because the index itself is a client-side variable. As reported in the documentation, to iterate over an image collection we must define a function that is called on each element together with the result of the previous call, and which cannot modify values outside of its own scope, among other limitations.
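The iteration constraint can be illustrated with a plain-Python analogy. The sketch below does not call Earth Engine; it only assumes that the iteration step has the shape (current element, previous result), so the running state is passed explicitly instead of being updated through a loop index or an outer variable. The `accumulate_step` name and the sample values are hypothetical.

```python
from functools import reduce

# Hypothetical per-image statistics, e.g. one mean value per acquisition.
image_means = [0.31, 0.42, 0.38, 0.47]

def accumulate_step(current, previous):
    """Return a NEW list holding the cumulative sums so far.

    The function reads only its arguments and returns a fresh value;
    it never mutates variables defined outside its own scope.
    """
    last = previous[-1] if previous else 0.0
    return previous + [last + current]

# functools.reduce calls its function as f(accumulator, element), so the
# arguments are swapped to keep the (current, previous) ordering of the step.
cumulative = reduce(
    lambda prev, cur: accumulate_step(cur, prev),
    image_means,
    [],  # initial state, passed in explicitly
)
```

Compared with a `for` loop that appends to a shared list, this style is more verbose, which is a fair picture of the extra effort involved in porting ordinary code to such a framework.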

With this in mind, it is clear that porting sophisticated applications to the GEE framework can be quite challenging.

GEE is only free for non-commercial use

To use Google Earth Engine, compliance with the license agreement is required. The license agreement states explicitly:

Earth Engine’s terms allow for use in development, research, and education environments. It may also be used for evaluation in a commercial or operational environment, but sustained production use is not allowed. Additionally, data products generated by Earth Engine may not be sold.

This is in contrast with the underlying data, which is often in the public domain. For example, NASA is a US federal agency, and as such, cannot claim copyright of the material made available to the public. (see also NASA’s policy page)

Google also claims to offer commercial licensing options for the platform, but the usage conditions and pricing are not transparent (in effect, “write to us if you are interested”).

In sum, Google Earth Engine is targeted at non-profit institutions. And even though a good deal of research is carried out in a not-for-profit spirit, many researchers could be interested in potential spin-offs of their research, especially as geospatial data analysis is a thriving business right now.

It may well be the case that Google will eventually start promoting a paid service around Google Earth Engine for commercial applications, in which case this disadvantage might turn into just a matter of price.

The free version is not suited for ‘production’ workloads

GEE’s FAQ also states that sustained production use is not allowed.

So even if we are working in a non-profit setting, we can’t rely on GEE for goals such as continued, real-time environmental monitoring.

Sustained production use is not only a violation of GEE’s terms of service, it is also impractical, as workloads with estimated times to completion longer than 5 minutes will be executed in batch mode, at indeterminate times. Presumably, if a user is in violation of the terms of service, GEE will stop executing those batch jobs.

The free version has some processing and storage limits

While GEE is generally free, there are a few caveats concerning processing time and storage limits.

As for processing, there are two available modes within GEE: interactive and batch. The interactive mode is extremely fast but limited to jobs that can be served in under 5 minutes of processing time. The batch mode, on the other hand, is considerably slower and requires exporting data to other Google services, such as Google Cloud Storage or Google Drive, where users can incur some costs, albeit probably small ones.

GEE is inconvenient for smaller datasets

While not exactly a limitation of GEE, something else to take into consideration is that GEE might not be necessary at all in many cases, as many important remote sensing datasets can be handled perfectly by desktop computers. And a system that is designed for handling massive parallelism and Petabyte-scale datasets isn’t the optimal choice for dealing with smaller datasets.

A good number of remote sensing applications require fast access to data that can perfectly fit in just a few Terabytes, or even a few hundred GB. SSD drives currently sell for under $40 per TB and even an inexpensive laptop is extremely powerful nowadays.
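A back-of-the-envelope estimate shows how such numbers can be checked. The sketch below computes the uncompressed size of a hypothetical regional time series and its storage cost at the $40 per TB figure quoted above. The region size, resolution, band count, bit depth, and number of acquisitions are illustrative assumptions, not properties of any particular dataset.

```python
def dataset_size_tb(width_px, height_px, bands, bytes_per_sample, n_dates):
    """Uncompressed size in terabytes (1 TB = 10**12 bytes)."""
    total_bytes = width_px * height_px * bands * bytes_per_sample * n_dates
    return total_bytes / 1e12

# Assumed region: 1000 km x 1000 km at 30 m resolution.
side_px = -(-1_000_000 // 30)  # ceiling division: pixels per side

# Assumed product: 6 bands of 16-bit samples, 23 acquisitions per year, 5 years.
size_tb = dataset_size_tb(
    side_px, side_px, bands=6, bytes_per_sample=2, n_dates=23 * 5
)

PRICE_PER_TB_USD = 40  # SSD price taken from the text above
cost_usd = size_tb * PRICE_PER_TB_USD

print(f"{size_tb:.2f} TB uncompressed, roughly ${cost_usd:.0f} of SSD storage")
```

Under these assumptions the whole archive fits on a single consumer drive, and changing any of the inputs gives a quick sense of when a dataset stops fitting on a workstation.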

Additionally, open-source software libraries, which in some cases are provided by the space agencies themselves and, importantly, under permissive licensing conditions, are making great progress in easing the first few steps of the Remote Sensing pipeline (downloading, opening, and creating time series of data).

The power and freedom associated with owning a modern workstation should not be underestimated, and the cloud providers’ marketing of their infrastructure as a one-size-fits-all solution can be disregarded in some cases.

Conclusion

As appealing as it is in certain cases, Google Earth Engine is not a proper fit for many projects. Importantly, it’s free only for non-commercial, non-production use.

In sum, GEE should be considered as a rapid-prototyping tool for Geospatial applications.

In cases where there is a possibility of getting involved in commercial applications or sustained production use of remote sensing, taking the time to acquire the skills to work independently of Google Earth Engine can be worthwhile, especially as the cost of powerful desktop workstations and hard drives is going down, while the quality of open-source remote sensing software licensed under permissive conditions is going up.

Finally, don’t forget to check this site for updates, as forthcoming posts will cover the skills required to work directly with space agencies’ data, relying on open-source software libraries. For example, you might want to check out the other posts in this series.