If you are like me, you have a lot of assets uploaded to Earth Engine. As you upload more and more assets, managing this data becomes quite a cumbersome task. Earth Engine provides a handy Command-Line Tool that helps with asset management. While the command-line tool is very useful, it falls short when it comes to bulk data management tasks.

What if you want to rename an ImageCollection? You will need to manually move each child image to a new collection. If you want to delete assets matching certain keywords, you’ll need to write a custom shell script. If you are running low on your asset quota and want to delete large assets, there is no direct way to list them. Fortunately, the Earth Engine Python Client API comes with a handy ee.data module that we can leverage to write custom scripts. In this post, I will cover the following use cases with full Python scripts that anyone can use to manage their assets:

  • How to get a list of all your assets (including folders/sub-folders/collections)
  • How to find the quota consumed by each asset and find large assets
  • How to rename ImageCollections

The post explains each use-case with code snippets. If you want to just grab the scripts, they are linked at the end of the post.
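
As a rough illustration of the first two use cases, here is a minimal sketch of a recursive asset listing. It is not the script from the post. The `list_children` callable, the fake asset tree, and the `sizeBytes` field are assumptions made for this example; with the real client you would pass in a small wrapper around an `ee.data` listing call and check the fields it actually returns.

```python
def list_assets_recursive(parent, list_children):
    """Return every asset under `parent`, descending into containers.

    `list_children(parent_id)` is a hypothetical callable that returns a list
    of dicts with 'name' and 'type' keys for the direct children of a parent.
    """
    container_types = {'FOLDER', 'IMAGE_COLLECTION'}
    found = []
    for asset in list_children(parent):
        found.append(asset)
        # Folders and collections can hold further assets, so recurse.
        if asset['type'] in container_types:
            found.extend(list_assets_recursive(asset['name'], list_children))
    return found


# A made-up asset tree standing in for a real Earth Engine account.
FAKE_TREE = {
    'users/me': [
        {'name': 'users/me/folder', 'type': 'FOLDER'},
        {'name': 'users/me/dem', 'type': 'IMAGE', 'sizeBytes': 500},
    ],
    'users/me/folder': [
        {'name': 'users/me/folder/col', 'type': 'IMAGE_COLLECTION'},
    ],
    'users/me/folder/col': [
        {'name': 'users/me/folder/col/img1', 'type': 'IMAGE', 'sizeBytes': 2000},
        {'name': 'users/me/folder/col/img2', 'type': 'IMAGE', 'sizeBytes': 100},
    ],
}


def fake_list_children(parent):
    """Stand-in for a real listing call (assumption for this sketch)."""
    return FAKE_TREE.get(parent, [])


all_assets = list_assets_recursive('users/me', fake_list_children)

# Sort the assets that report a size so the largest come first.
largest_first = sorted(
    (a for a in all_assets if 'sizeBytes' in a),
    key=lambda a: a['sizeBytes'],
    reverse=True,
)
```

Passing the lister in as a parameter keeps the traversal logic separate from the API call, which also makes it easy to try out without touching real assets.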

Continue reading

Multi-criteria Overlay Analysis is the process of allocating land to suit a specific objective on the basis of a variety of attributes that the selected areas should possess. Although this is a common GIS operation, it is best performed in the raster space.

This post outlines the typical workflow to take source vector data, transform it to appropriate rasters, re-classify them and perform mathematical operations to do a weighted suitability analysis. The post uses the command-line utilities provided by the open-source GDAL library. If you want to do such analysis in QGIS, please check my tutorial at Multi Criteria Overlay Analysis (QGIS3).

We will work with crime and infrastructure data for the city of London and find suitable areas to build new parking facilities that can help reduce bicycle thefts. Our analysis will apply the following 3 criteria. The proposed parking must be

  1. In a bicycle theft hotspot
  2. Close to a bicycle route
  3. Far from existing parking facilities
The problem statement: Identify suitable locations for building new bicycle parking facilities
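
To make the re-classify and overlay arithmetic concrete, here is a minimal plain-Python sketch on tiny 2x2 grids. The post itself uses GDAL command-line utilities, so this only illustrates the math. The grid values, thresholds, and weights are all made up for the example.

```python
def reclassify(grid, predicate):
    """Turn a grid of raw values into a 0/1 grid using a yes/no test."""
    return [[1 if predicate(value) else 0 for value in row] for row in grid]


def weighted_overlay(layers, weights):
    """Sum the layers cell by cell, multiplying each layer by its weight."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical input rasters (one value per cell).
theft_density = [[5, 1], [8, 0]]              # thefts per cell
route_distance = [[50, 400], [100, 30]]       # meters to nearest bicycle route
parking_distance = [[600, 700], [100, 900]]   # meters to nearest parking

# One 0/1 layer per criterion; thresholds are assumptions.
hotspot = reclassify(theft_density, lambda v: v >= 4)
near_route = reclassify(route_distance, lambda v: v <= 200)
far_from_parking = reclassify(parking_distance, lambda v: v >= 500)

# Give the hotspot criterion twice the weight of the other two (assumption).
suitability = weighted_overlay(
    [hotspot, near_route, far_from_parking], [2, 1, 1]
)
```

Cells that satisfy all three criteria get the highest score, so the top-left cell is the best candidate in this toy grid.
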
Continue reading

Matplotlib has functionality to create animations and can be used to build dynamic visualizations. In this post, I will explain the concepts and techniques for creating animated charts using Python and Matplotlib.

I find this technique very helpful in creating animations showing how certain algorithms work. This post also contains Python implementations of two common geometry simplification algorithms, and they will be used to create animations showing each step of the algorithm. Since both of these implementations use a recursive function, the technique shown in the post can be extended to visualize other recursive functions using matplotlib. You will learn how to create animated plots like the ones below.
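
One way to animate a recursive function is to record a snapshot at every call and draw the snapshots later as frames. The sketch below does this for the Douglas-Peucker simplification algorithm, chosen here only for illustration since this excerpt does not name the two algorithms the post covers. The `frames` list and its dictionary layout are assumptions of this example.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def douglas_peucker(points, epsilon, frames):
    """Simplify a line (at least 2 points), logging each call in `frames`."""
    start, end = points[0], points[-1]
    max_dist, index = 0.0, 0
    # Find the interior point farthest from the start-end segment.
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], start, end)
        if dist > max_dist:
            max_dist, index = dist, i
    # Record what this call looked at; one entry becomes one frame.
    frames.append({
        'segment': (start, end),
        'farthest': points[index] if index else None,
        'distance': max_dist,
    })
    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon, frames)
        right = douglas_peucker(points[index:], epsilon, frames)
        return left[:-1] + right
    return [start, end]


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
frames = []
simplified = douglas_peucker(line, 0.5, frames)
```

Each entry in `frames` could then be drawn by the update function passed to `matplotlib.animation.FuncAnimation`, one frame per recursive call.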

Continue reading

Many applications require replacing missing pixels in an image with an interpolated value from its temporal neighbours. This gap-filling technique is used in several applications, including:

  • Replacing Cloudy Pixels: You may want to fill a gap in an image with the best-estimated value from the cloud-free pixels before and after it.
  • Estimating Intermediate Values: You can use this technique to compute an image for a previously unknown time-step. For example, if you have population rasters for 2 different years, you can compute a population raster for an intermediate year using pixel-wise linear interpolation.
  • Preparing Data for Regression: Not all of your independent variables may be available at the same temporal resolution. You can harmonize various datasets by generating interpolated rasters at uniform or fixed time-steps.

Google Earth Engine can be used effectively for gap-filling time-series datasets. While the logic for linear interpolation is fairly straightforward, data preparation for this in GEE can be quite challenging. It involves use of Joins, Masks and Advanced Filters. This post explains the steps with code snippets and builds a fully functioning script that can be applied on any time-series data.
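
The interpolation logic itself can be sketched in plain Python for a single pixel's time-series. This is not Earth Engine code and leaves out the joins, masks and filters that the post covers. The function name and the use of `None` for a missing value are assumptions of this example.

```python
def interpolate_gaps(times, values):
    """Fill None entries by linear interpolation between valid neighbours.

    `times` and `values` are parallel lists for one pixel; gaps at the start
    or end of the series have only one neighbour and are left unfilled.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if not before or not after:
            continue  # cannot interpolate without both neighbours
        b, a = before[-1], after[0]
        # Fraction of the way from the 'before' time to the 'after' time.
        fraction = (times[i] - times[b]) / (times[a] - times[b])
        filled[i] = values[b] + (values[a] - values[b]) * fraction
    return filled


# Day offsets and pixel values; the two middle observations are missing.
times = [0, 10, 20, 40]
values = [100, None, None, 200]
filled = interpolate_gaps(times, values)
```

Because the weights come from the actual time offsets, the sketch also handles unevenly spaced observations.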

Continue reading

Most optical satellite imagery products come with one or more QA-bands that allow the user to assess the quality of each pixel and extract pixels that meet their requirements. The most common application for QA-bands is to extract information about cloudy pixels and mask them. But the QA bands contain a wealth of other information that can help you remove low quality data from your analysis. Typically the information contained in QA bands is stored as Bitwise Flags. In this post, I will cover basic concepts related to Bitwise operations and how to extract and mask with specific quality indicators using Bitmasks.
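
As a small illustration of the idea, here is a plain-Python sketch that pulls a range of bits out of a single QA value using a shift and a mask. The bit layout in the example (bit 2 as a cloud flag, bits 4-5 as a confidence level) is made up and does not correspond to any particular product.

```python
def extract_bits(value, start_bit, end_bit):
    """Return the integer stored in bits start_bit..end_bit (inclusive)."""
    num_bits = end_bit - start_bit + 1
    # Shift the wanted bits down to position 0, then mask off the rest.
    return (value >> start_bit) & ((1 << num_bits) - 1)


qa_value = 0b101100  # hypothetical QA pixel value (decimal 44)

cloud_flag = extract_bits(qa_value, 2, 2)   # single-bit flag
confidence = extract_bits(qa_value, 4, 5)   # two-bit field, values 0-3

# Keep the pixel only if the hypothetical cloud flag is not set.
keep_pixel = cloud_flag == 0
```

The same shift-and-mask pattern applies per pixel when working on a whole QA band.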

Continue reading

In this post, I describe how we can use built-in QGIS processing tools to create a workflow to split polygons into equal parts. Using a clever algorithm and Feature Iterator tool in the Processing Framework, we can easily split all features in a given polygon layer into equal parts.

The algorithm for splitting any polygon shape into equal parts is described in the post PostGIS Polygon Splitting by Paul Ramsey. We will see how this can be implemented in QGIS.

Continue reading

In this post, I will outline techniques for computing weighted-centroids in both QGIS and Google Earth Engine. For a polygon feature, the centroid is the geometric center. It can also be thought of as the average coordinate of all points within the polygon. There are some use cases where you may want to compute a weighted-centroid where some parts of the polygon get a higher ‘weight’ than others. The main use-case is to calculate a population-weighted centroid. One can also use Night Lights data as a proxy for urbanized population and calculate a nightlights-weighted centroid. Some applications include:

  • Regional Planning: Locate the population-weighted centroid to know the most accessible location in the region.
  • Network Analysis: For generating demand points in location-allocation analysis, you need to convert demands from regions to points. It is preferable to compute population-weighted centroids for a more accurate analysis.
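
The underlying calculation is a weighted average of coordinates. Here is a minimal plain-Python sketch, assuming the region has already been reduced to sample points that each carry a weight such as population. The function name and the sample values are made up for the example.

```python
def weighted_centroid(points):
    """Return the weighted mean (x, y) of an iterable of (x, y, weight)."""
    points = list(points)
    total_weight = sum(w for _, _, w in points)
    if total_weight == 0:
        raise ValueError('total weight must be non-zero')
    cx = sum(x * w for x, _, w in points) / total_weight
    cy = sum(y * w for _, y, w in points) / total_weight
    return cx, cy


# Two hypothetical settlements: the one at x=10 has 3x the population.
samples = [(0, 0, 1), (10, 0, 3)]
centroid = weighted_centroid(samples)

# With equal weights the result is the plain average of the coordinates.
unweighted = weighted_centroid((x, y, 1) for x, y, _ in samples)
```

The weighted result sits closer to the heavier point, which is the behaviour a population-weighted centroid is meant to capture.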

Do check out this twitter-thread by Raj Bhagat P for more discussion on weighted centroids.

Different Weighted Centroids for the State of Karnataka, India (2015)
Continue reading

I recently taught a month-long course on GIS Applications in Urban and Regional Planning. We explored how GIS can be applied to solve problems in 6 different thematic areas. In this post, I will outline different applications and show concrete examples using open datasets and the open-source GIS software QGIS.


The full course material – including data packages and PDF handouts – is now available for free download. Scroll down and find the download link at the end of the post.

Here are the 6 thematic areas

  • Land Use Planning and Management
  • Crime Mapping and Analysis
  • Solid Waste Management
  • Urban Infrastructure and Utilities
  • Urban Transportation
  • Spatial Planning
Continue reading

K-Means Clustering is a popular algorithm for automatically grouping points into natural clusters. QGIS comes with a Processing Toolbox algorithm ‘K-means clustering’ that can take a vector layer and group features into N clusters. A problem with this algorithm is that you do not have control over how many points end up in each cluster. Many applications require you to segment your data layer into equal-sized clusters or clusters having a minimum number of points. Some examples where you may need this:

  • When planning an FTTH (Fiber-to-the-Home) network, one may want to divide a neighborhood into clusters of at least 250 houses for placement of a node.
  • Dividing a sales territory or customers equally among sales teams, so that customers in the same region are assigned to the same team.

There is a variation of the K-means algorithm called Constrained K-Means Clustering that uses graph theory to find optimal clusters with a user-supplied minimum number of points in each cluster. Stanislaw Adaszewski has a nice Python implementation of this algorithm that I have adapted to be used as a Processing Toolbox algorithm in QGIS.


I have heard feedback from users that this algorithm doesn’t work on all types of point distributions and may get stuck while finding an optimal solution. I am looking into ways to improve the code and would appreciate any feedback.

Continue reading