Aggregating Gridded Population Data in Google Earth Engine

Google Earth Engine makes it easy to compute statistics on gridded raster datasets. While calculating statistics on imagery datasets is easy, special care must be taken when working with population datasets. In this post, I will outline the correct technique for computing statistics for population rasters and aggregating pixels.

Calculating Total Population within a Region

The function reduceRegion() allows one to apply a statistical function (such as mean, median, count, sum etc.) on all pixels falling within the given region. You can use the ee.Reducer.sum() reducer to count the sum of all pixels values. Like many functions in Earth Engine, you must specify a scale value in the reduceRegion() function. The scale refers to the resolution at which the statistics will be computed. When computing sum on population datasets – always use the native resolution as the scale value. We can use the projection().nominalScale() function to find out the native resolution and use it for calculation.

var ghsl = ee.ImageCollection(
  "JRC/GHSL/P2016/POP_GPW_GLOBE_V1");

// Extract the projection before doing any computation
var projection = ee.Image(ghsl.first()).projection()
print('Native Resolution:', projection.nominalScale()) 
// 250 meters

// Filter and get image for the year of interest
var filtered = ghsl
  .filter(ee.Filter.date('2015-01-01', '2016-01-01'))
var pop2015 = ee.Image(filtered.first())

// Calculate Total Population for the City of Bengaluru
var bangalore = ee.FeatureCollection(
  "users/ujavalgandhi/public/bangalore_boundary");
var stats = pop2015.reduceRegion({
  reducer: ee.Reducer.sum(),
  geometry: bangalore.geometry(),
  scale: projection.nominalScale(),
})
// Result is a dictionary
// Extract the value and print it
print(stats.get('population_count')) // 9557563.54557092

This is the correct value for the population in the region. But if you used a different scale value, you will get a very different and wrong result. Let’s understand why.

Scale and Image Pyramids

GIS users do not have to worry about scale or resolution when computing stats on rasters. Because all computation runs on the native scale. Earth Engine works slightly differently. To allow large computations, Earth Engine provides users with the flexibility of running computation at any scale – regardless of the resolution of the source image. When an image is ingested in Earth Engine, many lower resolution versions of the image are pre-computed. These are known as Image Pyramids. For each level of the pyramid, a 2×2 block of pixels is grouped into 1 larger pixel. When computing statistics at a specific scale, Earth Engine will use the appropriate pre-computed lower-resolution Pyramid Tiles to calculate the results.

The default pyramid tiles are created using mean value – lower resolution tiles are created with the average value of intersecting higher resolution pixels. This is called resampling and is ok for image datasets. But you cannot resample population datasets. For example, when you create a 1km resolution raster from a 500m resolution data – you want to sum the smaller pixel values to know the population within the larger pixel. This process is called aggregation.

Aggregating Pixels

To aggregate pixels in Earth Engine, you must use the reduceResolution() function. Learn more about this function in the Earth Engine User Guide. We will now see how to aggregate the population rasters into lower resolution rasters.

Why Aggregate Pixels?

There are many use cases where you would want to do this. Here are a few

  • You want to compare different population datasets – each with a different pixel resolution. Aggregating them to a standard resolution will allow direct comparison between them. Read this article explaining the workflow.
  • You want to overlay the population with other demographic datasets. Aggregating both datasets to the same resolution will allow you to do raster algebra operations.
  • You want to compute stats for large regions. Aggregating pixels to a lower resolution will result in less number of pixels – reducing computation time.

We will now learn how to take the 250m resolution population raster from GHSL and aggregate it to 1km resolution.

First, we must compute the projection parameter at the required resolution. The projection consists of the CRS and a CRS transform. The transform consists of 6 parameters: [xScale, xShearing, xTranslation, yShearing, yScale, yTranslation]. We can use the helper function atScale() to calculate the new projection with the correct transform at the given scale.

Once we have the new projection, we can use the reduceResolution() function to aggregate the pixels using the given reducer function. When using the ee.Reducer.sum() reducer for aggregation, it is required to use an unweighted reducer ee.Reducer.sum().unweighted(). The reason for using an unweighted reducer here is due to the inner workings of Earth Engine as explained here and here.

// ****** Aggregation to 1km ****** //

// Get the projection at required scale
var projectionAt1k = projection.atScale(1000)

var pop2015At1k = pop2015
  .reduceResolution({
    reducer: ee.Reducer.sum().unweighted(),
    maxPixels: 1024
  })
  // Request the data at the scale and projection
  // of reduced resolution
  .reproject({
    crs: projectionAt1k
  });

var stats = pop2015At1k.reduceRegion({
  reducer: ee.Reducer.sum(),
  geometry: geometry,
  scale: projectionAt1k.nominalScale(),
  maxPixels: 1e10,
  tileScale: 16})
print(stats.get('population_count')) // 9529826.09191158

Aggregation to Very Large Pixels

There is a limit of 1024 pixels in the reduceResolution function. That means a single step of aggregation can combine upto 1024 pixels in a single pixel. If the difference between source pixel size and target pixel size is large, you may get an error such as “Too many input pixels per output pixel“.

In such cases, you can do the aggregation in multiple steps. Here’s how to aggregate the 250m pixels image into a 10000m (10 km) pixel image.

// ****** Aggregation to 10km ****** //

// Get the projection at required scale
var projectionAt1k = projection.atScale(1000)
var projectionAt10k = projection.atScale(10000)

// Step1: 250m to 1000m
var scaledImage1 = pop2015
  .reduceResolution({
    reducer: ee.Reducer.sum().unweighted(),
    maxPixels: 1024
  })
  // Request the data at the scale and projection
  // of reduced resolution
  .reproject({
    crs: projectionAt1k
  });
// Step2: 1000m to 10000m
var pop2015At10k = scaledImage1
  .reduceResolution({
    reducer: ee.Reducer.sum().unweighted(),
    maxPixels: 1024
  })
  // Request the data at the scale and projection
  // of reduced resolution
  .reproject({
    crs: projectionAt10k
  });
  

Exporting Results

The aggregation operation can be computationally intensive and can time-out while calculating stats or displaying it on the map. The recommended practice is the export the result of the aggregation as an asset. Once the export finishes, you can import the asset into a script and continue to perform your computation. Earth Engine can also re-project the data to a specific CRS during the Export. Below is the code snippet showing how we can export the aggregated data to a UTM projection at 1km resolution.

// Clip and Export the results
// Export the aggregated data for a state
var admin1 = ee.FeatureCollection(
  "FAO/GAUL_SIMPLIFIED_500m/2015/level1");
var karnataka = admin1.filter(
  ee.Filter.eq('ADM1_NAME', 'Karnataka'))
var geometry = karnataka.geometry()

// Source data is in World Mollweide projection
// Reproject it to a Local UTM projection for export
Export.image.toAsset({
  image: pop2015At1k.clip(geometry),
  description: 'Export_to_Asset',
  assetId: 'users/ujavalgandhi/public/karnataka_ghsl1k',
  region: geometry,
  scale: 1000,
  crs: 'EPSG:32643'
})
var exported = ee.Image(
  'users/ujavalgandhi/public/karnataka_ghsl1k')
Map.addLayer(exported,
  {min:500, max: 30000, palette: palette},
  'Exported Asset at 1km')

Here’s the full script with all the steps. If you have questions, do write them in the comments below.

If you are new to Earth Engine and want to master it, check out my course End-to-End Google Earth Engine.

19 Comments

Leave a Comment

  1. In the “AGGREGATION TO VERY LARGE PIXELS” section, why can’t we simply increase the maxPixels and convert from 250m to 10km in a single step?

    • This is the limit imposed by the system to avoid out-of-memory errors. The hard-limit is 1024 pixels per step. 250m to 10000m requires aggregating (40*40) = 1600 pixels into a single one. Earth Engine works by distributing the data processing over multiple machines in parallel. The data required for 1600 pixels to compute 1 output pixel may not fit in a single machine’s RAM. Doing it in multiple steps, allows each step to fit within the memory of the machines.

  2. Is it mandatory to specify reproject() after reduceResolution?
    or are there image operations that will implicitly reproject the reduced image?

    e.g. will image.updateMask(crop_mask) implicitly re-project the crop_mask according to the image’s projection?

    I’d like to use reduceResolution to overlay CDL, which has 30m resolution, as a mask on MODIS, that has 500m resolution.

      • Thank you for the quick reply 🙂

        My concern with the default resampling is that we can’t control the percentage of crop coverage. e.g. if the 500m MODIS pixel is mostly forest but there happens to be a bit of corn in the pixel extent, we shouldn’t consider it.

        I believe the code below will classify a MODIS pixel as covered by crop if the crop percentage > threshold.

        Is this the correct way to implement reduceResolution?

        # Corn cover in 2021 from CDL that has 30m resolution
        crop_cover = (
        ee.ImageCollection(‘USDA/NASS/CDL’).filter(ee.Filter.calendarRange(2021, 2021, “year”))
        .first()
        .select(“cropland”)
        .eq(1)
        )
        # reduce resolution of 30m CDL to 500m MODIS
        crop_cover = crop_cover.reduceResolution(
        reducer=ee.Reducer.mean(), maxPixels=512, bestEffort=True
        ).reproject(
        crs=modis_projection,
        scale=modis_projection.nominalScale()
        # Filter out pixels that are less than threshold fracture of the pixel.
        ).gt(threshold)

    • Thank you for these great instructions! I’m working with a similar worldwide dataset, but I’ve only been able to successfully aggregate data for the first image in the dataset. How were you able to aggregate data for the entire world in your code?

  3. i have to calculate are of each land cover for year 2000, 2005, 2010,2015,2020 and 2023 can you plz give code for shape file

  4. Thank you for the resource. I would like to know opposite direction. If I want to convert a image of spatial resolution of 1000m (e.g., GPWv4) to a 20m resolution image then what would be the function? I want to do this to overlay the population image over the SAR-1 flooded image to count the number of population exposed due to flood event. Since SAR-1 has a 20m spatial resolution so I want to resample the population image spatial resolution to 20m. Thanks in advance for your suggestion.

    • This would be as simple as specifying a 20m scale in the analysis for satellite imagery. GEE will automatically upsample (divide large pixels into smaller pixels of same value) and use those for analysis. This becomes tricky with population raster as you have to divide (disaggregate) the pixels into smaller pixels with fractional value. I can’t think of a straightforward way of doing this in GEE. Sorry.

  5. Thank you for the really helpful post!! I have a problem running the code, hope you can help out! After following the steps you’ve listed, using .reduceResolution() and .reproject(), I got this particular error: Reprojection output too large (25600×25600 pixels). (Error code: 3).

    I had used a 30m resolution raster and wanted to aggregate it to a much, much larger 10km scale. Would the problem lie in the drastic rescaling?

    • That is simply a very large change in resolution. Do it in steps as outlined in the post under the title ‘AGGREGATION TO VERY LARGE PIXELS’ and Export. Do not print and add the result to the map as it will timeout with the error. Export it and import the result. If that fails, export in steps. First export form 30m to 1000m and then 1000m to 10000m.

  6. Hello, thank you for your previous help. I have now been able to aggregate the data to a larger projection for the entire world, but the population estimates for entire cities (using similar code to the stats function in this article) are considerably larger for the lower resolution aggregates than with the original higher resolution. In your code, the difference is much more modest. Is this a case of raster generalization, and is there a way to correct this?

    • Difficult to say without seeing the code. The results should be the same. Any difference will be due to using incorrect scale or bug.

  7. Hi! Thank you so much for this informative blog post! I have the task of a reduced resolution for the mean of an image collection by region, but I am not working with population data but EVI and Albedo imagery. MODIS LST is native 1km, but the modeled EVI/Albedo is 500m. In particular, my question is what is more preferable: a) take the mean of the image collection, reduce the resolution of this mean image, and then reduce regions or b) reduce the resolution of each image in the collection, take the mean of this new collection, and finally reduce regions? Why would one be preferred over the other, or is it not? I appreciate any insight into this quandary.

    • Computing the mean is easy and fast. Reducing resulotion is slow and expensive. So if you take the mean first, GEE has to do reduceResolution on only 1 image compared to every image in the collection. So that would be preferred.

Leave a Reply to PratyushCancel reply