Temporal Gap-Filling with Linear Interpolation in GEE

Many applications require replacing missing pixels in an image with an interpolated value from its temporal neighbours. This gap-filling technique is used in several applications, including:

  • Replacing Cloudy Pixels: You may want to fill gap in an image with the best-estimated value from the before and after cloud-free pixel.
  • Estimating Intermediate Values: You can use this technique to compute an image for a previously unknown time-step. If you had population rasters at 2 different years and want to compute a population raster for an intermediate year using pixel-wise linear interpolation.
  • Preparing Data for Regression: All of your independent variables may not be available at the same temporal resolution. You can harmonize various dataset by generating interpolated raster at uniform or fixed time-steps.

Google Earth Engine can be used effectively for gap-filling time-series datasets. While the logic for linear interpolation is fairly straightforward, data preparation for this in GEE can be quite challenging. It involves use of Joins, Masks and Advanced Filters. This post explains the steps with code snippets and builds a fully functioning script that can be applied on any time-series data.

You can also watch my video at Geo for Good 2022 Conference where I covered the same topic.

Overview Of Steps

Interpolating an Image Collection in GEE involves the following steps

  • Adding a time-band to each image
  • Joining the image collection with itself to find images before and after each image
  • Processing the result to create interpolated images and replacing masked pixels with interpolated values.
  • Visualizing/Exporting the results

Example 1: Gap-Filling MAsked NDVI Pixels

We will see how to apply this technique on a Sentinel-2 Time-Series and replace cloudy pixels with interpolated values from images taken before and after it. The image below shows the result of interpolation.

Original vs. Interpolated NDVI Time Series

We start by preparing a time-series for 1-year by filtering the Sentinel-2 Level-2A collection and applying a cloud-masking function. The code below uses a polygon for a farm as a geometry. You may replace it with the geometry for your area of interest.

var geometry = ee.Geometry.Polygon([[
  [82.60642647743225, 27.16350437805251],
  [82.60984897613525, 27.1618529901377],
  [82.61088967323303, 27.163695288375266],
  [82.60757446289062, 27.16517483230927]
]]); 
Map.addLayer(geometry, {color: 'red'}, 'Farm')
Map.centerObject(geometry)
var s2 = ee.ImageCollection("COPERNICUS/S2_SR");
var filtered = s2
  .filter(ee.Filter.date('2019-01-01', '2020-01-01'))
  .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 30))
  .filter(ee.Filter.bounds(geometry))
// Write a function for Cloud masking
function maskCloudAndShadowsSR(image) {
  var cloudProb = image.select('MSK_CLDPRB');
  var snowProb = image.select('MSK_SNWPRB');
  var cloud = cloudProb.lt(5);
  var snow = snowProb.lt(5);
  var scl = image.select('SCL'); 
  var shadow = scl.eq(3); // 3 = cloud shadow
  var cirrus = scl.eq(10); // 10 = cirrus
  // Cloud probability less than 5% or cloud shadow classification
  var mask = (cloud.and(snow)).and(cirrus.neq(1)).and(shadow.neq(1));
  return image.updateMask(mask).divide(10000)
      .select("B.*")
      .copyProperties(image, ["system:time_start"]);
}
var filtered = filtered.map(maskCloudAndShadowsSR)

Add a band containing timestamp to each image. This will be used to do pixel-wise interpolation later. The key here is to mask the timestamp band with the same mask as the image. If your image has more than 1-band, the code below will use the mask from the first band.

var filtered = filtered.map(function(image) {
  var timeImage = image.metadata('system:time_start').rename('timestamp')
  var timeImageMasked = timeImage.updateMask(image.mask().select(0))
  return image.addBands(timeImageMasked)
})

Next we specify the time-window. This will determine how much backward and forward are we willing to look for an unmasked pixel in the time-series. The filters in subsequent steps need the time difference in milliseconds, so we convert days to milliseconds.

var days = 30
var millis = ee.Number(days).multiply(1000*60*60*24)

For each image in the collection, we need to find all images before and after the specified time-window. This is accomplished using Joins. We need to do 2 joins

  • Join 1: Join the collection with itself to find all images before each image
  • Join 2: Join the collection with itself to find all images after each image

We first define the filters needed for the join. Define a maxDifference filter to find all images within the specified days.

var maxDiffFilter = ee.Filter.maxDifference({
  difference: millis,
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})

We need 2 more filters.

  • A lessThanOrEquals filter to find all images after a given image. This will compare the given image’s timestamp against other images’ timestamps
  • A greaterThanOrEquals filter to find all images before a given image. This will compare the given image’s timestamp against other images’ timestamps
var lessEqFilter = ee.Filter.lessThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})
var greaterEqFilter = ee.Filter.greaterThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})

Now we apply the joins. We join the image collection with itself and match before/after images for each image in the collection that meet our criteria.

Join1 is to find all images after each image that are winthin the time-window. We have 2 filters already defined for these two conditions. We combine them and apply the join. This join will find all images after, sorted in descending order. This will gives us resulting images such that closest image is last.

var filter1 = ee.Filter.and(maxDiffFilter, lessEqFilter)

var join1 = ee.Join.saveAll({
  matchesKey: 'after',
  ordering: 'system:time_start',
  ascending: false})
  
var join1Result = join1.apply({
  primary: filtered,
  secondary: filtered,
  condition: filter1
})

Do the Join2 now to match all images within the time-window that come before each image. This join will find all images before, sorted in ascending order. This will gives us images so that closest is last.

var filter2 = ee.Filter.and(maxDiffFilter, greaterEqFilter)

var join2 = ee.Join.saveAll({
  matchesKey: 'before',
  ordering: 'system:time_start',
  ascending: true})
  
var join2Result = join2.apply({
  primary: join1Result,
  secondary: join1Result,
  condition: filter2
})

The collection is now ready for interpolation. We write a function that will be used to interpolate all images. This function takes an image and replaces the masked pixels with the interpolated value from before and after images.

Interpolation Formula
y = y1 + (y2-y1)*((t – t1) / (t2 – t1))
y = interpolated image
y1 = before image
y2 = after image
t = interpolation timestamp
t1 = before image timestamp
t2 = after image timestamp

Tip:

If you wanted a simple average of before and after images, you can set the value of timeRatio to 0.5.

var interpolateImages = function(image) {
  var image = ee.Image(image)
  // We get the list of before and after images from the image property
  // Mosaic the images so we a before and after image with the closest unmasked pixel
  var beforeImages = ee.List(image.get('before'))
  var beforeMosaic = ee.ImageCollection.fromImages(beforeImages).mosaic()
  var afterImages = ee.List(image.get('after'))
  var afterMosaic = ee.ImageCollection.fromImages(afterImages).mosaic()
  // Get image with before and after times
  var t1 = beforeMosaic.select('timestamp').rename('t1')
  var t2 = afterMosaic.select('timestamp').rename('t2')
  var t = image.metadata('system:time_start').rename('t')
  var timeImage = ee.Image.cat([t1, t2, t])
  var timeRatio = timeImage.expression('(t - t1) / (t2 - t1)', {
    't': timeImage.select('t'),
    't1': timeImage.select('t1'),
    't2': timeImage.select('t2'),
  })
  // Compute an image with the interpolated image y
  var interpolated = beforeMosaic
    .add((afterMosaic.subtract(beforeMosaic).multiply(timeRatio)))
  // Replace the masked pixels in the current image with the average value
  var result = image.unmask(interpolated)
  return result.copyProperties(image, ['system:time_start'])
}

Now map() the function to interpolate all images in the collection. The interpolatedCol is a continuous time-series with cloudy pixels replaced by interpolated values from closest cloud-free pixels.

var interpolatedCol = ee.ImageCollection(
  join2Result.map(interpolateImages))

Here’s the full script along with code for computing NDVI and creating an animation showing the gap-filled collection. https://code.earthengine.google.co.in/536d8fc3b031365e6d4f75d38131682e

Example 2: Interpolating Population Rasters

Another useful application of this technique is to generate data at missing time intervals. We will take the GHSL: Global Human Settlement Layers Population Grid at year 2000 and 2015 and interpolate it to create population grids from year 2005 and 2010. Below image shows the results we are trying to achieve.

Original GHSL Population Grids and Interpolated Values

We can follow the same process as the previous example with a small change. The previous example already had images at all the time-steps that we needed and we only had to fill the masked pixels with interpolated values. Here, we need prepare our collection so it has empty (fully masked) images for the years that we want to generate new population grids. The code will then fill all the pixels with interpolated values.

var pop2000 = ee.Image('JRC/GHSL/P2016/POP_GPW_GLOBE_V1/2000')
var pop2015 = ee.Image('JRC/GHSL/P2016/POP_GPW_GLOBE_V1/2015')
var ghsl = ee.ImageCollection.fromImages([pop2000, pop2015])
var yearsToInterpolate = ee.List([2005, 2010]);
var initImages = yearsToInterpolate.map(function(year) {
  var image = ee.Image().rename('population_count').toFloat().set({
    'system:index': ee.Number(year).format(),
    'system:time_start': ee.Date.fromYMD(year, 1, 1). millis(),
  })
  return image
})
var initCol = ee.ImageCollection.fromImages(initImages)
var mergedCol = ghsl.merge(initCol)
var mergedCol = mergedCol.map(function(image) {
  var timeImage = image.metadata('system:time_start').rename('timestamp')
  var timeImageMasked = timeImage.updateMask(image.mask().select(0))
  return image.addBands(timeImageMasked)
})

Next, we need to choose a time-window. We have population images 15 years apart, so we set time-window to be 15 x 365 days.

var days = 15*365
var millis = ee.Number(days).multiply(1000*60*60*24)

Rest of the code for interpolation remains the same. We just need to change the variable name for the collection in join1.

var maxDiffFilter = ee.Filter.maxDifference({
  difference: millis,
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})
var lessEqFilter = ee.Filter.lessThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})
var greaterEqFilter = ee.Filter.greaterThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
})
var filter1 = ee.Filter.and(maxDiffFilter, lessEqFilter)
var join1 = ee.Join.saveAll({
  matchesKey: 'after',
  ordering: 'system:time_start',
  ascending: false})
  
var join1Result = join1.apply({
  primary: mergedCol,
  secondary: mergedCol,
  condition: filter1
})
var filter2 = ee.Filter.and(maxDiffFilter, greaterEqFilter)
var join2 = ee.Join.saveAll({
  matchesKey: 'before',
  ordering: 'system:time_start',
  ascending: true})
  
var join2Result = join2.apply({
  primary: join1Result,
  secondary: join1Result,
  condition: filter2
})
var interpolateImages = function(image) {
  var image = ee.Image(image)
  var beforeImages = ee.List(image.get('before'))
  var beforeMosaic = ee.ImageCollection.fromImages(beforeImages).mosaic()
  var afterImages = ee.List(image.get('after'))
  var afterMosaic = ee.ImageCollection.fromImages(afterImages).mosaic()
  var t1 = beforeMosaic.select('timestamp').rename('t1')
  var t2 = afterMosaic.select('timestamp').rename('t2')
  var t = image.metadata('system:time_start').rename('t')
  var timeImage = ee.Image.cat([t1, t2, t])
  var timeRatio = timeImage.expression('(t - t1) / (t2 - t1)', {
    't': timeImage.select('t'),
    't1': timeImage.select('t1'),
    't2': timeImage.select('t2'),
  })
  var interpolated = beforeMosaic
    .add((afterMosaic.subtract(beforeMosaic).multiply(timeRatio)))
  var result = image.unmask(interpolated)
  return result.copyProperties(image, ['system:time_start'])
}
var interpolatedCol = ee.ImageCollection(join2Result.map(interpolateImages))

The interpolated collection is ready. We can visualize the images by adding each of them to the Map. Here we are using the boundary for City of Johannesburg, South Africa and clipping each image to it.

// Add a 'year' property to each image so we can filter them easily
var finalCol = interpolatedCol.map(function(image) {
  var date = ee.Date(image.get('system:time_start'))
  return image.set('year', date.get('year'))
})
// Clip and Visualize the Images for a City
var johannesburg = ee.FeatureCollection('users/ujavalgandhi/public/COJ_Boundary')
var geometry = johannesburg.geometry()
// Select the population band and clip to the geometry
var pop2000Image = pop2000.select('population_count').clip(geometry)
var pop2015Image = pop2015.select('population_count').clip(geometry)
var pop2005Image = ee.Image(finalCol.filter(ee.Filter.eq('year', 2005)).first())
  .select('population_count').clip(geometry)
var pop2010Image = ee.Image(finalCol.filter(ee.Filter.eq('year', 2010)).first())
  .select('population_count').clip(geometry)
// Visualize on the Map
var palette = ['#fef0d9','#fdcc8a','#fc8d59','#e34a33','#b30000']
var popVisParams = {min:30, max:1500, palette: palette}
Map.centerObject(geometry)
Map.addLayer(pop2000Image, popVisParams, '2000')
Map.addLayer(pop2005Image, popVisParams, '2005')
Map.addLayer(pop2010Image, popVisParams, '2010')
Map.addLayer(pop2015Image, popVisParams, '2015')

You will find the full script for this example at https://code.earthengine.google.co.in/612e5b4dbc08995c9455a5fc7e7b1f55

This was one of the most frequently asked questions by many users and there was no open-source implementation that was readily available to use. I spent a significant amount of time developing this code and writing this post to help researchers apply this to their work. Do leave a comment if you found it useful or have ideas to improve the code.

If you are new to Earth Engine and want to master it, check out my course End-to-End Google Earth Engine.

18 Comments

Leave a Comment

  1. Wow what a rare and valuable piece of code.

    How would I adjust the code of example 2 to create empty (fully masked images) for specific dates where data is missing?
    Something like var monthsToInterpolate = ee.List([20020601,20020701,20030401]);

    And then interpolate using the average of the previous month and sequential month’s reading? Or something like that.

    cant seem to get this right.
    Any insight will be greatly appreciated.

    • Thanks Cidy. Glad you found it useful. Below is a snippet to create images for the specific dates.

      var monthsToInterpolate = ee.List(
      [ee.Date.fromYMD(2002, 6, 1),
      ee.Date.fromYMD(2002, 7, 1),
      ee.Date.fromYMD(2003, 4, 1)])

      var initImages = monthsToInterpolate.map(function(date) {
      var year = ee.Date(date).get(‘year’).format();
      var month = ee.Date(date).get(‘month’).format(‘%02d’);
      var imageId = year.cat(month);
      var image = ee.Image().rename(‘monnthlyimage’).toFloat().set({
      ‘system:index’: imageId,
      ‘system:time_start’: ee.Date(date). millis(),
      })
      return image
      })
      var initCol = ee.ImageCollection.fromImages(initImages)
      print(initCol)

      https://code.earthengine.google.co.in/88ceb72935c3e50e8c28f08f5eb641a4

      You can adjust the system:index format as per your requirement.

  2. Big thanks and have been following you for the last two years and your contribution is huge. Godbless you and your family

  3. Thanks for so much! Your code is very useful for us to apply in many regions of Brazil where the incidence of clouds is high.

  4. Thanks! It is very useful code.
    I have a question related to example1 regarding the first and last images which have some nan values due to the lack of information before or after. How can I handle this problem.

    • The only solution is to increase the time period so there is an unmasked pixel before/after the image within that period. But increasing it too much can lead to filling hte pixel values from a different season. So you’ll have to use an appropriate value for your use case.

      • Thanks for your response, I appreciate that.
        Also, I have another question. Is it normal to be more missing pixels after interpolation of the first image compared to the original image?

  5. Hello! thanks for the tutorial. amazing.
    However, i’m having some kind of problem here. Basically when i select a random image from the filteredcollection it gives me images with null values (holes), where it should be filled, right?
    thanks!!!

    • Likely there were no cloud-free pixels in the range you selected. If you selected 30-day time-period and there were no cloud-free pixels during that time, you will still get holes.

  6. Sir after this Gap filling process my collection is getting heavy i can’t use this collection for classification purposes due to memory size being exceeded what is the solution

  7. Thanks for so much! Your tutorial is amazing and very useful for Me.
    However, I face some problems, after a gap-filling process, my collection is getting heavy. I can’t use this collection for classification purposes due to memory size being exceeded what is the solution? plz guide me

  8. Hello, hope you are doing good,
    thanks for the amazing tutorial.
    i have a question concerning the geospatiel pixel neighborhood interpolation technique, i wanted to use it to fill kandsat 7 gaps dues to the SLC error, but i couldn’t find any code source to do so in the GEE.
    thanks for taking time to read my message.

Leave a Reply to Benazir MeerashaCancel reply