Find Neighbor Polygons using Summary Aggregate Function in QGIS

Read my previous posts Summary Aggregate and Spatial Filters and Advanced Aggregate Expressions to Automate QA to learn more about the powerful aggregate function.

The aggregate function in QGIS was designed to work with 2 separate input vector layers, but we can also make it work with a single layer. Essentially performing spatial queries for features within the layer.

Consider this problem. In a layer containing zip codes, we want to find all neighboring zip codes for every polygon. This can be done very easily by creating a new field with an expression with the aggregate function – using the layer itself as the parent layer.

Below is the Zip Codes shapefile from the City of Seattle Open Data Portal. Our task is to add another field which contains all neighboring zip codes for each feature. Note the layer name is Zip_Codes

Open the Attribute Table for the layer. Note the field name containing the 5-digit zip code is ZIPCODE . Open Field Calculator.

Create a new field with the following expression. The key here is the spatial filter intersects($geometry, geometry(@parent)) that finds all polygons from the layer that touch the feature being processed. (Note: An earlier version of this post used touches filter instead of intersects)

aggregate(
 layer:= 'Zip_Codes',
 aggregate:='concatenate',
 expression:=ZIPCODE,
 concatenator:=', ',
 filter:=intersects($geometry, geometry(@parent))
 )

Note: If you are not getting accurate results, try the following

  • Add a small buffer to the geometry. Change the filter statement to something like filter:=intersects($geometry, buffer(geometry(@parent), 1)) where 1 is the distance in the units of your layer’s CRS. Change it to an appropriate small number for your CRS units.
  • Instead of aggregating using the same layer, duplicate the layer by right-clicking the layer and selecting ‘Duplicate’. And when running the aggregate function, use the duplicate layer name instead of the original layer.

You will now have a new field that has the neighbors zip codes for every feature in the layer!

There are many type of aggregate that you can use. If you want to count numbering zip codes, you can use the count aggregate instead of concatenate. The following expression would calculate the number of neighboring polygons.

aggregate(
 layer:= 'Zip_Codes',
 aggregate:='count',
 expression:=$id,
 filter:=touches($geometry, geometry(@parent))
 )

23 Comments

Leave a Comment

  1. Hi Ujaval, thanks for sharing this post.

    I have tried your example with the world borders shapefile (https://thematicmapping.org/downloads/world_borders.php) and QGIS returns some strange results (e.g. “Cuba” touches “USA”?!! and other countries that share a common border but they don’t appear in the results).

    Any suggestions to overcome this problem? By the way, I am using QGIS 3.4.7. The code I used can be seen below:

    aggregate(
    layer:= ‘TM_WORLD_BORDERS-0.3′,
    aggregate:=’concatenate’,
    expression:=NAME,
    concatenator:=’, ‘,
    filter:=touches($geometry, geometry(@parent))
    )

    ps1. When I use the count function, it returns zero for all the countries.
    ps2. When I use the Zip Codes shapefile, the same issue appears (e.g. in the zip code 98406, where should be 6 neighbors, but only 3 appear).

    Like

  2. Thanks for the super helpful article!

    For anyone else who might be using a shapefile that has polygons which theoretically *should* touch but which do not actually share points (which is what the `touches` function expects), you can use this trick: first, select all your features, then select Vector > Geoprocessing Tools > Buffer and add a small buffer (e.g. 1m, but adjust as you need). This will cause the polygons to expand in size by 1m, and, hence, overlap. You can then use the aggregation code in the article, but replace the `touches` function with `intersects`, since adjacent features now intersect each other: `filter:=intersects($geometry, geometry(@parent))`.

    NB: this will cause polygons that are “kitty corner” to each other (e.g. Utah and New Mexico, to use the example of US states) to show up as adjacent. For my purposes, this is not a problem; it might be for you.

    Like

  3. Hi Ujaval,
    thanks for this, because it is the closest I was able to find as a solution to my problem, nevertheless it comes up with empty cells.
    So, I have a grid of my study area, and I want to look into the autocorrelation between the neighbouring quadrants. In my case the field code is the id of the quadrants and I’m was expecting a string of respective neighbouring ids, yet this – seemingly ideal – solution of yours results NULL.
    Is there something I miss?
    Or is this not a solution for my problem?

    Like

      • Cheers. It works now!!!
        Though not flawless: in the middle of the grid it sometimes produces 8, sometimes less neighbouring cells. Tried increasing buffer size, did not solved anythin.
        Any suggestion?
        Once again thank you for your help.

        Like

    • I found a solution. Instead of using the aggregate function on the same layer, the process gives correct results if you duplicate the layer and run the computation.
      1. Rename your layer to ‘original’.
      2. Right-click and select ‘Duplicate’. Rename it to ‘copy’.
      3. Add a attribute to the ‘copy’ layer with the following expression. I made a change to remove the id of the polygon from the list of neighbors.

      array_to_string(array_remove_all(string_to_array(aggregate(
      layer:= 'original',
      aggregate:='concatenate',
      expression:=id2,
      concatenator:=',',
      filter:=intersects($geometry, geometry(@parent))
      ), ','), "id2" ), ',')

      See the result at https://drive.google.com/file/d/16_QRQAfN-tMWFrutETGwp2m_o0dHsvKe/view?usp=sharing

      Like

  4. Hello Chinmay! This process is very slow in computing the number neighbouring polygons. I have around 41000 features and it looks like it will take forever to do that!

    Like

  5. Hi there, I am not sure that the method is correct. Looking at your first screenshot, ZIP 98020 code has 3 (or maybe 4) neighbors – 98346, 98177, 98026 (and maybe 98133). However, your formula in the fourth and fifth screenshot show as a result for ZIP 98020 only 2 neighbors. Why is that? I noticed this since in my test data in same cases I get a result of 0 neighbors, while the investigated polygon clearly has a neighbor. Altering the intersect/touch or adding a buffer did not work for me. Any clues?

    Like

    • Hi Mike – you are correct. I think ‘touches’ is very sensitive and works only in cases where polygons actually share the boundary. I have subsequently found that intersects works better. Try replacing touches with intersects and see if it helps. I will edit the post with the same.

      Also the following might work if the polygons do not intersect as well. Buffer the geometry slightly (here 1 is the distance in the layer CRS, so change accordingly to a small number for your layer)

      filter:=intersects($geometry, buffer(geometry(@parent), 1)).

      Like

  6. Hi Ujaval,

    I’m trying to do something very similar but I’m only looking for pairs along a provincial border. Ideally, I would love QGIS to return the neighbours that are only across a specific border. For example, for a specific census division in British Columbia, I want it to return all of the postal codes that are neighbouring it in Alberta. In my data, I have provincial identifiers as well as census division identifiers. Here’s a link to the data if you need to have a look https://www12.statcan.gc.ca/census-recensement/alternative_alternatif.cfm?l=eng&dispext=zip&teng=lcsd000a16a_e.zip&k=%20%20%20%2023264&loc=http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lcsd000a16a_e.zip

    Do you have any advice on how I could accomplish this? I’m still fairly new to QGIS and I’ve been trying out some things but with no luck! I’m certain there is a simple line of code I could use. Thanks for any advice you can give!

    Like

    • Hi Jerome,

      This is doable. In the aggregate function, you have a ‘filter’ expression. There you can add a condition to select features that match a census division. Something like

      aggregate(
      layer:= ‘Zip_Codes’,
      aggregate:=’concatenate’,
      expression:=ZIPCODE,
      concatenator:=’, ‘,
      filter:=intersects($geometry, geometry(@parent)) AND PRNAME = ‘Alberta’
      )

      In the dataset you sent, there are no zipcode column. So I am assuming that is a separate layer. You will have to do a table join to get the census division and zipcodes in the same layer. If you can share the zipcode data, I can give you more detailed instructions.

      Like

      • Hi ujaval,

        I wrote a reply on here about a week ago, but I haven’t seen that comment appear on this webpage. So, if there are two similar replies to this, please ignore this one!

        Anyways, I maybe should have been more clear on what I am trying to accomplish. Basically, I would like to get a string variable for each observation that contains a list of all of that observation’s neighbours that are in another province. So for example, the “neighbours” variable for Toronto (which is far from the provincial border) should be blank. Whereas the “neighbours” variable for Ottawa (which is on the Ontario-Quebec border) should have a list of all of the polygons in Quebec that neighbour it, and none of it’s Ontario neighbours. In other words, if you look at the data I posted above and go to Ottawa (CSDUID = 3506008), the “neighbours” variable for Ottawa should contain “2481017, 2480060, 2482030, 2484005” wheres the “neighbours” variable for Toronto (CSDUID = 3520005) should be empty. I hope this makes sense.

        Just for added clarity, I’m essentially replicating this study (https://escholarship.org/uc/item/86w5m90m) with Canadian data. Here is a map they provided in their paper in case that makes things clearer (https://www.mitpressjournals.org/action/showImage?doi=10.1162/REST_a_00039&iName=master.img-001.jpg&w=361&h=257)

        Also, I tried adding the ” AND PRNAME = ‘Alberta’ ” and I kept getting null results whether I added a buffer or not. I also tried AND PRUID = 10 and also got null results. I’m not sure what the issue is there?!

        Thanks again for all of your help, I really appreciate it!!

        Like

  7. Hi Ujaval,

    Thank you very much for this.

    Do you know if it is possible to do this kind of aggregation but limiting it to some kind of condition. E.g. I want to form aggregate neighbouring blocks but limiting the population inside this new conglomerate to 1000.

    We have a block layer and of course a variable with the population in it.

    Thank you in advance for your response or any workabout around this!!

    Braulio

    Like

  8. Hi Ujuval,
    Thank you for the perfect solution to my problem! Unfortunately, I cant seem to get it to work. I am using the following code as instructed:
    aggregate(
    layer:= ‘ndmMjrt3RimboPgn1m’,
    aggregate:=’concatenate’,
    expression:=gridcode,
    concatenator:=’, ‘,
    filter:=intersects($geometry, geometry(@parent))
    )

    But my output is Null.

    I have tried it with and without a buffer border.

    Here are the data. Thank you for the helpin advance.

    https://drive.google.com/drive/folders/10tp7ePPIP8Uag8UteWGOLKo2O0_MvO0O?usp=sharing

    Kind Regards
    Nedim

    Like

    • Hi Nedim,

      You are using the ‘concatenate’ aggregate to create. a comma separated string. It expectes the inputs to be strings too, but your column is a number. So you can convert it to a string and it will work. Use the function to_string() in the expression and it will take care of the conversion. You may also want to use concatenate_unique which will remove the duplicates for you. Below is the expression that works.

      aggregate(
      layer:= ‘ndmMjrt3RimboPgn’,
      aggregate:=’concatenate_unique’,
      expression:=to_string(gridcode),
      concatenator:=’, ‘,
      filter:=intersects($geometry, geometry(@parent))
      )

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.