Summary Aggregate and Spatial Filters in QGIS

QGIS expression engine has a powerful a summary aggregate function that can do spatial joins on the fly. This enables some very interesting uses.

One such use is to enable faster and more accurate data editing. For example, when you are digitizing a new feature and want to auto-populate a field based on its relationship with another layer, or want to restrict user input based on a spatial query.

To demonstrate this feature, I downloaded the land parcels and related data layers from City of San Francisco Public Data Portal.

Problem 1: Auto-populate a field in the Parcels layer from the Zoning layer

The city is divided into multiple non-overlapping zoning districts. Rather than entering the zone data manually, we can setup the attribute form to populate it automatically. Open the Layer Properties dialog for the parcels layer and switch to the Attribute Forms tab. Select the field that you want to auto-populate and un-check the Editable button. We will now enter an expression using the aggregate function as the Default value.

Enter the following expression. You can use any of the geometry functions (intersects, within etc.) to setup the filter.

aggregate(
 layer:= 'zoning',
 aggregate:='concatenate',
 expression:=zoning_sim,
 concatenator:=', ',
 filter:=contains($geometry, geometry(@parent))
 )

This expression means that we want to query the zoning layer and fetch the value of the zoning_sim field. The aggregate will be calculated using only features which pass the filter criteria defined in the filter parameter. Here we are defining a spatial filter to get only the feature that contains the polygon we just digitized. Here geometry(@parent) refers to the digitized feature and $geometry refers to the geometry of features from the zoning layer.

Now you can start digitizing. As you add polygons, the value of the zoning_sim will be fetched from the intersecting zoning layer. This is done on-the-fly and can save a lot of effort in manually entering the correct value.

(Click the image to see hi-res version)

Problem 2: Restrict the choice of street name to streets close to the digitized parcel

We have another field called street in the parcel layer that contains the frontage street name. It will be useful to get the canonical street names from a street layer rather than typing it manually. Even better would be to allow the user to pick from the streets that are close to the lot.

Open the Layer Properties dialog for the parcels layer and switch to the Attribute Forms tab. Select the street name field and choose the Value Relation widget. We can setup this field to lookup the names from the street attribute in the streets layer. The trick is to enter a spatial filter expression to select the roads that are close to the digitized geometry. You can use an expression such as below.

intersects($geometry, buffer(@current_geometry, 0.0005)) 

Now as you digitize the parcels, nearby street names will be populated in the drop-down box and presented to the user. There are duplicates because of multiple line segments of the same street are present within the search distance.


(Click the image to see hi-res version)

Value Relation widget is pretty cool. You can check out this post by Randal Hale to see how to do drill-down forms using them.

You can learn more about aggregate functions in the QGIS Documentation. Special thanks to Oto Kaláb whose answer on gis.se inspired this post.

Update: Lene Fischer asked me on twitter how can one write an expression to pick up an attribute from the feature with the largest area among all intersecting features. QGIS 3.8 added support for an additional filter operation called order_by in aggregate functions. Using this allows us to pick an attribute from the feature with the largest area among all intersecting feature.

array_last(aggregate(
 layer:= 'zoning',
 aggregate:='array_agg',
 expression:=zoning_sim,
 filter:=intersects($geometry, geometry(@parent)),
 order_by:=area(intersection($geometry, geometry(@parent)))
 ))

Check out my other posts on aggregate() expressions to see other applications of this very powerful function.

I also did a talk explaining aggregate functions in depth. Check out the video below for more examples.

35 Comments

Leave a Comment

  1. Hi Ujaval, thanks a lot for sharing this. I will use it in my next “qgis for experienced users” workshop!

  2. Could you write a little about how your espressão (code) works that picks up an attribute of the resource with the largest area among all the intersecting resources? I can not deploy …
    Your work is incredible, congratulations.

    • See the update2 which has a simpler expression and is easier to understand. The sort_by key helps sort the filtered features according to the area of intersection and array_last picks the value with the highest intersection area.

  3. Fabulous. Thanks for sharing. I don’t know if it would be possible with this function to capture the area superimposed on a polygon in another layer?

  4. Hi ujaval,
    Good we solve the second problem using the aggregate function?

    Greatings Thomas

  5. I have an address dataset that I want to autopopulate the address number(stnum) then the street name9label) then the emergency service number (ESN) for the full address when I set up the value relation widget all my sample valures check out so I know the syntax is correct i.e 123 wilson rd 002 but when I add a new address point the esn populates the intersecting layer correctly then I add the address number and select the closest road similar to your street example, but the full address doesn’t populate 123 wilson rd 002 the address field is blank why is that what could I be doing wrong?

    • How are you determining the address? Usually address numbers are interpolated and there are geocoding methods to get those. Street should work though. Can you share your expression and maybe a screenshot showing your setup?

      • I have to manually input the address but if I could find a reverse geocode that worked in the field without an internet connection that would be great. the esn auto populates from an intersecting layer, and I can do the same with the zipcode and city fields. But I don’t know where I am going wrong. How do I attach a screenshot to this post?

  6. Hello Sir,
    Thank you for sharing very useful reference.
    I have a cadastral parcel of polygon features. How can i write the expression to select all the isolated polygons from the same feature with following conditions:
    1. Must be greater than 10 m away from the nearest polygon certain area limit.
    2. selected polygon must have area smaller than 100 sq m.

    (Choki)
    From Bhutan.

      • My bad. This will not work. You will need an aggregate expression. Testing on a dataset, the following expression will work

        Use ‘Extract by Expression’ algorithm. Make sure your layer is in a Projected CRS with meters as units. This expression tests whether the polygon only intersects with itself and no other polygon within 10 meters and its area is less than 100 sqm

        aggregate(
        layer:=@layer,
        aggregate:=’count’,
        expression:=$id,
        filter:=intersects($geometry, buffer(geometry(@parent), 10))) = 1 and $area < 100

  7. It is really interesting! Can I use aggregate also between the polygon and line layers? To define if the line has intersect with polygon? How many intersections and based of the count I classify my line layer. I tried to followed your suggestion but not succesful!

    • I think you can use the new overlay_intersect() function in QGIS 3.16. That will tell you if the line intersects with any of the polygons.

  8. Hello ujaval, thanks for your interesting description. The attributes that I need for the new polygon layer are in a point layer. What is the expression (point within polygon)?

  9. ..with this expression it works.

    aggregate(
    layer:= ‘GWR_Kriechenwil’,
    aggregate:=’concatenate’,
    expression:=DEINR,
    concatenator:=’, ‘,
    filter:= within($geometry, geometry(@parent))
    )

    However, only if the attribute is available as a “string”.

  10. Perfect! Now the numerical values ​​are also written. Thank you very much, you are my hero.

  11. Hey there. This post is gold. Helped me a lot. But i have one final question:
    You found a solution for ordering intersecting polygons. Is it also possible with lines? Seems like overlay_intersects has no order by.

  12. Thank you for sharing, it’s very helpful. Would you mind to share summary aggregate for SQL
    especially Postgresql ?

  13. What if you want to perform a conditional multi conditional aggregate arrangement.. Let’s say you want a layer to auto calculate values of more than one features that intersect with it

  14. Hi Ujaval, Is it possible to show us, how to take some specific attribute of a point layer into the line layers? (I want to feed the point attribute data to my start and end point of the line layer, creating two columns such as “From” and “To”)
    I am searching for this sort of expression or way long time, but unfortunately I did not find one yet, what you showed us in this example is very helpful, but i am unable to get the expression for my need.

    • There is a tool in the Processing Toolbox called ‘Points to Path’. That should do what you need without need of a custom expression.

  15. Hi Ujaval, great post. I`m also following you in Twitter. I have a question with aggregate. Working with a layer (layer A) i need to retrieve the value of an attribute of the selected feature in a second layer (layer B). i use that to filter features in layer A which match this value. I tried using the aggregate function but i need to pass the correct expression in filter section, but i´m still stuck with it.

    • Hi Juan – if the field in Layer A is called “A” and field in Layer B is called “B” you can try something like "B" = attribute(@parent, 'A')

      Let me know if that helped.

Leave a Reply