In a previous post, I showed how to use the aggregate function to find neighbor polygons using QGIS. Using aggregate functions on the same layer allows us to easily do geoprocessing operations between features of a layer. This is very useful in many analysis that would typically require writing custom python scripts.
Here I demonstrate another powerful function
array_foreach that allows one to iterate over other features in QGIS expressions – enabling even more powerful analysis by writing just a single expression.
Here’s the problem statement. Given a polygon layer – such as US States – calculate what is the length of shared border between each polygon.
This will first require us to find out which states share border with each other and then calculate length of their intersection. For a pair of polygons – this can be done manually or by putting them in separate layers and intersecting them. But since we have 50 states – doing it for each pair of states manually is not possible. But with help of the QGIS expression engine, it can be done in using built-in functions without any custom code.
If you want to follow along, download the states shapefile cb_2018_us_state_500k.zip provided by the United States Census Bureau.
Locate the downloaded file in the Browser panel. Expand it and drag the cb_2018_us_state_500k.shp to the canvas. First we need to prepare the layer. Since the exercise involves calculating lengths, the layer needs to be in a projected CRS. (While QGIS can perform ellipsoidal length calculations for layers in geographic CRSs, the
length function – which we need – can only uses layer CRS for calculation). From Processing Toolbox, search and locate the ‘Reproject layer’ algorithm.
Set the Target CRS to ‘EPSG:5070 NAD83 Conus Albers’. This is a general purpose projection with minimal distortions in the continental united states with meters as units. Once the processing finishes and the new ‘Reprojected’ layer loads, change the project CRS also to EPSG:5070. Next, search and locate the ‘Field Calculator’ algorithm.
We will now add a new string field called ‘neighbors’ which will have a comma-separated list of all neighboring states. The state code is contained in the column STUSPS. So we can use the aggregate function
array_agg expression to find all intersecting features and extract the state code. The resulting list has the state itself since the geometry intersects with itself. We use the
array_remove_all function to remove the code of the state being evaluated from the results.
array_to_string(array_remove_all(aggregate( layer:= 'Reprojected', aggregate:='array_agg', expression:=STUSPS, filter:=intersects($geometry, geometry(@parent)) ), "STUSPS"))
A new layer ‘Calculated’ will load once the processing finishes. This layer now has a ‘neighbor’ column which has a list of all neighboring states for each feature.
Now we are ready to calculate the border lengths. The idea is to intersect each feature with all the neighboring features and calculate the length of the intersection. But the features are polygons and since they just touch each other, and the intersection of them will be empty. So the trick is to first convert the polygon features to polylines. The intersection of the 2 neighboring polyline features will be a polyline representing the border. You may be tempted to use the ‘Polygons to lines’ algorithm, but this algorithm assigns the shared boundary between the polygons to only one of the resulting features. The correct algorithm is ‘Boundary’ which creates the topological boundary of the polygon.
Once you have the ‘Boundary’ layer, we can apply our expression to compute the border lengths using the ‘Field Calculator’ algorithm.
We will add a new string field ‘lengths’ which will have length in miles for each neighboring state. The key function here is
array_foreach. This function takes a list and applies an expression to each element and returns a list with computed values. Each member of the list can be accessed using
@element. The expression can be broken down into following logical parts
- Step 1: Convert the comma-separated list of neighbors to a list (function
- Step 2: For each neighboring state code, find the corresponding feature geometry (function
- Step 3: Find the intersecting geometry between the feature and the neighbor (function
- Step 4: Compute the length of the intersecting geometry. (function
- Step 5: The length is computed in the layer CRS units which are meters, so we convert it to miles and round it down to the nearest integer (function
- Step 6: Convert the resulting list to a comma-separated string to be saved to the field. (function
The final expression is below
array_to_string(array_foreach(string_to_array( "neighbors" ), floor(0.000621371*length(intersection(geometry(get_feature('Boundary', 'STUSPS', trim(@element))), $geometry)))))
The resulting ‘Calculated’ layer will have a field ‘length’ which will have length of the shared border for each neighbor.
A quick check on the results show they make sense. California shares it borders with Nevada, Arizona and Oregon with respective borders being 613 mi, 235 mi, 215 mi long. These match up quite well with lengths recorded in other sources.
Length of Complex Curves
From this dataset, we derive that the longest state border in the United States is between Texas and Oklahoma at 922 miles. This is much higher than other estimates, possibly due to the very complex shape of the red river that forms the border in the census bureau shapefile. I could not find any official source of the border length, but if you know any – please let me know in the comments. The length of curved lines such as these can also vary widely depending on the scale of measurement. Read more at Ireland’s Fractal Coastline
Check out my other posts on aggregate() expressions to see other applications of this very powerful function.