In a previous post, I showed how to use the aggregate function to find neighbor polygons using QGIS. Using aggregate functions on the same layer allows us to easily do geoprocessing operations between features of a layer. This is very useful in many analysis that would typically require writing custom python scripts.

Here I demonstrate another powerful function `array_foreach`

that allows one to iterate over other features in QGIS expressions – enabling even more powerful analysis by writing just a single expression.

Here’s the problem statement. Given a polygon layer – such as US States – calculate what is the length of shared border between each polygon.

This will first require us to find out which states share border with each other and then calculate length of their intersection. For a pair of polygons – this can be done manually or by putting them in separate layers and intersecting them. But since we have 50 states – doing it for each pair of states manually is not possible. But with help of the QGIS expression engine, it can be done in using built-in functions without any custom code.

If you want to follow along, download the states shapefile cb_2018_us_state_500k.zip provided by the United States Census Bureau.

Locate the downloaded file in the Browser panel. Expand it and drag the * cb_2018_us_state_500k.shp* to the canvas. First we need to prepare the layer. Since the exercise involves calculating lengths, the layer needs to be in a projected CRS. (While QGIS can perform ellipsoidal length calculations for layers in geographic CRSs, the

`length`

function – which we need – can only uses layer CRS for calculation). From Processing Toolbox, search and locate the ‘Reproject layer’ algorithm. Set the Target CRS to ‘EPSG:5070 NAD83 Conus Albers’. This is a general purpose projection with minimal distortions in the continental united states with meters as units. Once the processing finishes and the new ‘Reprojected’ layer loads, change the project CRS also to EPSG:5070. Next, search and locate the ‘Field Calculator’ algorithm.

We will now add a new string field called ‘neighbors’ which will have a comma-separated list of all neighboring states. The state code is contained in the column STUSPS. So we can use the aggregate function `array_agg`

expression to find all intersecting features and extract the state code. The resulting list has the state itself since the geometry intersects with itself. We use the `array_remove_all`

function to remove the code of the state being evaluated from the results.

```
array_to_string(array_remove_all(aggregate(
layer:= 'Reprojected',
aggregate:='array_agg',
expression:=STUSPS,
filter:=intersects($geometry, geometry(@parent))
), "STUSPS"))
```

A new layer ‘Calculated’ will load once the processing finishes. This layer now has a ‘neighbor’ column which has a list of all neighboring states for each feature.

Now we are ready to calculate the border lengths. The idea is to intersect each feature with all the neighboring features and calculate the length of the intersection. But the features are polygons and since they just touch each other, and the intersection of them will be empty. So the trick is to first convert the polygon features to polylines. The intersection of the 2 neighboring polyline features will be a polyline representing the border. You may be tempted to use the ‘Polygons to lines’ algorithm, but this algorithm assigns the shared boundary between the polygons to only one of the resulting features. The correct algorithm is ‘Boundary’ which creates the topological boundary of the polygon.

Once you have the ‘Boundary’ layer, we can apply our expression to compute the border lengths using the ‘Field Calculator’ algorithm.

We will add a new string field ‘lengths’ which will have length in miles for each neighboring state. The key function here is `array_foreach`

. This function takes a list and applies an expression to each element and returns a list with computed values. Each member of the list can be accessed using `@element`

. The expression can be broken down into following logical parts

- Step 1: Convert the comma-separated list of neighbors to a list (function
`string_to_array`

) - Step 2: For each neighboring state code, find the corresponding feature geometry (function
`get_feature`

and`geometry`

) - Step 3: Find the intersecting geometry between the feature and the neighbor (function
`intersection`

) - Step 4: Compute the length of the intersecting geometry. (function
`length`

) - Step 5: The length is computed in the layer CRS units which are meters, so we convert it to miles and round it down to the nearest integer (function
`floor`

) - Step 6: Convert the resulting list to a comma-separated string to be saved to the field. (function
`array_to_string`

)

The final expression is below

```
array_to_string(array_foreach(string_to_array( "neighbors" ),
floor(0.000621371*length(intersection(geometry(get_feature('Boundary', 'STUSPS', trim(@element))), $geometry)))))
```

The resulting ‘Calculated’ layer will have a field ‘length’ which will have length of the shared border for each neighbor.

A quick check on the results show they make sense. California shares it borders with Nevada, Arizona and Oregon with respective borders being 613 mi, 235 mi, 215 mi long. These match up quite well with lengths recorded in other sources. The longest state border in the United States is between Texas and Oklahoma at 922 miles. This is much higher than other estimates, possibly due to the very complex shape of the red river that forms the border in the census bureau shapefile. I could not find any official source of the border length, but if you know any – please let me know in the comments.

[…] https://spatialthoughts.com/2020/04/08/calculating-shared-border-lengths/ […]

thanks for this outstanding blog. Just a quick question why the result of the length in mil instead of meter as the CRS.

Lengths are in meters. I am multiplying it by 0.000621371 to convert to miles.

Hey, thanks for this, its really useful. I have an issue, though, the final step only gives me the length if there is only one neighboring region. Otherwise I get a zero. :S

Also, how would you adapt this methodology to calculate the coastal length of all coastal regions?Something along the lines of: Total boundary length minus the sum of the neighboring region length (what you have already done).

The reason for your error would be if you chose ‘Integer’ as the field type so it fails when there is a comma separated string. Make sure you are adding a field of type ‘String’.

For coastline, you can definitely do the calculation as you mentioned. Add another field that is perimeter – sum of neighbors.

To get sum of neighbors from the array, you will have to use the eval() function. See the trick I show in this video https://youtu.be/IXPCec8vgLA?t=866

I made sure I selected ‘String’. That is not the issue. I am doing this for Europe, and I think the issue is with selecting the CRS. I’m having trouble working out which is the source and which is my target CRS when I reproject…

Thanks for the link to the video.

[…] We now have a list of neighbors for each station in the layer stations_with_neighbors. To apply the spatial homogeneity test, we should know the distances to each station. We can now use the array_foreach() function to add a new field for every station with the distance to each of its neighbors. I have outlined a similar technique for measure border lengths in detail in my post Calculating Shared Border Lengths Between Polygons. […]

[…] Then identify the neighbours, and the shared boundary lengths as per: https://spatialthoughts.com/2020/04/08/calculating-shared-border-lengths/ […]

.Hi. This is really gre3at stuff – I’d never get close to this result without the above.

I am at the stage of calculating the boundary lengths – all previous has worked.

But what I am getting in the ‘lengths’ field is just a list of commas with no values.

This is my version of your code:

array_to_string(

array_foreach(

string_to_array(

“Neighbours”),

floor(

length(

intersection(

geometry(

get_feature(

‘Boundary’, ‘$ID’,

trim(@element

))),

$geometry

)))))

Field $ID is a string(9)

Field ID is a qlonglong int8

… but contain the same ID data.

Are you able to advise, please?

Thank you

Bruce Mitchell

Make it ‘ID’ instead of ‘$ID’

get_feature(‘Boundary’, ‘$ID’, …

Thank you Ujaval, I agree with ‘ID’ instead of ‘$ID’. However, I am finding that while your method works when using the Processing Toolbox, when I try exactly the same sequence using the Graphical Modeller, the lengths are not calculated.

I copied the code that worked from the Processing Toolbox window into the Graphical Modeller, only to find that it no longer did.

Output preview in Processing Toolbox showed ‘1306,944,639’

Output preview in Graphical Modeller showed ‘,,’

As far as I can see, the process is IDENTICAL. However, when done this way, you only get the comma separated list, without any measurements.

Is this a bug in QGIS, or am I missing something?

The code is:

array_to_string(array_foreach(string_to_array(“Neighbours”),

floor(length(intersection( geometry

(get_feature(‘Boundary’, ‘ID’ ,@element)), $geometry )))))

Occasionally it works in Graphical Modeller and I get the lengths. Then, if I run it again without changing anything, it doesn’t and I’m back to the comma separated empty list.

Might this have something to do with the output from the two stages (boundary and Field Calc?) Have tried Temp file, SHP and Geopackage.

Many thanks,

Bruce

Don’t worry about output preview because in modeler you do not have access to attributes of previous layer (which will be computed when model runs). I have successfully used it from processing modeler. Aggregate function doesn’t work in modeler (due to how intermediate layers are computed) but array_foreach works fine consistently for me. Your will have to change the codes to refer to the layer’s using variable syntax. See the last section of this post for details https://spatialthoughts.com/2020/11/26/spatial-homogeneity-testing-qgis/

Thank you for your very quick reply. Just a quckie forbefore I get into applying your suggestion: I notice your first screenshot in the BUILD A PROCESSING MODEL section is titled Model Designer, whereas I have been using Processing Modeller (QGIS 3.10.3) . Has Model Designer taken over from Processing Modeller in a more recent version, or is it another thing altogether, please?

Many thanks,

Both are the same.

I am making steps towards my ultimate goal, but ought to check that that goal is achievable. I hope you will be able to advise – you appear to be eminently well qualified to do so 🙂

I have 110,000 small areal units for which I have the centroids only. These fit into 1,400 districts. I want to generate Thiessen Polygons for the small areal units, and to clip these to the larger districts.

I have done all the above, and broken all multiparts into singleparts. I have identified neighbours and shared boundary lengths for all singleparts.

Next, I intend to dissolve …

each Thiessen singlepart fragment that does not contain a small areal units centroid

into

the neighbouring Thiessen singlepart fragments that DOES contain a small areal units centroid

AND

with which it shares the longest boundary.

Can you advise please whether this is possible? It seems so logically.

‘How’ isn’t yet an issue – I just want to be reassured that the path is possible, so I am just asking for a simple one-liner response..

Many thanks.

Bruce

To be clear, the fragments without the Centroid would only be at the edges of the districts, right?

If that’s the case, you can add a new column with the id of neighbour with longest boundary if there is no Centroid inside, otherwise just add the id of the polygon itself (use if condition). After that dissolve by the newly added column and you are done.

Thank you very much, Ujaval. I’m looking into that now. Yes, all fragments without centroids are on periphery of district polygons.

Kind regards,

Bruce