Calculating Shared Border Lengths Between Polygons

In a previous post, I showed how to use the aggregate function to find neighbor polygons using QGIS. Using aggregate functions on the same layer allows us to easily do geoprocessing operations between features of a layer. This is very useful in many analysis that would typically require writing custom python scripts.

Here I demonstrate another powerful function array_foreach that allows one to iterate over other features in QGIS expressions – enabling even more powerful analysis by writing just a single expression.

Here’s the problem statement. Given a polygon layer – such as US States – calculate what is the length of shared border between each polygon.

This will first require us to find out which states share border with each other and then calculate length of their intersection. For a pair of polygons – this can be done manually or by putting them in separate layers and intersecting them. But since we have 50 states – doing it for each pair of states manually is not possible. But with help of the QGIS expression engine, it can be done in using built-in functions without any custom code.

If you want to follow along, download the states shapefile cb_2018_us_state_500k.zip provided by the United States Census Bureau.

Locate the downloaded file in the Browser panel. Expand it and drag the cb_2018_us_state_500k.shp to the canvas. First we need to prepare the layer. Since the exercise involves calculating lengths, the layer needs to be in a projected CRS. (While QGIS can perform ellipsoidal length calculations for layers in geographic CRSs, the length function – which we need – can only uses layer CRS for calculation). From Processing Toolbox, search and locate the ‘Reproject layer’ algorithm.

Set the Target CRS to ‘EPSG:5070 NAD83 Conus Albers’. This is a general purpose projection with minimal distortions in the continental united states with meters as units. Once the processing finishes and the new ‘Reprojected’ layer loads, change the project CRS also to EPSG:5070. Next, search and locate the ‘Field Calculator’ algorithm.

We will now add a new string field called ‘neighbors’ which will have a comma-separated list of all neighboring states. The state code is contained in the column STUSPS. So we can use the aggregate function array_agg expression to find all intersecting features and extract the state code. The resulting list has the state itself since the geometry intersects with itself. We use the array_remove_all function to remove the code of the state being evaluated from the results.

array_to_string(array_remove_all(aggregate(
layer:= 'Reprojected',
aggregate:='array_agg',
expression:=STUSPS,
filter:=intersects($geometry, geometry(@parent))
), "STUSPS"))

A new layer ‘Calculated’ will load once the processing finishes. This layer now has a ‘neighbor’ column which has a list of all neighboring states for each feature.

Now we are ready to calculate the border lengths. The idea is to intersect each feature with all the neighboring features and calculate the length of the intersection. But the features are polygons and since they just touch each other, and the intersection of them will be empty. So the trick is to first convert the polygon features to polylines. The intersection of the 2 neighboring polyline features will be a polyline representing the border. You may be tempted to use the ‘Polygons to lines’ algorithm, but this algorithm assigns the shared boundary between the polygons to only one of the resulting features. The correct algorithm is ‘Boundary’ which creates the topological boundary of the polygon.

Once you have the ‘Boundary’ layer, we can apply our expression to compute the border lengths using the ‘Field Calculator’ algorithm.

We will add a new string field ‘lengths’ which will have length in miles for each neighboring state. The key function here is array_foreach. This function takes a list and applies an expression to each element and returns a list with computed values. Each member of the list can be accessed using @element. The expression can be broken down into following logical parts

  • Step 1: Convert the comma-separated list of neighbors to a list (function string_to_array)
  • Step 2: For each neighboring state code, find the corresponding feature geometry (function get_feature and geometry)
  • Step 3: Find the intersecting geometry between the feature and the neighbor (function intersection)
  • Step 4: Compute the length of the intersecting geometry. (function length)
  • Step 5: The length is computed in the layer CRS units which are meters, so we convert it to miles and round it down to the nearest integer (function floor)
  • Step 6: Convert the resulting list to a comma-separated string to be saved to the field. (function array_to_string)

The final expression is below

array_to_string(array_foreach(string_to_array( "neighbors" ),
floor(0.000621371*length(intersection(geometry(get_feature('Boundary', 'STUSPS', trim(@element))), $geometry)))))

The resulting ‘Calculated’ layer will have a field ‘length’ which will have length of the shared border for each neighbor.

A quick check on the results show they make sense. California shares it borders with Nevada, Arizona and Oregon with respective borders being 613 mi, 235 mi, 215 mi long. These match up quite well with lengths recorded in other sources.

Length of Complex Curves

From this dataset, we derive that the longest state border in the United States is between Texas and Oklahoma at 922 miles. This is much higher than other estimates, possibly due to the very complex shape of the red river that forms the border in the census bureau shapefile. I could not find any official source of the border length, but if you know any – please let me know in the comments. The length of curved lines such as these can also vary widely depending on the scale of measurement. Read more at Ireland’s Fractal Coastline

17 Comments

Leave a Comment

    • thanks for this outstanding blog. Just a quick question why the result of the length in mil instead of meter as the CRS.

  1. Hey, thanks for this, its really useful. I have an issue, though, the final step only gives me the length if there is only one neighboring region. Otherwise I get a zero. :S

    Also, how would you adapt this methodology to calculate the coastal length of all coastal regions?Something along the lines of: Total boundary length minus the sum of the neighboring region length (what you have already done).

    • The reason for your error would be if you chose ‘Integer’ as the field type so it fails when there is a comma separated string. Make sure you are adding a field of type ‘String’.

      For coastline, you can definitely do the calculation as you mentioned. Add another field that is perimeter – sum of neighbors.

      To get sum of neighbors from the array, you will have to use the eval() function. See the trick I show in this video https://youtu.be/IXPCec8vgLA?t=866

      • I made sure I selected ‘String’. That is not the issue. I am doing this for Europe, and I think the issue is with selecting the CRS. I’m having trouble working out which is the source and which is my target CRS when I reproject…

        Thanks for the link to the video.

  2. […] We now have a list of neighbors for each station in the layer stations_with_neighbors. To apply the spatial homogeneity test, we should know the distances to each station. We can now use the array_foreach() function to add a new field for every station with the distance to each of its neighbors. I have outlined a similar technique for measure border lengths in detail in my post Calculating Shared Border Lengths Between Polygons. […]

  3. .Hi. This is really gre3at stuff – I’d never get close to this result without the above.

    I am at the stage of calculating the boundary lengths – all previous has worked.

    But what I am getting in the ‘lengths’ field is just a list of commas with no values.

    This is my version of your code:

    array_to_string(

    array_foreach(

    string_to_array(

    “Neighbours”),

    floor(

    length(

    intersection(

    geometry(

    get_feature(

    ‘Boundary’, ‘$ID’,

    trim(@element

    ))),

    $geometry

    )))))

    Field $ID is a string(9)
    Field ID is a qlonglong int8
    … but contain the same ID data.

    Are you able to advise, please?

    Thank you

    Bruce Mitchell

      • Thank you Ujaval, I agree with ‘ID’ instead of ‘$ID’. However, I am finding that while your method works when using the Processing Toolbox, when I try exactly the same sequence using the Graphical Modeller, the lengths are not calculated.

        I copied the code that worked from the Processing Toolbox window into the Graphical Modeller, only to find that it no longer did.

        Output preview in Processing Toolbox showed ‘1306,944,639’
        Output preview in Graphical Modeller showed ‘,,’

        As far as I can see, the process is IDENTICAL. However, when done this way, you only get the comma separated list, without any measurements.

        Is this a bug in QGIS, or am I missing something?

        The code is:

        array_to_string(array_foreach(string_to_array(“Neighbours”),
        floor(length(intersection( geometry
        (get_feature(‘Boundary’, ‘ID’ ,@element)), $geometry )))))

        Occasionally it works in Graphical Modeller and I get the lengths. Then, if I run it again without changing anything, it doesn’t and I’m back to the comma separated empty list.

        Might this have something to do with the output from the two stages (boundary and Field Calc?) Have tried Temp file, SHP and Geopackage.

        Many thanks,

        Bruce

      • Don’t worry about output preview because in modeler you do not have access to attributes of previous layer (which will be computed when model runs). I have successfully used it from processing modeler. Aggregate function doesn’t work in modeler (due to how intermediate layers are computed) but array_foreach works fine consistently for me. Your will have to change the codes to refer to the layer’s using variable syntax. See the last section of this post for details https://spatialthoughts.com/2020/11/26/spatial-homogeneity-testing-qgis/

  4. Thank you for your very quick reply. Just a quckie forbefore I get into applying your suggestion: I notice your first screenshot in the BUILD A PROCESSING MODEL section is titled Model Designer, whereas I have been using Processing Modeller (QGIS 3.10.3) . Has Model Designer taken over from Processing Modeller in a more recent version, or is it another thing altogether, please?

    Many thanks,

      • I am making steps towards my ultimate goal, but ought to check that that goal is achievable. I hope you will be able to advise – you appear to be eminently well qualified to do so 🙂

        I have 110,000 small areal units for which I have the centroids only. These fit into 1,400 districts. I want to generate Thiessen Polygons for the small areal units, and to clip these to the larger districts.

        I have done all the above, and broken all multiparts into singleparts. I have identified neighbours and shared boundary lengths for all singleparts.

        Next, I intend to dissolve …

        each Thiessen singlepart fragment that does not contain a small areal units centroid
        into
        the neighbouring Thiessen singlepart fragments that DOES contain a small areal units centroid
        AND
        with which it shares the longest boundary.

        Can you advise please whether this is possible? It seems so logically.

        ‘How’ isn’t yet an issue – I just want to be reassured that the path is possible, so I am just asking for a simple one-liner response..

        Many thanks.

        Bruce

      • To be clear, the fragments without the Centroid would only be at the edges of the districts, right?

        If that’s the case, you can add a new column with the id of neighbour with longest boundary if there is no Centroid inside, otherwise just add the id of the polygon itself (use if condition). After that dissolve by the newly added column and you are done.

      • Thank you very much, Ujaval. I’m looking into that now. Yes, all fragments without centroids are on periphery of district polygons.

        Kind regards,

        Bruce

  5. Hi!

    First off: Thank you SO much for your post. I was so happy when I found it I did a little dance. 🙂

    I don’t work much with QGIS so this is all very new to me but your explanations are fantastic. I downloaded your dataset and everything worked out perfectly.

    However, when I tried it with my own dataset, I ran into a few problems, and I was wondering if you coud point me in the right direction to solve them?

    1 – I can calculate the neighbouring polygons (yay!) but when I do, two of my polygons suddenly vanish. They are not in the “calculated” layer. Do you happen to know why?

    2 – When I try to calculate the final intersection lenghts, I get a combination of “,0,,0,0,,” instead of actual values. I made sure to select “string” as a field type, and to pick the right CRS. Where might my problme lie? (I literally only changed the field name of the ID and didn’t touch anything else in the code.)

    and

    3 – Is there any way to filter out neighbors on the basis of another layer? In my case, I have a separate layer that shows water courses as lines. I want all my terrestrial units to show up as neighbours of each other – unless there is a water course separating them. (The geometries are inputs for a model.)

    I would be INCREDIBLY grateful for any and all help on this. In any case – this entire site is AMAZING: you’re already a life-saver for this desperately tired (and desperate) grad student. 🙂

    All the best,

    Christina

    • Hi Christina – glad the post turned out to be useful for you.

      Regarding 1) it is likely that your polygons have invalid geometry. Run the “Fix Geometries” algorithm and try running the calculations on the result.

      2) Could be an error in the Expression. Hard to say without looking at your dataset and expression. If you can, share the data or a sample at ujaval@spatialthoughts.com

      3) Maybe first convert the water courses layer to polygons (Lines to Polygon algorithm), then run Difference algorithm between your terrestrial polygons and the water polygons – giving you a geometry of polygons which will work as expected.

      • Hi!

        Thank you so much for your quick response – I’m honestly blown away. 🙂

        I tried to run the fix geometries option that you mentioned, but without luck there. I’ll send you a quick email in a moment.

        Thank you so so much for taking a look! Going by how amazingly proficient you are, you can probably figure out in 30 seconds flat what I’ve spent four hours trying to puzzle out.

        You’re the best!

        Christina

Leave a Reply to ujaval Cancel reply