# Spatial Homogeneity Testing of Raingauge Data with Advanced QGIS Expressions

Rainfall is arguably the most frequently measured hydro-meteorological variable. It is a required input for many hydrological applications, such as runoff computation, flood forecasting, and the engineering design of structures. However, raw rainfall data often contains gaps and inconsistent values, so it is important to rigorously validate rain-gauge observations before incorporating them into an analysis.

The World Bank’s National Hydrology Project (NHP) prescribes a set of primary and secondary validation methods in the Manual of Rainfall Data Validation. Of particular interest to me are the spatial methods that aim to identify suspect values by comparing them with values at neighboring stations. This spatial homogeneity test requires complex spatial and statistical data processing that can be quite challenging. I got an opportunity to work on a project that required automating the entire process of identifying and testing suspect stations. I ended up implementing it in QGIS using just Expressions and the Processing Modeler. The whole solution required no custom code and was easily usable by an analyst in the QGIS environment. In this post, I will explain the details of the test and show how you can use similar techniques for your own analysis.

This workflow was presented as a live session at QGIS Open Day. You can watch the recording to understand the concepts and implementation.

# Calculating Shared Border Lengths Between Polygons

In a previous post, I showed how to use the aggregate function to find neighbor polygons in QGIS. Using aggregate functions on the same layer allows us to easily do geoprocessing operations between features of a layer. This is very useful in many analyses that would typically require writing custom Python scripts.

Here I demonstrate another powerful function `array_foreach` that allows one to iterate over other features in QGIS expressions – enabling even more powerful analysis by writing just a single expression.
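To make the idea of iterating over the other features of a layer concrete, here is a minimal sketch in plain Python rather than a QGIS expression. It is not the expression from the post. It assumes a topologically clean layer in which neighboring polygons share identical vertices along their common border, and it represents each polygon as a simple ring of `(x, y)` tuples. The names `ring_edges`, `shared_border_length`, and `border_table` are hypothetical helpers chosen for illustration.

```python
from math import hypot


def ring_edges(ring):
    """Return the set of undirected edges of a ring of (x, y) vertices."""
    edges = set()
    # Pair each vertex with the next one, wrapping around to close the ring.
    for start, end in zip(ring, ring[1:] + ring[:1]):
        if start != end:  # skip the zero-length edge of an already closed ring
            edges.add(frozenset((start, end)))
    return edges


def shared_border_length(ring_a, ring_b):
    """Sum the lengths of edges that appear in both rings.

    Assumption: shared borders use identical vertices in both polygons.
    """
    total = 0.0
    for edge in ring_edges(ring_a) & ring_edges(ring_b):
        (x1, y1), (x2, y2) = tuple(edge)
        total += hypot(x2 - x1, y2 - y1)
    return total


def border_table(polygons):
    """For every polygon, loop over all other polygons and record shared lengths."""
    table = {}
    for name, ring in polygons.items():
        table[name] = {}
        for other_name, other_ring in polygons.items():
            if other_name == name:
                continue
            length = shared_border_length(ring, other_ring)
            if length > 0:
                table[name][other_name] = length
    return table


# Illustrative data: two adjacent unit squares and one isolated square.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(5, 5), (6, 5), (6, 6), (5, 6)],
}
borders = border_table(polygons)
```

In QGIS itself, the geometry functions work on the actual geometries, so the identical-vertex assumption made here would not be needed.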

# Fuzzy Table Joins in QGIS

Table joins are a way to join two separate layers based on a common attribute value. QGIS has a Join Attributes By Field Value algorithm that allows you to perform table joins. A limitation of this algorithm is that the field values must match exactly; if the values differ even slightly, the join will fail. You will often need to join two layers from different sources whose values are similar but do not match exactly. Fortunately, QGIS now has built-in fuzzy string matching functions that can be used, along with the aggregate function, to do a table join based on fuzzy matches.
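The logic of a fuzzy join can be sketched outside QGIS in plain Python. The example below is an illustration under stated assumptions, not the expression used in the post: it uses Levenshtein edit distance as the similarity measure (QGIS exposes a `levenshtein` expression function, among others), picks the closest match for each row, and accepts it only within a threshold. The function names, the `max_distance` parameter, the `right_` prefix, and the sample rows are all hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_join(left_rows, right_rows, field, max_distance=2):
    """Left join: attach the closest right row when it is within max_distance."""
    joined = []
    for left in left_rows:
        best, best_distance = None, None
        for right in right_rows:
            distance = levenshtein(left[field].lower(), right[field].lower())
            if best_distance is None or distance < best_distance:
                best, best_distance = right, distance
        row = dict(left)
        if best is not None and best_distance <= max_distance:
            # Prefix joined fields so they do not overwrite the left row's fields.
            row.update({f"right_{k}": v for k, v in best.items()})
        joined.append(row)
    return joined


# Illustrative data: the same place spelled differently in two sources.
stations = [
    {"name": "Bengaluru", "rain_mm": 12.5},
    {"name": "Hubli", "rain_mm": 3.0},
]
districts = [
    {"name": "Bengalooru", "code": "BLR"},
    {"name": "Mysuru", "code": "MYS"},
]
joined = fuzzy_join(stations, districts, field="name")
```

The threshold is the main design choice: too small and legitimate spelling variants are missed, too large and unrelated names get joined. Rows with no acceptable match are kept without joined fields, mirroring a left join.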

# Summary Aggregate and Spatial Filters in QGIS

The QGIS expression engine has a powerful summary aggregate function that can do spatial joins on the fly. This enables some very interesting uses.