Mapshaper is free and open-source software for spatial data processing. It is written in JavaScript, runs in your browser without any extra plugins, and can perform a range of analyses. It started out as a tool for topologically-aware simplification, but has evolved into a Swiss Army knife of spatial data processing tools. All processing is done locally in the browser, and I have found that it handles large volumes of data easily and is usually much faster than desktop-based GIS software.


GeoPDF is a unique data format that brings the portability of PDF to geospatial data. A GeoPDF document can present raster and vector data and preserve the georeference information. This can be a useful format for non-GIS folks to consume GIS data without needing GIS-software. While GeoPDF is a proprietary format, we have a close alternative in the open Geospatial PDF format. GDAL has added support for creating Geospatial PDF documents from version 1.10 onwards. In this post, I will show how to create a GeoPDF document containing multiple vector layers.

Get the Tools


OSGeo4W is the best way to install GDAL on Windows. The default installation gives you the GDAL tools with PDF format support. You can use the GDAL tools via the OSGeo4W Shell included in the install.


KyngChaos provides a convenient GDAL installer for Mac. You also need to install the additional GeoPDF plugin to enable support for PDF format.

Once installed, add the path to GDAL library to your .bash_profile file to be able to use the commands easily from the terminal. Launch a Terminal and type in the following commands.

echo 'export PATH=/Library/Frameworks/GDAL.framework/Programs:$PATH' >> ~/.bash_profile
source ~/.bash_profile


On Linux, installation instructions vary with the distribution. On Ubuntu, you can install the gdal-bin package.

sudo apt-get install gdal-bin

Verify GDAL Install

If you already have GDAL installed, or just installed it, run the following command in a terminal to verify that your GDAL installation is working and has support for GeoPDF format.

gdalinfo --formats | grep -i pdf

If you see Geospatial PDF printed in the output – you are all set. If you do not get any output or get an error, your install is not correctly configured.
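If you would rather script this check, the same test can be sketched in Python. This is a minimal sketch, assuming `gdalinfo` is on your PATH. The function names are made up for this example, and because the exact text of each driver line varies between GDAL versions, the code only looks for "pdf" case-insensitively, just as the grep above does.

```python
import shutil
import subprocess


def lines_mentioning_pdf(formats_output: str) -> list:
    """Return the lines of `gdalinfo --formats` output that mention PDF.

    Mirrors `grep -i pdf`: a case-insensitive substring match per line.
    """
    return [line.strip() for line in formats_output.splitlines()
            if "pdf" in line.lower()]


def gdal_pdf_lines() -> list:
    """Run `gdalinfo --formats` and return the PDF-related lines.

    Returns an empty list if gdalinfo is missing or exits with an error.
    """
    if shutil.which("gdalinfo") is None:
        return []
    result = subprocess.run(["gdalinfo", "--formats"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return lines_mentioning_pdf(result.stdout)


if __name__ == "__main__":
    matches = gdal_pdf_lines()
    print("\n".join(matches) if matches else "No PDF driver found")
```

An empty result corresponds to the "no output" case described above.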

Get the Data

For this example, I chose to use OpenStreetMap Metro Extracts from MapZen. Download the shapefiles (OSM2PGSQL SHP format) for the city of your choice. I am using the extract for Bangalore city in this example. Unzip the downloaded file to a folder on your computer.


The process for creating a GeoPDF file from a bunch of shapefiles is a matter of running a single gdal_translate command. But we need to prepare the data and figure out the correct command-line options. So follow along to understand how you can arrive at the final command – or simply scroll to the end to see the final command line. There is a guide with a comprehensive overview of all the options available for GeoPDF creation via GDAL. The following steps are an adapted and simplified version of that guide.

  • First step is to create a .vrt file that can hold all the vector layers we want in the PDF. If you just need a single layer in the PDF, you can skip creating the .vrt file and directly reference the layer in place of the VRT. Note the <SrcSQL> tag in the VRT file. This is for filtering out all features where the ‘name’ field is empty. You can leave that out or modify to suit your dataset. Name this file osm.vrt and save it on the same folder with your data.
    <OGRVRTLayer name="roads">
        <SrcSQL dialect="sqlite">SELECT name, highway, geometry from bengaluru_india_osm_line where name is not NULL</SrcSQL>
    <OGRVRTLayer name="pois">
        <SrcSQL dialect="sqlite">SELECT name, geometry from bengaluru_india_osm_point where name is not NULL</SrcSQL>
  • GeoPDF is a raster format that can overlay vectors on top. So we need a raster layer as the base. If you have some satellite imagery or scanned raster for the area, you can use it as the base layer, or we can create an empty raster for the extent of the vector layer. ogrtindex command creates a bounding box polygon from the given input layers. gdal_rasterize command then fills this polygon with the given value and creates a raster. The -tr option specifies the pixel resolution of the raster in degrees. You can tweak that to get the output size you need. cd to the directory where you have extracted the vector layers and run the following commands.
cd Users\Ujaval\Downloads\bengaluru_india.osm2pgsql-shapefiles

ogrtindex -accept_different_schemas extent.shp osm.vrt

gdal_rasterize -burn 255 -ot Byte -tr 0.0001 0.0001 extent.shp bangalore.tif
  • Now we can convert the empty bangalore.tif raster to a PDF – overlaying the vector layers from the osm.vrt file.
gdal_translate -of PDF -a_srs EPSG:4326 bangalore.tif bangalore.pdf -co OGR_DATASOURCE=osm.vrt -co OGR_DISPLAY_FIELD="name"
  • Once the conversion finishes, you can open the resulting bangalore.pdf file in any PDF viewer. Opening it in Adobe Acrobat viewer, you can see the map data layers. You can browse the features in the layer panel, search for any attribute value and zoom/pan the map.
  • Another popular use of GeoPDF files is to use it as offline base maps using programs such as Avenza PDF Maps. Loading the bangalore.pdf file on Avenza Maps on your mobile phone, you can use the GPS to view your current location or trace a GPS route on top. Search also works across layers in the PDF.
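To get a feel for how the -tr value affects the size of the empty raster before running gdal_rasterize, the arithmetic can be sketched in Python. The function name is hypothetical and the bounding boxes in the usage note are made-up numbers, not the extent of the Bangalore extract. The sketch assumes each dimension is the extent divided by the pixel size, rounded up; GDAL's own rounding may differ by a pixel.

```python
import math


def raster_size(xmin, ymin, xmax, ymax, xres, yres):
    """Estimate (columns, rows) of a raster covering a bounding box.

    xres and yres are pixel sizes in the same units as the extent
    (degrees when the data is in EPSG:4326).
    """
    # Subtract a tiny tolerance so floating-point noise on an exact
    # multiple does not round up to an extra pixel.
    cols = math.ceil((xmax - xmin) / xres - 1e-9)
    rows = math.ceil((ymax - ymin) / yres - 1e-9)
    return cols, rows


if __name__ == "__main__":
    # Made-up 1 degree x 1 degree box, for illustration only.
    print(raster_size(77.0, 12.5, 78.0, 13.5, 0.0001, 0.0001))
    print(raster_size(77.0, 12.5, 78.0, 13.5, 0.0002, 0.0002))
```

A box of 1 degree by 1 degree at 0.0001 degrees per pixel works out to 10000 by 10000 pixels. Doubling the pixel size halves each dimension, so the pixel count drops to a quarter.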

You can download the sample bangalore.pdf Geospatial PDF format file for exploring the format yourself.

Mapshaper is a free and open-source tool that is best known for fast and easy simplification. Other tools for simplification – like QGIS or ogr2ogr – do not preserve topology while simplifying. This means you may get sliver polygons or missing intersections. Mapshaper performs topologically-aware simplification and gives you much more control on the process.

Other popular open-source tools, PostGIS and GRASS can do topologically-aware simplifications as well.
But Mapshaper is much more than a simplification tool. It is in active development and has many more data processing and editing capabilities now. It also has a command-line version of the tool which can be run from a terminal. In this post, we will explore the command-line tool to carry out some complex geoprocessing tasks.

Mapshaper is a Node.js application. Download and install Node.js for your platform. You will need the Node Package Manager (NPM) to install mapshaper, so make sure it is enabled while going through the installer.

Once Node.js is installed, launch the Windows Command Prompt (cmd.exe) and run the following command to install mapshaper.

npm install -g mapshaper

Get the Data

Review the data and problem statement from the Performing Table Joins tutorial. Download the Census Tracts shapefile and the Population CSV ca_tracts_pop.csv. Unzip the file and extract it to a folder.


Mapshaper command takes an input, an output and a sequence of commands to execute. Each command is followed by options specific to that command. All the commands and options are well documented at the Mapshaper Wiki.

Let’s start with simplification. We will take the census tracts shapefile and simplify it to reduce the number of vertices and the total size. The command for simplification is -simplify. You can supply a percentage value as an option to specify the aggressiveness of the simplification. Another useful option is keep-shapes, which ensures that none of the polygons from the input will get deleted. Run the following command. Make sure you cd to the directory where the data has been downloaded.

Note: The percentage value in the -simplify command can be a little misleading. The value indicates how many vertices to keep and not how many to remove. So a lower value would result in MORE simplification.

mapshaper -i tl_2013_06_tract\tl_2013_06_tract.shp -simplify 20% keep-shapes -o output.shp

Mapshaper can also do Table Joins. We can now join the population field D001 from the ca_tracts_pop.csv file. The join will match the fields we specify as keys and add the joined fields to the output file. For the join to work correctly, we need to specify the field types in the CSV file (similar to how a .csvt file is needed by QGIS). We can ‘chain’ the -join command after the -simplify command to perform both operations in a single command.

mapshaper -i tl_2013_06_tract\tl_2013_06_tract.shp -simplify 20% keep-shapes -join ca_tracts_pop.csv keys=GEOID,GEO.id2 field-types=GEO.id2:str,D001:num -o output.shp
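A likely reason the key's type matters here: California tract identifiers begin with the state code 06, and that leading zero is lost if the value is read as a number. The Python sketch below illustrates the effect with a made-up tract ID and population value. The `join_by_key` helper is hypothetical and only demonstrates the idea; it is not how mapshaper implements joins.

```python
def join_by_key(features, rows, feature_key, row_key):
    """Attach the matching table row to each feature (left join on equal keys)."""
    lookup = {row[row_key]: row for row in rows}
    joined = []
    for feature in features:
        match = lookup.get(feature[feature_key], {})
        joined.append({**feature, **match})
    return joined


# Made-up example values, for illustration only.
features = [{"GEOID": "06001400100"}]

# Key kept as text: the leading zero survives and the join matches.
rows_as_text = [{"GEO.id2": "06001400100", "D001": 2937}]
# Key parsed as a number: the leading zero is gone and nothing matches.
rows_as_number = [{"GEO.id2": int("06001400100"), "D001": 2937}]

good = join_by_key(features, rows_as_text, "GEOID", "GEO.id2")
bad = join_by_key(features, rows_as_number, "GEOID", "GEO.id2")

print(good[0].get("D001"))  # the population value is attached
print(bad[0].get("D001"))   # None: the number 6001400100 never equals the text key
```

Declaring the key as a string keeps both sides of the join comparable as text.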

Mapshaper can also dissolve features. In my testing, Mapshaper’s dissolve operation was many times faster than QGIS or GRASS. Let’s add a -dissolve command and merge all census tracts for a county. We can also sum up the values of the D001 field to get the total population of the county from the sum of individual census tracts.

mapshaper -i tl_2013_06_tract\tl_2013_06_tract.shp -simplify 20% keep-shapes -join ca_tracts_pop.csv keys=GEOID,GEO.id2 field-types=GEO.id2:str,D001:num -dissolve COUNTYFP sum-fields=D001 -o output.shp
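On the attribute side, dissolving with a summed field amounts to a group-by and sum. The Python sketch below shows only that part, using made-up tract records; the `dissolve_sum` name is hypothetical, and merging the tract geometries into county shapes is not shown.

```python
from collections import defaultdict


def dissolve_sum(records, group_field, sum_field):
    """Sum sum_field over all records sharing the same group_field value."""
    totals = defaultdict(int)
    for record in records:
        totals[record[group_field]] += record[sum_field]
    return dict(totals)


# Made-up tract records, for illustration only.
tracts = [
    {"COUNTYFP": "001", "D001": 1000},
    {"COUNTYFP": "001", "D001": 2500},
    {"COUNTYFP": "075", "D001": 4000},
]

county_totals = dissolve_sum(tracts, "COUNTYFP", "D001")
print(county_totals)  # one total per county code
```

Each county code ends up with one total, which is what the summed D001 field holds after the dissolve.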

The output format needed by many web apps is GeoJSON or TopoJSON. Mapshaper can write the output in these formats as well. Let’s add a format=geojson option to the -o command to write a GeoJSON output.

mapshaper -i tl_2013_06_tract\tl_2013_06_tract.shp -simplify 20% keep-shapes -join ca_tracts_pop.csv keys=GEOID,GEO.id2 field-types=GEO.id2:str,D001:num -dissolve COUNTYFP sum-fields=D001 -o format=geojson output.geojson

Finally, let’s visualize our output. Upload the resulting output.geojson to a web-based GeoJSON viewer. You will be able to visualize the output shapes and their properties.

By now, you must have figured out that we have a very powerful tool on our hands. With a single command line and just a few seconds of computing, we did Simplification, Table Join, Dissolve and Format translation.

I recently needed to calculate distances between a large number of latitude/longitude coordinate pairs. There are many options available if you want to import these into a GIS and run the analysis. But there is a simpler and much more accessible way if you aren’t doing very high accuracy calculations.

Here I have a spreadsheet which implements the well-known Haversine formula to calculate distance between 2 coordinates. You can structure your point coordinates into 4 columns Lat1, Lon1, Lat2, Lon2 in decimal degrees and the distance will be calculated in meters.

You can give it a try. Just open this spreadsheet, make a copy of it and play with it as you like.

The raw formula is below (thanks to the reader Samuel who suggested it):

=2 * 6371000 * ASIN(SQRT((SIN((LAT2*(3.14159/180)-LAT1*(3.14159/180))/2))^2+COS(LAT2*(3.14159/180))*COS(LAT1*(3.14159/180))*SIN(((LONG2*(3.14159/180)-LONG1*(3.14159/180))/2))^2))
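For anyone who would rather script the calculation than use a spreadsheet, the same Haversine formula can be sketched in Python. The function name is made up for this example. It uses the same 6371000 m Earth radius as the formula above, but converts degrees with `math.radians` rather than the 3.14159 approximation, so results may differ very slightly from the spreadsheet.

```python
import math

EARTH_RADIUS_M = 6371000  # same radius (in meters) as the spreadsheet formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # One degree of latitude along a meridian, roughly 111 km.
    print(round(haversine_m(0.0, 0.0, 1.0, 0.0)))
```

The arguments follow the same Lat1, Lon1, Lat2, Lon2 order as the spreadsheet columns.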

I was at #cartonama conference in Bangalore on 23rd September, 2012. It was wonderful to be among geogeeks and see the energy and enthusiasm. Got to see some great demos and LBS apps from startups in India. Here’s a run down from my notes and bookmarks.

  • Traffic cams in Bangalore: This was probably the coolest demo for me. Love the simplicity and usefulness. Just open this URL in your smartphone browser and get a real-time feed from the traffic cam nearest to you from BTIS.
  • Thru the Gall: A Layar-based Android app to solve the last-mile direction problem in India. Bubbles guide you through the narrow lanes (galli).
  • Padma: Search and Browse Indian documentaries through locations of the clips in the documentaries
  • DelightCircle: LBS app to discover offers and discounts near you. Surprised at the coverage and partnerships they have been able to generate in a short time. Powered by MongoDB at the backend.
  • Google Maps API app for property search. Nicely done, with a simple UI and fast load times. The team talked about challenges and techniques they use to de-dup the listings and pin the individual listings to the apartment block.
  • Yahoo: I had forgotten about Yahoo as a geo player. But from the talk, the team seems pretty excited about their geo APIs. The Geo APIs are used by many Yahoo properties internally and are available to developers too. The API to extract locations from free-form text was cool. YQL is a SQL-like query interface for their API.
  • NextDrop: Very interesting idea. Most places in India do not have a regular water supply and people’s day revolves around when they are able to collect water. They have built a system that alerts subscribers by SMS when they can expect water, based on the information from the utility company. Subscribers pay Rs.10/month to get this data. Their challenge revolved around locating their customers on a map – both for billing as well as determining which pipeline will supply them water.
  • Lokasi: Simple Android app that allows you to share your current location by SMS. Reverse-geocoding + location identifier. Collected POIs + street geometry for the entire Bangalore in about 6 months with a 10-person team.
  • Brief history of map making: Fun talk with lot of pretty and interesting maps. No geo conference is complete without a passionate discussion on projections 🙂
  • Chalo BEST: Bus route planner for Mumbai using OpenTripPlanner and GTFS feed from BEST. Challenges in converting messy spreadsheet data from the agency into a GTFS feed.
  • Geohash: overview of the geohash system. Lots of discussion around what geohash is good for.

India’s first FOSS4G – India conference was organized on 25-26 October, 2012 at Hyderabad. This was a surprisingly low-profile event and there was very little buzz about it on the internet. I learnt about it just in time to register and make it to the event. Here I am sharing some notes and impressions. (notes are mostly from memory, so please excuse if I missed something)

  • I haven’t been to other FOSS4G events, but this was unlike any other Geo conference I have attended. It was very formal and government-centric. All presentations were from government researchers, employees or students.
  • The penetration of open-source software within government was deeper than I anticipated and the decision-makers – including the politicians recognize the benefits of using open technologies. The Chief Guest – Shri V. Aruna Kumar, MP from Rajahmundry district commented that he realizes that the real cost of software is not the upfront cost, but the lifetime maintenance cost and open-source and open-standards can really benefit in the longer run.
  • Keynote’s main message was that we should really focus on delivering applications that solve real problems. According to the speaker – Shri Anoop Singh, Special Secretary to AP Government – we have not even begun to realize the full potential of GIS within governments. For most end-users and decision makers, the ‘Science’ or ‘System’ in GIS isn’t important and GIS should really be about Geographic Information Services.
  • India is seen largely as a follower in the open source movement and no big open-source projects have come out of research institutions here. To address this, International Institute of Information Technology released one of their projects – LSIViewer as open-source. The GitHub upload was done during the inauguration of the conference by the chief guest!
  • Prof. Venkatesh Raghavan of Osaka City University was chosen as the winner of the 2012 Sol Katz Award by OSGeo. The award was given to him during the conference.
  • PostgreSQL + PostGIS, GeoServer and OpenLayers seem to be the platform of choice. Almost all applications I saw were built using one of these.
  • ELOGeo portal was shown as an example of learning resources available for open source geo tools.
  • There is virtually no open data for India given the strict geo data policy. But even within the government departments – there is a lot of reluctance in data sharing. Everyone talked about this problem of data being guarded very closely and seldom shared. But some forward thinking departments, especially in Tamil Nadu, have found a way around it by promoting Web Services. They publish a WMS feed for the datasets. That way other departments can use the data without really having any ownership of it. This seems to be working well.
  • Ms. Mahalakshmi from National Informatics Center (NIC) in Tamil Nadu demonstrated some of the most interesting internal applications built by them. They have built web services for different departments and each department uses these web services for their application. Administrative boundaries, census data, land use maps, land parcels, police data and planning related data are some examples which are available via WMS. She also showed a pretty cool app that BSNL – Chennai uses to identify fault points in their 2G network. The map is updated in real-time with pings from their different 2G sites around Chennai.
  • BHUVAN – Indian Space Research Organization’s (ISRO) portal for disseminating geodata has come a long way since its launch a few years back. It’s built completely using an open-source stack and it’s a pretty snappy app. For the first time, a lot of thematic datasets are available freely to the public via WMS. They are also making some medium and low resolution raw/derived data from Indian remote sensing satellites available for download. The downloads are for personal use only, but this is a big step forward in India’s data access policy which has been very restrictive so far.

In summary, some positives and challenges for Open-source software described by various speakers.

Why open source is very attractive to governments

  • No upfront cost. Don’t have to navigate complicated and time-consuming approval and procurement process. Easy to get started at low-cost and do a proof-of-concept.
  • Standards Compliant. Rising awareness within the government of being standards compliant. Choosing a proprietary solution which is not standards compliant can cause problems and raise questions from others in future.
  • No vendor lock-in. Can hire another consultant to run or develop services since all code is available.

Challenges to wider open source adoption

  • Lack of support. Not many companies providing development and support. Most applications shown were written and maintained by in-house staff. Big opportunities for consultants who can provide support for open source tools.
  • Change aversion: On the desktop, most people are used to proprietary software from their education or prior work. One example given was reluctance in the government to switch to Open Office. Free versions of MS Office are easy to come by, so there is no incentive to learn and adopt an open source solution. Using open-source software in education is a way to increase the adoption.
  • Lack of Mature solutions: Politicians and administrators felt they had limited time in their tenure to implement whatever changes they want and show progress. There is always the attraction to choose a well known solution and get the results. The key message was that they feel the open-source software is mature but there are no mature solutions based on these that the governments can adopt without much risk.