Using PostGIS to answer geodata questions

One of the biggest challenges when working with large sets of data is to find the least costly workflow that you have to follow in order to get the most accurate answers.

Let’s say you have a huge dataset composed of all sorts of geometry features (points, lines, areas etc.) and you want to do a bit of cleaning – because messy and redundant information is no fun!

So you might be thinking “Hmmm… which are the areas that have an unnecessary high density of points?”

The same issue can arise when working with OpenStreetMap data. This can be easily solved using PostGIS and a command line tool that we’ve created and using.

Note: The following steps require a Linux environment, Postgresql 9.x, PostGIS 2.x, Osmosis 0.43+, QGIS 2.12.2+

Getting the data

Download an *.osm.pbf file using command line:


This is the metro extract for San Francisco, provided by Mapzen. Geofabrik is also a very good resource for OSM data extracts.

In the same folder, download SCOPE – databaSe Creator Osmosis Postgis loadEr.


Make sure to set the file to be executable by using

chmod +x

Load the data

Using SCOPE and following the instructions on the screen, load the *.osm.pbf into a database.

SCOPE automatically creates the database with hstore and postgis extensions and the pgsnapshot schema.

Play with the data

Now that you have the data set up, you can easily query it using the DB Manager from QGIS and some PostGIS scripts.

Interesting examples

For example, using the find_duplicate_nodes query, we can see that this building (@20.805088941495338, -104.92877615339032), appears on the same spot 23 times!


The one next to it (@20.8054225, -104.9278152) appears 22 times!


The node density for these areas (@20.4411867, -97.3172739) is too high – 168 nodes!


Also, 171 nodes for a small fence segment (@46.7487683, 23.559687)!



Feel free to fork the GitHub repository and modify the code to suit your needs! Also, if you feel insipred, you can suggest a better and shorter name or acronym for SCOPE!


OSM Mapping party – spring edition


On the 17th of April we had our first Mapping party event for this year. Our main focus was to improve the map of our hometown by reflecting the latest changes.                                                                                                   Cluj-Napoca is a dynamic city, many new buildings was constructed; POIs, turn restrictions, addresses have been changed and appeared since the last field mapping.

Around 30 map enthusiast show up in Sunday morning for the Mapping party. There were both experienced mappers and newbies present at the event. The event had started with a morning coffee and some instructions regarding data collections.

For data collections we used the following tools:
• Field papers: our colleague Florin Badita had took some time before the event and had created field papers for several city areas


• GPS tracker applications: OSMTracker, OSMAnd, Pushpin OSM and so an
OpenStreetView application 

We have divided the people into smaller groups of 2-3 persons. After each group had chosen an area to map we went out to collect the data.
On the afternoon we headed back to our meeting location to add the collected data into OpenStreetMap.

An outcome overview of our mapping effort is presented on the following images:




Improve OSM adds missing roads in Guatemala

In a new data release today, we added about 500 tiles worth of missing roads in and around Guatemala!

Missing roads near Coatepeque, Guatemala
Missing roads near Coatepeque, Guatemala in JOSM. Imagery from Bing.

We are excited to be adding more and more Missing Roads data to ImproveOSM using GPS data from our own users as well as from data partners, like we did in Brazil and in this case.

You will notice that the tiles look a little different from the ones you are used to if you have used ImproveOSM before: they don’t show the individual points. This is because this particular data was processed a little differently. If you use JOSM, you will also see an update to the ImproveOSM plugin to accommodate this change.

While you are looking at the new Missing Roads, perhaps you will also notice some other recent improvements to the ImproveOSM web site. We re-ran all tiles based on new map data from mid-April, and we improved our turn restriction detection so we won’t show a missing turn restriction when OSM already has a ‘only straight on’ restriction.

Happy Mapping!


Turn restrictions – a vital part of any routing system

The best part of using everyday OSM technologies and relying on OSM to make sure that you get “there” on time is that you can directly influence the quality of the experience.

Regardless which OSM technology you’ll be using, to provide you the best experience possible, the routing software has to know as much information as possible about the roads between you and your destination: one-way streets, turn restrictions, speed limits, road closures and much more.

For example, the turn restrictions contribute significantly to the total travel time, and to the correctness of the route altogether, thus, by ignoring them in the traffic network model, essential characteristics of the network might be missed, leading to substandard and unreasonable paths.

Dealing with turn restrictions in OSM

To help us navigate the complexities of properly translating real map scenarios to the ways and points schema of OSM we will rely on JOSM with the turn restrictions plugin installed.

Turn restrictions in OSM are handled by creating a relation

A relation is one of the core data elements that consists of one or more tags and also and ordered list of one or more nodes, ways and/or relations as members which is used to define logical or geographic relationships between other elements. (source)

There is a mandatory requirement when creating a turn restriction relation: it has to consist of minimum three members and must have assigned two tags. (see below example)

The ‘type=restriction’ flags the relation as a turn restriction and ‘restriction=no_u_turn’ indicates the restriction type.

A ‘no_’ type relation can also be represented in map data as an ‘only_’ type relation. The prohibited turn restriction relation is preferred by some routing engines instead of an allowed turn restriction relation.

More details here -
More details here –; US regulatory signs –

Members of a turn restriction relation are ways and nodes

One simple case can be a turn restriction relation that consists of three members – two ways and one node. The two ways would represent the beginning (‘from’ role) and end (‘to’ role) of the turn restriction. The node would represent the continuity of travel between two ways and has a ‘via’role.

Way (A) - node (B) - way (C) sequence
Way (A) – node (B) – way (C) sequence in a ‘no_left_turn’ restriction relation.

Another case is where a turn restriction relation can consist of three or more ways. Two ways from this type of relation would represent the beginning and end of the turn restriction and at least one way would represent the continuity of travel between the aforementioned ways (‘via’ role).

Way (A) - way (B) - way (C) sequence in a no_u_turn restriction relation
Way (A) – way (B) – way (C) sequence in a ‘no_u_turn’ restriction relation.

Workflow for adding turn restrictions

The traditional way

Using the embedded relation editor available in JOSM. A slight disadvantage of this method is that you spend a bit more time to manually construct the relation. Click on the image below for how-to video.


The user-friendly way

Using the turn restrictions plugin, that automatically recognizes the type of relation and roles for each member. Click on the image below for how-to video.


Using the aforementioned tools, we have reviewed 2,000 miles of field trip footage and added nearly 2,500 turn restrictions in the LA/Orange county area, where 85% of the turn restrictions that were added to the map are no_u_turns, followed by 11% of no_left_turns, the rest being covered by the other categories.

Hopefully we’ve managed to illustrate how easy is to map turn restrictions in OSM. Now, it’s your turn!


ImproveOSM with your own GPS data – a Field Report

We launched ImproveOSM about 6 months ago as a way to turn the vast amounts of GPS data that Scout users give us into useful and actionable hints mappers can use to add turn restrictions, missing roads as well as wrong or missing one-way streets. The response has been incredible — since we launched, more than 26 thousand hints have been processed, leading to more than 16 thousand improvements to the map worldwide. I think that is a fantastic result, and we will keep working to make ImproveOSM better based on your feedback.
Initially, we just used our own GPS data to generate the hints. But there is no reason why we couldn’t process any GPS data we can get from other sources. So I was really excited when long time Brazil mapper Wille Marcel got in touch with a cool idea. He worked with the Brazilian Environment Ministry, which collects GPS data of the vehicles that work in environmental monitoring. Most of the data are in rural areas where OSM is much less complete. So this was a perfect fit for ImproveOSM’s missing roads tool.After getting the proper permissions from the agency, Wille sent us the GPS data and we started analyzing it.


We quickly realized that the GPS data is much less dense than what we are used to working with. Some missing roads were only driven once. Our algorithm, tuned to higher density data, initially only detected a few tiles. We decided to loosen the detection threshold significantly for this particular dataset. After a few iterations of tweaking and testing, we ended up withmore than 5000 tiles containing missing roads based on Wille’s GPS data.


The missing roads in Brazil are on ImproveOSM now, so why not go to the web site or fire up the ImproveOSM JOSM plugin and help the Brazilian community out by adding some missing roads?

If you are in a similar position as Wille and know of a source of free and open GPS data for your country, please get in touch with me so we can look at the data and see if we can include it in ImproveOSM.

We are already working with a number of other folks who have lots of GPS data. Soon, the number of missing roads, one-ways, and turn restrictions in ImproveOSM will be much, much bigger. We are also working on a host of new features, so I hope you will stay tuned to the ImproveOSM blog to be among the first to hear about what we have up our sleeves for ImproveOSM and other OSM related projects we are working on. And follow us on Twitter at @ImproveOSM!

 See also Wille’s post about this collaboration (in Portuguese).


Help map some sidewalks for cities in the U.S.

United States cities are built for cars, with very few exceptions. From where I am sitting right now, I see this:


Cars zooming by incessantly at 70kph.

Finding your way in an urban space that is designed this way is tricky – and often dangerous – if you are walking or bicycling. Sidewalks are often not present, crossing streets can be very dangerous or even impossible. OSM has great tagging for bike lanes and sidewalks, but I find that these crucial tags are often missing on ways that need them most: the four or six lane urban arterials that you see in the picture above.

As I was sitting here asking myself how on earth I would get back to my hotel (which is 10 minutes away) safely, I thought to myself: ‘we can fix this problem and make the world a bit safer for those who can’t or won’t drive.’

MapRoulette to the rescue!

I created this challenge highlighting all primary and secondary ways that have nosidewalk tag in Tampa, Florida. (I am actually in Sarasota now, south of Tampa, but I already fixed all the ways there so that would be a boring challenge.) The idea is to look at the aerial image in JOSM or iD, see if there is a sidewalk, and add the appropriate tag. Adding sidewalk=no is actually just as important as adding both, right or left. Here is an example way from this challenge:


Even zooming further in there is no sight of a sidewalk:


So let’s add that information:


And upload!

Create a Challenge for your city

The fun part is that you can easily replicate this challenge for your own city. Here’s what to do.

Overpass Turbo

First you head over to Overpass Turbo and run the query that highlights all highway=primary and highway=secondary that have no sidewalk tag:


You can use my query as a template, replacing the GeocodeArea with the name of your city.

Once you have the results, export them to GeoJSON. Let’s use a gist:



You can now click on the gist link and see the result on GitHub as well:


We will need the ‘raw’ GeoJSON content, so click on the ‘Raw’ button and copy the link it leads you to.


Next we’ll use a little tool I created to easily turn the contents of a GeoJSON file into a MapRoulette challenge. To get it, head over to the Github repository and follow the instructions to install the tool.

The tool takes its configuration from a YAML file. The samples directory contains an example for this sidewalks challenge you can use as a template:

# the base URL for the MapRoulette server API to call
#server: "http://localhost:5000/api"

# server API admin credentials
user: devuser
password: mylittlesony

# source file or URL. You can give a list of URLs too, all data will be gathered and added to the same challenge.
# source_file: ....

# source geojson property key to use as your task identifier (optional, will use random UUID if not given)
# identifier_property = ...

# Challenge metadata, see for background
slug: sidewalks-sarasota
title: Add sidewalks to major roads in Sarasota
instruction: This way has no `sidewalk` tag. Usually you can see from the aerial imagery if there is a sidewalk or not. Please add the appropriate `sidewalk` tagging.
help: "Help make OSM be a better resource for safe, walkable streets! Many primary and secondary roads in the US are not safe for pedestrians if there is no sidewalk. This challenge highlights all `primary` and `secondary` ways that have no [`sidewalk`]( tagging whatsoever. You can help by looking at aerial imagery and adding the appropriate `sidewalk` tagging. `sidewalk=no` is just as important to have as the 'positive' values. Thanks for helping make OSM better!"

The only items you would need to change are the source_url (use the raw GeoJSON github link you just copied), the slug (use sidewalks-YOURSTATE-YOURCITY or something similar – this will be the challenge URL component in MapRoulette) and the title (change the city name).

By default this configuration will post to If you want to post you would need to get in touch with me to get the credentials.

Once you have the YAML config file in order posting to MapRoulette is as simple as:

$ ./ samples/sidewalks-sarasota.yaml --post --activate
Posting 364 tasks...
server alive: True
Updating challenge...
Reconciling tasks...

Let me know if you need any help with this or if you want me to create a challenge for you!


Check out the new tutorial section

Recently we have launched a video tutorial section on, to help our users get up to speed with our tools and let you know what are all the possibilities and all the versatility of our products for the community’

So far, we have 2 videos. One is focused on doing a gentle introduction to Improve OSM, some sort of a guided tour of the Improve OSM web site for new users. This is the perfect starting point if you’re new here.

The second one is a step-by-step guide to fixing your first one-way street in OpenStreetMap using the Improve OSM web site. You can see it here.

We’ll prepare more tutorial videos soon. Let us know what other aspects of our products you’d like to be featured in the future.


How we imported Administrative Boundaries for Mexico from INEGI

The INEGI boundaries import project is focused on importing the data of the national, state, municipal and sub-municipal level divisions present in the MGN published by the INEGI in a community monitored process.

One of the current problems in OSM regarding Mexico’s data is the incompleteness of the administrative boundaries for municipalities. Municipalities are the second-level administrative division in Mexico, the first being the state. There are 2456 municipalities, including the ones in Mexico City which are also a second-level division just with a different name – delegations.

The main goal of this process is to enhance the current OSM administrative division coverage of Mexico with open data made available by the government at the end of 2014.

Import Process

The following steps describe the entire workflow we followed to import the boundary data.

  • Step 0 – Reprojection of INEGI dataset

Before any other step, the data released by INEGI has to be reprojected to WGS84 (EPSG:4326), from ITRF92, using QGIS, and saved as .shp file. An important thing to mention is that no simplification of the boundary geometries is considered whatsoever for this or any of the subsequent steps since the geometries are official government data.

  • Step 1 – Conversion to OSM data

Download the state boundary of interest relation from OSM and save it as an .osm file. In QGIS, using Vector > Research Tools > Select by Location, select the INEGI municipalities boundaries that are within the area of interest, in this case Quintana Roo state, and export the selection as .shp file.

Municipalities in Quintana Roo state.
Municipalities in Quintana Roo, as polygons.

The exported features will be polygons. In order to process them, they must be converted to lines in QGIS using the Polygons to Lines option, available in Vector > Geometry tools. Visually, the output will look the same as when the municipalities were polygons.

Municipalities in Quintana Roo, as lines.
Municipalities in Quintana Roo, as lines.

Using ogr2osm the .shp file containing the boundaries as lines is converted into an .osm file.

Before moving forward, the resulting .osm file has to be modified a bit. Using Notepad++, open the file and search and replace <nd ref=’ with <nd ref=’- and <node id=’  with <node id=’-, so the file will be with negative id.

The negative id is important because JOSM will know that this is new data, not yet added to the map.

Next, the .osm file can be converted to an .osm.pbf file using osmosis.

  • Step 2 – Processing

We load the .osm.pbf file from the previous step into an internal tool, called Mexico Split. The tool is designed to eliminate duplicate/overlapping ways by detaching them from their parent polygons and replacing them with a single common way of the two involved polygons.

Detects overlapping ways and replaces them with a single common way.

Besides this main purpose, the tool also splits any resulting ways longer than 2000 segments in shorter ways, groups the ways in relationships according to the borders they define and adds some predefined tags to these ways and relations.

Tags added to relations:




Tags added to both ways and relations:



source=INEGI, MGN 2014 v6.2

For example, data for Bacalar municipality contains the following information:

Bacalar municipality in Quintana Roo state.
Bacalar municipality in Quintana Roo state. (click for larger image)
  • Step 3 – Backup and metrics of existing OSM data

We took a backup of the current OSM data previous to the import of the regions that are going to be impacted, using Overpass API. Also, tag related metrics have been recorded – source, population, admin_center, admin_label, wikipedia etc. in order to have an overview of the newly added information.

  • Step 4 – Delete existing data from OSM and upload fresh data

In some cases the states already have some information regarding municipality boundaries (admin_level=6). These will be deleted, but before deletion we take a look at all the features and relations, to have a very good image of what we should put back in map data after the import.

Next, we upload the municipalities on a state by state basis.

  • Step 5 – Clean/verify the newly added data

This is a very important step because we verify the data that we’ve uploaded to make sure that there are no errors and manually re-link the admin_level=6 relations to the admin_level=4 boundaries, where required. Any other manual corrections are done at this step.

An example of the newly added municipalities boundaries in Tabasco.

To ease the process of importing the municipality boundaries, we use the Mexico Import Map paint style for JOSM. It highlights the last node of every way, making it simple to see the length of every way.

Map styles - JOSM default vs. Mexico Import
Map styles – JOSM default vs. Mexico Import. (click for larger image)

The square node also has a certain degree of transparency, so we can see if there is a node under the node. To be able to work in a systematic way, it allows to quickly see duplicated nodes and see the difference between the admin_level=4 and admin_level=6.


Improve OSM Tool Highlight: Cygnus

Improve OSM is all about making OpenStreetMap better with the help of billions of GPS points that users of Scout apps and others contribute automatically. This allows us to identify missing roads, one-way streets and turn restrictions that are not in OSM yet. However, our OSM engineers think about other ways to help make OSM better as well. Looking at what is going on in the OpenStreetMap community, I see a lot of groups and individuals struggle with the challenge of using external data in OpenStreetMap. Many governments now open up data that was closed before. With proper caution and preparation, we may be able to use this data in OSM.

The technical challenge here is to combine the new, external data with what is already present in OSM. In the GIS world this is called conflation. There are some tools out there to help with this task, but they are hard to use. So we started to think about this from a practical point of view and a specific use case: what if you have an external database of just roads, and you want to add only the new roads to OSM?


We developed a tool that can do just that, and we called it Cygnus. I wrote about Cygnus on my OSM diary before, announcing it here and then writing about it being used by OSM user MikeN here. We added a link to the Cygnus web site to the ‘Other tools’ section of the ImproveOSM web site.

We made Cygnus as easy to use as possible, but it is still not straightforward. It should not be. Conflation is hard, and what you are doing is importing data into OSM, which should be done with the utmost caution! If you want to get started using Cygnus, please get in touch. We are happy to help!