Using PostGIS to answer geodata questions

One of the biggest challenges when working with large datasets is finding the least costly workflow that still gives you accurate answers.

Let’s say you have a huge dataset composed of all sorts of geometry features (points, lines, areas etc.) and you want to do a bit of cleaning – because messy and redundant information is no fun!

So you might be thinking “Hmmm… which are the areas that have an unnecessarily high density of points?”

The same issue can arise when working with OpenStreetMap data. It can be solved easily using PostGIS and a command line tool that we’ve created and use ourselves.

Note: The following steps require a Linux environment, PostgreSQL 9.x, PostGIS 2.x, Osmosis 0.43+ and QGIS 2.12.2+.

Getting the data

Download an *.osm.pbf file from the command line:

 wget https://s3.amazonaws.com/metro-extracts.mapzen.com/san-francisco_california.osm.pbf

This is the metro extract for San Francisco, provided by Mapzen. Geofabrik is also a very good resource for OSM data extracts.

In the same folder, download SCOPE – databaSe Creator Osmosis Postgis loadEr.

wget https://raw.githubusercontent.com/baditaflorin/osm-postgis-scripts/master/scope.sh

Make the file executable:

chmod +x scope.sh

Load the data

Using SCOPE and following the instructions on the screen, load the *.osm.pbf into a database.

SCOPE automatically creates the database with hstore and postgis extensions and the pgsnapshot schema.

Play with the data

Now that you have the data set up, you can easily query it using the DB Manager from QGIS and some PostGIS scripts.

Interesting examples

For example, using the find_duplicate_nodes query, we can see that this building (@20.805088941495338, -104.92877615339032) appears on the same spot 23 times!

duplicate_building

The one next to it (@20.8054225, -104.9278152) appears 22 times!

duplicate_building_2
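
The find_duplicate_nodes query itself lives in the repository and is not reproduced here. As a rough illustration of the idea only, the Python sketch below groups nodes by position and reports every position that holds more than one node. The function name, the (id, lat, lon) input format and the sample node ids are assumptions made for this example; only the coordinates are taken from the text above.

```python
from collections import defaultdict


def find_duplicate_positions(nodes, precision=7):
    """Return {(lat, lon): [node ids]} for positions shared by 2+ nodes.

    `nodes` is an iterable of (node_id, lat, lon) tuples (hypothetical format).
    Coordinates are rounded to `precision` decimals before comparison;
    7 decimals matches the precision OSM stores coordinates with.
    """
    by_position = defaultdict(list)
    for node_id, lat, lon in nodes:
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    # Keep only positions that are occupied more than once.
    return {pos: ids for pos, ids in by_position.items() if len(ids) > 1}


# Hypothetical node ids placed on coordinates mentioned in the article.
sample_nodes = [
    (1, 20.8054225, -104.9278152),
    (2, 20.8054225, -104.9278152),
    (3, 20.8054225, -104.9278152),
    (4, 20.4411867, -97.3172739),
]
duplicates = find_duplicate_positions(sample_nodes)
```

In a real database the same grouping would more likely be done in SQL on the node geometries; the sketch only shows the logic.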

The node density for these areas (@20.4411867, -97.3172739) is too high – 168 nodes!

nodes1

Also, 171 nodes for a small fence segment (@46.7487683, 23.559687)!

fence

the-node-density
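
One simple way to look for areas with too many nodes is to count nodes per grid cell and flag the cells above a threshold. The sketch below does that in plain Python. It is not the query used for the examples above; the cell size, the threshold and the function name are arbitrary assumptions, and cells measured in degrees do not cover equal areas at different latitudes.

```python
import math
from collections import Counter


def dense_cells(points, cell_size_deg=0.001, threshold=100):
    """Count (lat, lon) points per grid cell and return the crowded cells.

    Both `cell_size_deg` and `threshold` are illustrative values only.
    """
    counts = Counter(
        (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        for lat, lon in points
    )
    return {cell: n for cell, n in counts.items() if n > threshold}


# 168 synthetic points packed into one cell, plus one isolated point.
cluster = [(20.4411867 + i * 1e-6, -97.3172739) for i in range(168)]
isolated = [(46.7487683, 23.559687)]
crowded = dense_cells(cluster + isolated)
```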

Feel free to fork the GitHub repository and modify the code to suit your needs! Also, if you feel inspired, you can suggest a better and shorter name or acronym for SCOPE!


OSM Mapping party – spring edition

 

On the 17th of April we had our first mapping party of the year. Our main focus was to improve the map of our hometown by reflecting the latest changes. Cluj-Napoca is a dynamic city: many new buildings have been constructed, and POIs, turn restrictions and addresses have changed or appeared since the last field mapping.

Around 30 map enthusiasts showed up on Sunday morning for the mapping party. Both experienced mappers and newbies were present. The event started with a morning coffee and some instructions on data collection.

For data collection we used the following tools:
• Field papers: our colleague Florin Badita took some time before the event to create field papers for several city areas

field-papper1

• GPS tracker applications: OSMTracker, OSMAnd, Pushpin OSM and so on
• The OpenStreetView application

We divided the participants into smaller groups of 2-3 people. After each group had chosen an area to map, we went out to collect the data.
In the afternoon we headed back to our meeting location to add the collected data to OpenStreetMap.

An overview of the outcome of our mapping effort is presented in the following images:

FinalEdits

FinalEditsOverview


Turn restrictions – a vital part of any routing system

The best part of using OSM technologies every day, and relying on OSM to make sure that you get “there” on time, is that you can directly influence the quality of the experience.

Regardless of which OSM technology you use, the routing software can only give you the best possible experience if it knows as much as possible about the roads between you and your destination: one-way streets, turn restrictions, speed limits, road closures and much more.

For example, turn restrictions contribute significantly to the total travel time and to the correctness of the route altogether. Ignoring them in the traffic network model means missing essential characteristics of the network, which leads to substandard and unreasonable paths.

Dealing with turn restrictions in OSM

To help us navigate the complexities of translating real-world scenarios into OSM’s schema of ways and nodes, we will rely on JOSM with the turn restrictions plugin installed.

Turn restrictions in OSM are handled by creating a relation

A relation is one of the core data elements that consists of one or more tags and also an ordered list of one or more nodes, ways and/or relations as members which is used to define logical or geographic relationships between other elements. (source)

There is a mandatory requirement when creating a turn restriction relation: it must consist of at least three members and must have two tags assigned (see the example below).

structure
The ‘type=restriction’ flags the relation as a turn restriction and ‘restriction=no_u_turn’ indicates the restriction type.
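
To make the requirement concrete, here is a small Python sketch that represents a turn restriction relation as a plain dictionary and checks the two conditions described above: the two tags are present and there are at least three members. The dictionary layout, the function name and the element ids are assumptions made for this example, not an OSM or JOSM API.

```python
def has_required_structure(relation):
    """Check the two tags and the minimum member count of a restriction."""
    tags = relation.get("tags", {})
    members = relation.get("members", [])
    has_type_tag = tags.get("type") == "restriction"
    # Restriction values come in a prohibitive (no_) and a mandatory (only_) form.
    has_kind_tag = tags.get("restriction", "").startswith(("no_", "only_"))
    return has_type_tag and has_kind_tag and len(members) >= 3


# Hypothetical relation; the refs are made-up ids.
no_u_turn = {
    "tags": {"type": "restriction", "restriction": "no_u_turn"},
    "members": [
        {"type": "way", "ref": 101, "role": "from"},
        {"type": "node", "ref": 202, "role": "via"},
        {"type": "way", "ref": 103, "role": "to"},
    ],
}
missing_kind = {"tags": {"type": "restriction"}, "members": no_u_turn["members"]}
```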

A ‘no_’ type relation can also be represented in map data as an ‘only_’ type relation. Some routing engines prefer the prohibitive (‘no_’) form over the permissive (‘only_’) form.

More details here – https://wiki.openstreetmap.org/wiki/Relation:restriction; US regulatory signs – http://mutcd.fhwa.dot.gov/services/publications/fhwaop02084/

Members of a turn restriction relation are ways and nodes

One simple case is a turn restriction relation that consists of three members: two ways and one node. The two ways represent the beginning (‘from’ role) and the end (‘to’ role) of the turn restriction. The node represents the continuity of travel between the two ways and has a ‘via’ role.

Way (A) – node (B) – way (C) sequence in a ‘no_left_turn’ restriction relation.

Another case is where a turn restriction relation can consist of three or more ways. Two ways from this type of relation would represent the beginning and end of the turn restriction and at least one way would represent the continuity of travel between the aforementioned ways (‘via’ role).

Way (A) – way (B) – way (C) sequence in a ‘no_u_turn’ restriction relation.
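
The two member layouts described above (way, node, way and way, way(s), way) can be expressed as a short role check. The Python sketch below covers only these two cases; the (element type, role) tuple format and the function name are assumptions made for the example, and the OSM wiki describes further special cases that are not handled here.

```python
def roles_are_consistent(members):
    """Validate the member roles of a turn restriction.

    `members` is a list of (element_type, role) tuples, for example
    ("way", "from"). Accepted layouts: exactly one 'from' way, exactly
    one 'to' way, and either a single 'via' node or one or more 'via' ways.
    """
    from_ways = [m for m in members if m == ("way", "from")]
    to_ways = [m for m in members if m == ("way", "to")]
    via_types = [etype for etype, role in members if role == "via"]
    if len(from_ways) != 1 or len(to_ways) != 1 or not via_types:
        return False
    via_is_single_node = via_types == ["node"]
    via_is_only_ways = all(etype == "way" for etype in via_types)
    return via_is_single_node or via_is_only_ways


no_left_turn = [("way", "from"), ("node", "via"), ("way", "to")]
no_u_turn = [("way", "from"), ("way", "via"), ("way", "to")]
```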

Workflow for adding turn restrictions

The traditional way

This method uses the relation editor built into JOSM. A slight disadvantage is that you spend a bit more time constructing the relation manually. Click on the image below for a how-to video.

traditional_way_vid

The user-friendly way

This method uses the turn restrictions plugin, which automatically recognizes the type of relation and the role of each member. Click on the image below for a how-to video.

user_friendly_vid

Using the aforementioned tools, we reviewed 2,000 miles of field trip footage and added nearly 2,500 turn restrictions in the LA/Orange County area. Of the turn restrictions added to the map, 85% are no_u_turn restrictions, followed by 11% no_left_turn restrictions; the rest fall into the other categories.

Hopefully we’ve managed to illustrate how easy it is to map turn restrictions in OSM. Now, it’s your turn!


How we imported Administrative Boundaries for Mexico from INEGI

The INEGI boundaries import project focuses on importing, in a community-monitored process, the national, state, municipal and sub-municipal level divisions present in the MGN published by INEGI.

One of the current problems in OSM regarding Mexico’s data is the incompleteness of the administrative boundaries for municipalities. Municipalities are the second-level administrative division in Mexico, the first being the state. There are 2456 municipalities, including the ones in Mexico City which are also a second-level division just with a different name – delegations.

The main goal of this process is to enhance the current OSM administrative division coverage of Mexico with open data made available by the government at the end of 2014.

Import Process

The following steps describe the entire workflow we followed to import the boundary data.

  • Step 0 – Reprojection of INEGI dataset

Before any other step, the data released by INEGI has to be reprojected from ITRF92 to WGS84 (EPSG:4326) using QGIS, and saved as a .shp file. It is important to mention that the boundary geometries are not simplified at this or any of the subsequent steps, since they are official government data.

  • Step 1 – Conversion to OSM data

Download the boundary relation of the state of interest from OSM and save it as an .osm file. In QGIS, using Vector > Research Tools > Select by Location, select the INEGI municipality boundaries that are within the area of interest, in this case Quintana Roo state, and export the selection as a .shp file.

Municipalities in Quintana Roo, as polygons.

The exported features will be polygons. In order to process them, they must be converted to lines in QGIS using the Polygons to Lines option, available in Vector > Geometry tools. Visually, the output will look the same as when the municipalities were polygons.

Municipalities in Quintana Roo, as lines.

Using ogr2osm, the .shp file containing the boundaries as lines is converted into an .osm file.

Before moving forward, the resulting .osm file has to be modified a bit. Using Notepad++, open the file and search and replace <nd ref=' with <nd ref='- and <node id=' with <node id='-, so that every node id and node reference in the file is negative.

The negative ids matter because they tell JOSM that this is new data, not yet added to the map.
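
The same search and replace can be scripted. The Python sketch below is one possible way to do it with a regular expression; it is not part of the original workflow, and the function name and the sample XML are made up for illustration. It touches only the two attributes mentioned above and leaves ids that are already negative alone, so running it twice is harmless.

```python
import re

# Matches <node id=' or <nd ref=' (single or double quotes) followed by a digit,
# so values that already start with '-' are not matched.
_POSITIVE_ID = re.compile(r"""(<node\s+id=|<nd\s+ref=)(['"])(\d)""")


def negate_node_ids(osm_xml):
    """Prefix positive node ids and node references with a minus sign."""
    return _POSITIVE_ID.sub(r"\1\2-\3", osm_xml)


# Tiny made-up fragment in the shape of an .osm file.
sample = "<node id='1' lat='18.5' lon='-88.3'/><way id='5'><nd ref='1'/></way>"
converted = negate_node_ids(sample)
```

Note that the way id in the sample is left unchanged, because the manual step described above only rewrites the node id and nd ref attributes.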

Next, the .osm file can be converted to an .osm.pbf file using osmosis.

  • Step 2 – Processing

We load the .osm.pbf file from the previous step into an internal tool called Mexico Split. The tool is designed to eliminate duplicate/overlapping ways by detaching them from their parent polygons and replacing them with a single way shared by the two polygons involved.

mxsplit3
Detects overlapping ways and replaces them with a single common way.
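
Mexico Split is an internal tool and its code is not shown here. Purely to illustrate the kind of detection involved, the Python sketch below finds the segments that appear in more than one polygon ring. It assumes, as a simplification, that neighbouring polygons already reference the same node ids along their common border; the data layout and all names are hypothetical.

```python
from collections import defaultdict


def shared_segments(polygons):
    """Return {(node_a, node_b): [polygon names]} for segments used by 2+ polygons.

    `polygons` maps a polygon name to a closed ring of node ids,
    e.g. [1, 2, 3, 4, 1]. Segment direction is ignored.
    """
    owners = defaultdict(set)
    for name, ring in polygons.items():
        for a, b in zip(ring, ring[1:]):
            if a != b:  # skip degenerate zero-length segments
                owners[frozenset((a, b))].add(name)
    return {
        tuple(sorted(segment)): sorted(names)
        for segment, names in owners.items()
        if len(names) > 1
    }


# Two hypothetical squares that share the edge between nodes 2 and 3.
rings = {"A": [1, 2, 3, 4, 1], "B": [2, 5, 6, 3, 2]}
common = shared_segments(rings)
```

Each shared segment found this way is a candidate for being stored once and referenced by both polygons.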

Besides this main purpose, the tool also splits any resulting way longer than 2000 segments into shorter ways, groups the ways into relations according to the borders they define, and adds some predefined tags to these ways and relations.
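
How the tool chooses its split points is not described, so the following Python sketch is only one plausible way to split a long way. A way with n nodes has n - 1 segments, and consecutive parts share their boundary node so the line stays connected. The function name and the list-of-node-ids representation are assumptions.

```python
def split_way(node_ids, max_segments=2000):
    """Split a way (list of node ids) into parts of at most `max_segments` segments."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    segment_count = len(node_ids) - 1
    if segment_count <= max_segments:
        return [list(node_ids)]
    parts = []
    for start in range(0, segment_count, max_segments):
        # +1 because a part with k segments needs k + 1 nodes.
        parts.append(list(node_ids[start:start + max_segments + 1]))
    return parts


# A synthetic way with 4501 nodes, i.e. 4500 segments.
long_way = list(range(4501))
parts = split_way(long_way)
segments_per_part = [len(part) - 1 for part in parts]
```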

Tags added to relations:

type=boundary

INEGI:MUNID=<value_from_the_original_polygon>

name=<value_from_the_original_polygon>

Tags added to both ways and relations:

boundary=administrative

admin_level=6

source=INEGI, MGN 2014 v6.2
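
Written as data, the two tag sets above look roughly like this. The Python sketch only restates the listed tags; the function names are made up, and the municipality id in the example is a placeholder rather than a real INEGI value.

```python
# Tags applied to both ways and relations, as listed above.
SHARED_TAGS = {
    "boundary": "administrative",
    "admin_level": "6",
    "source": "INEGI, MGN 2014 v6.2",
}


def way_tags():
    """Tags for a boundary way."""
    return dict(SHARED_TAGS)


def relation_tags(mun_id, name):
    """Tags for a municipality relation; values come from the original polygon."""
    tags = {"type": "boundary", "INEGI:MUNID": mun_id, "name": name}
    tags.update(SHARED_TAGS)
    return tags


# "00000" is a placeholder id, not the real identifier of any municipality.
example_relation_tags = relation_tags("00000", "Bacalar")
```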

For example, data for Bacalar municipality contains the following information:

Bacalar municipality in Quintana Roo state.

  • Step 3 – Backup and metrics of existing OSM data

Before the import, we backed up the current OSM data for the regions that were going to be affected, using the Overpass API. We also recorded tag-related metrics (source, population, admin_center, admin_label, wikipedia etc.) in order to have an overview of the newly added information.
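
The exact queries used for the backup are not given. As a sketch of what such a step could look like, the Python below builds an Overpass QL query for the existing admin_level=6 boundary relations inside a state and counts how many elements carry each of the keys listed above. Selecting the state by its name tag, the function names and the sample data are all assumptions.

```python
from collections import Counter

# Keys taken from the list in the text above.
METRIC_KEYS = ("source", "population", "admin_center", "admin_label", "wikipedia")


def backup_query(state_name, timeout=300):
    """Build an Overpass QL query for municipality boundaries within a state."""
    return (
        f'[out:xml][timeout:{timeout}];\n'
        f'area["name"="{state_name}"]["admin_level"="4"]->.state;\n'
        'relation["boundary"="administrative"]["admin_level"="6"](area.state);\n'
        '(._;>;);\n'  # also fetch the member ways and their nodes
        'out meta;'
    )


def tag_metrics(elements):
    """Count how many tag dictionaries contain each key of interest."""
    counts = Counter()
    for tags in elements:
        for key in METRIC_KEYS:
            if key in tags:
                counts[key] += 1
    return counts


query = backup_query("Quintana Roo")
# Made-up tag dictionaries standing in for downloaded elements.
metrics = tag_metrics([{"source": "x", "population": "1"}, {"source": "y"}])
```

The query string can be pasted into an Overpass front end or sent to an Overpass API endpoint; the sketch deliberately stops short of making the network request.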

  • Step 4 – Delete existing data from OSM and upload fresh data

In some cases the states already have some information regarding municipality boundaries (admin_level=6). These will be deleted, but before deletion we look at all the features and relations, to get a clear picture of what we should put back into the map data after the import.

Next, we upload the municipalities on a state by state basis.

  • Step 5 – Clean/verify the newly added data

This is a very important step because we verify the data that we’ve uploaded to make sure that there are no errors and manually re-link the admin_level=6 relations to the admin_level=4 boundaries, where required. Any other manual corrections are done at this step.

output_mSaeg1
An example of the newly added municipality boundaries in Tabasco.

To ease the process of importing the municipality boundaries, we use the Mexico Import Map paint style for JOSM. It highlights the last node of every way, making it simple to see the length of every way.

Map styles – JOSM default vs. Mexico Import.

The square node marker is also partially transparent, so we can see whether another node lies underneath it. This lets us work systematically: duplicated nodes are quick to spot, and so is the difference between admin_level=4 and admin_level=6 boundaries.
