Working with ImproveOSM Data Dumps

Our ImproveOSM pipeline produces a pretty impressive number of suggested roads missing from OSM, missing oneway tags, and missing turn restrictions, based on analysis of billions of GPS data points. We make the results available as frequent data dumps in CSV format. In this post, I want to look at a way to integrate this data into your OSM mapping workflow.

If you just want to see ImproveOSM data in JOSM wherever you are currently mapping, you can just use the ImproveOSM JOSM plugin. For advanced users who want more flexibility, or who want to use this data in different ways, this post offers some guidance.

The data dumps are available from here. For this example, I will work with the most recent Direction of Flow data file. This highlights ways with potential missing oneway tag. After downloading and unzipping it, you will have a CSV file of about 16.5 megabytes that looks like this:

148617028;1867720648;89191396;99.5378927911275;SOLVED;THROUGHWAY;LINESTRING(2.217821 48.922613,2.217719 48.922618,2.217408 48.922633);1082
33555379;322840377;322840383;98.6301369863014;INVALID;LOCAL_ROAD;LINESTRING(4.999815 47.34294,4.999957 47.343062,4.999965 47.34315);146
17271190;178942503;2341050872;100;OPEN;LOCAL_ROAD;LINESTRING(11.070503 50.139245,11.070525 50.139213,11.070616 50.139099,11.070693 50.139032);74

Since the theGeom field is in WKT, you can import it as a layer in QGIS pretty easily. Let’s fire up QGIS (I use 2.18) and add a Delimited Text layer.

In the dialog, select the downloaded CSV file as the file source. Set the delimiter to semicolon. QGIS detected for me that the geometry was in the theGeom field, and of type WKT, but you can set that manually if needed:

Upon clicking OK, QGIS wants us to define which CRS the coordinates are defined in. Select WGS84.

Now, we have a layer of line geometries that correspond to OSM ways that may be missing a oneway tag.

To make the file more manageable, let’s limit our selection to one country. I get country boundaries from Natural Earth (a fantastic resource!). After adding the country borders to QGIS, I can perform a spatial query. Before you do this, select the country you are interested in. I pick Mexico as an example.

Bring up the Spatial Query window. If you don’t see this menu item, you will need to enable the Spatial Query plugin.

Select the ImproveOSM layer as the source, and the Natural Earth layer as the query layer. Make sure to check the ‘1 Selected geometries’ checkbox, so we limit our query to Mexico.

The matching features will now be selected in the ImproveOSM layer. Make sure that layer is selected in the Layers Panel before you select Layer -> Save As.. from the QGIS menu. In this dialog, choose GeoJSON as the output type. Select a destination filename. Make sure that the CRS is set to WGS84. Make sure the ‘Save only selected features’ is checked, and Save.

Now you have a GeoJSON file with all OSM way geometries that may need a oneway tag. You can load this file into JOSM, using its GeoJSON plugin. To organize your work going through these, I would recommend using the Todo plugin and add the GeoJSON features to the todo list.


Detecting Traffic Signs in OpenStreetCam

OpenStreetCam’s mission is to help you improve OSM with street-view imagery. Photos taken with regular smartphones seem to be good enough for capturing map features like traffic signs, lanes or crosswalks. However, browsing the 120 million+ photos in OSC to find relevant things to map will take a while. The human factor is fundamental to OSM’s culture and we don’t see that changing, but we want to make editing street related attributes more efficient with automation.

We’re happy to announce a beta release of the traffic signs recognition on OpenStreetCam photos, made possible with machine learning. We processed a few million photos and detected around 500.000 traffic signs so far, currently available for tracks in several areas in United States and Canada. We’re working on extending the training sets and optimize the processing so that the area’s soon expanded.

What’s new from a user perspective: the track page on will now show detected traffic signs when available:

There’s a preview list of all detections in the track, detection overlays on photos and, of course, filters. Filters might now get a rep as something really exciting, but we’re excited about one of ours: the OSM status. Here’s why: after detecting a sign we compare it to the corresponding OSM feature and check if they’re consistent. Based on that, filtering is available.

For a practical example, let’s take speed limits: Instead of manually cross checking every detection with the maxspeed tag in OSM, one can only review detections where presumably maxspeed is not set or the value’s different in OSM. Just tick the Need review in OSM box.

Here are a few more examples of trips that have already been processed with our sign detections.

What’s next?

We’re busy working on a few things:

  • Scale the training sets and pipeline to extend the supported areas.
  • Traffic signs integration in the JOSM plugin.
  • Tagging new traffic signs support in the webpage.

If you like what we do and want to help:

  • First and foremost, you can use detections to improve OSM. If you’re seeing detections on tracks check them out, see what needs reviewing in OSM and edit. You can open iD or JOSM to photo’s location straight from the webpage.
  • Help us improve the traffic signs recognition. There’s a chance you will find some bad detections. You can review them and flag whether they’re good or bad, see the two buttons above the photo. We’re adding those reviews to training sets to improve recognitions, so please play nice.
  • Help us add these detections to the iD editor as well.

Tip: you can navigate between detections with Ctrl/Cmd + right/left arrows and confirm/invalidate with Ctrl/Cmd + up/down arrows. Goes pretty fast.


Making OSM navigation ready in Phoenix, AZ

GPS technology is growing fast along with the mapping industry. There are a lot of navigation apps on the market, trying to step forward using different technologies and map data. As you know, behind all its routing processes and algorithms, a good routing application must have a good quality map.

Our team, Telenav, is developing an Open-Street-Map (OSM) based app, called Scout. Why Open-Street-Map? In my opinion OSM is a good choice because it’s an open, fast growing and well built-up community map. With its great involvement and commitment, the community is constantly trying to keep all the map features from OSM up to date.

For the purpose mentioned above, Telenav decided to improve the quality of the OSM map features, used for navigation, in Phoenix, Arizona.

From September until now, the whole project workflow was realized in 3 main steps:

Step #1. Research and get in touch with local community (research on local open source data, keep in touch with local OSM users, official traffic signs and driving rules, research on HOV and toll roads, roads under construction, other specific road features of Arizona State).

Step #2. Build-up process:

  1. Road geometry, Road Name, One Way, Gates and Other Geometric Feature (turning circles, turning loops, etc.)
  2. Signpost
  3. Turn Restrictions and Traffic Lights
  4. Lane and Turn Lane Info, including HOV Lanes and Toll Roads
  5. Speed Limit Editing

Step #3. Quality Assurance (QA for every feature edited by our team in OSM, some road name special issues, repairing broken relations, final edits and corrections based on Improve OSM plug-in, Osmose and Keep Right errors, tile-by-tile verification).

The first step was a very important one because the team had to get in touch with the local community and mapping rules. Also, we researched other open source and free of charge data beside the already well-known sources used by OSM users (TIGER Roads, Bing, Digital Globe, Open-Street-Cam, Mapillary). The under-construction roads found in this step were permanently monitored, to be edited later.

The second step was the most time consuming and it included the editing and reviewing of all the most important navigable OSM features, starting from motorway, trunk, primary, secondary, tertiary, residential and service ways. The whole editing process lasted for about 3 moths and the workflow was mainly based on a tile-by-tile method. We also edited based on way category or route reference. During this period, we helped the community to increase the OSM quality as you can see below:

The third step was the last but not the least important. Because we wanted to have good quality features in OSM, we had to make a closure check on our edits, already made in the build-up process and where it was necessary, to fill the gaps. So, we used different QA procedures, queries, other plug-ins, error identifiers and even tile by tile verification.

For the queries based QA, we ran some predefined scripts based on OSM datasets, using pgAdmin and PostgreSQL. The queries were aimed mainly at finding:

  • Wrong one-way direction
  • Wrong number of lanes
  • Untagged roads
  • Long relation members
  • Turn Restrictions with unusual number of members
  • Ramp has name or reference number
  • Similar names and destinations
  • Detect possible roundabouts, etc.

The main JOSM plug-in used to complete the missing roads was Telenav ImproveOSM (also available on The plug-in focuses especially on missing roads, one ways and turn restrictions.

Also, a good source of errors to be checked can be found on these error identifier sites: and

Doing the QA, we had an important overview of all our edits, especially of our wrong ones. In this step, we resolved also a name tag issue we’d came across during step #2. Step #3 represents confidence of a good quality editing process for the Telenav team.

When we needed a double check of our work, the community was a good help, giving us important feedback. For the next months, we’ll surely keep in touch continuously with the local OSM users, looking forward for their feedback to keep up a good maintenance work. The top editing users were the most helpful users, giving us precious and pertinent feedbacks.

In the GIF below you can see an evolution of Telenav mapping team edits during the last moths, in Phoenix, Arizona. Darker colors indicate high density and brighter colors low density.


Turn restrictions editing in Phoenix, AZ

Looking from a map analyst point of view, turn restrictions are some of the most important features a map can have. Turn restrictions influence a lot the way a route is made. If they are wrongly edited, they can cause bad routing, having big consequences on travel time, travel directions, maneuvers and so on. That’s why we decided to talk a little bit more about this case: editing turn restrictions.

We worked on this issue for 3 months, starting from November 2017. We succeeded to review a large amount of intersections between motorway, trunk, primary, secondary, tertiary, residential and service ways from Phoenix area.

During our project we accomplished:

  • to add new turn restrictions
  • to correct some of the damaged ones
  • to remove some of the wrong ones.

The main sources of adding or editing turn restrictions were open-source:

  • Satellite imagery: Digital Globe, Bing, Esri World Imagery
  • Street level imagery: Open-Street-Cam (OSC), Mapillary.

The software used in editing OSM was JOSM, an extensible editor for Open-Street-Map for Java 8. Also for a better visualization we used different Map Paint Styles (MapCSS) and thematic Layers (Open-Street-Cam, Mapillary, Mapillary object layer).

With a self-developed procedure, we identified all that could be a turn restriction sign in all the OSC track photos, overlapping our area. In the first part of our project we looked over 1723 turn restriction signs identified in Open-Street-Cam, spatially distributed as in map below. Before adding any turn restriction, every detection or sign found in Open-Street-Cam was first validated based on the open sources first mentioned.

In the second part, we managed to identify turn restriction signs also based on Mapillary Imagery and Mapillary Object Layer.

All the reviewed turn restriction signs summed in the end 3289, from which 350 turned to be missing from Open-Street-Map. During this task we also reviewed 11932 and added 376 new traffic lights.

The final step for our task was to do some QA testing. Firstly, we used a query based QA to verify the quality of all turn restrictions from OSM Phoenix area. For this, we ran some predefined scripts based on OSM datasets, using pgAdmin.

The queries aimed mainly:

  • Long relation members
  • Turn Restrictions with unusual number of members
  • “Odd” tagging in existing turn restrictions
  • Conditional turn restrictions with old tagging scheme

Example of query used to identify unusual turn restrictions from OSM:

We wanted to make sure we’re covered all turn restrictions. So, we used the Telenav ImproveOSM plug-in in JOSM. The plug-in highlights the possible turn restrictions locations based on road geometries and other probe data.

These kind of errors identified by us during the query based QA can also be found on and sites. These two sites contain datasets with all kinds of errors from OSM. The datasets were downloaded, extracted using some Python scripts, clipped after our bounding box and filtered using some SQL queries in pgAdmin. The results were exported and compared with the initial errors resulted in the first QA step and where it was necessary, the turn restrictions were modified or corrected.

The heat map below presents all the turn restrictions edited by the Telenav team, from November until now, in Phoenix. Darker colors indicate high density and brighter colors low density.


New Features and Enhancements in Cygnus+

Cygnus is the Telenav Mapping conflation tool. We use it a lot internally to compare approved external data sources with existing OSM data, but there is also a public version. We outlined how it works in an earlier blog post. In this post, I want to highlight some of the newer features in Cygnus. These new features are based on the feedback from our team of Map Analysts, who use the tool in their day to day work.

Discarding Very Short Segments

Cygnus outputs the differences in geometry between existing OSM data and the spatial data that we want to use to improve OSM. Sometimes, when the differences are very tiny, Cygnus used to export very short ways. These are not really meaningful enhancements, and clutter up the result data. Therefore, we implemented a length filter. Ways shorter than a defined length threshold will not be included in the output. Based on experience, we set the default to 5 meters. In the internal (command line) version our team uses, this can be tweaked using a parameter. In the public web version, this is not yet possible. We can consider adding it if there is sufficient demand.

An example of Cygnus in action. It finds an opportunity for improvement (possibly incorrect street name) as well as a false positive (degraded road geometry)

Road Names

When comparing road geometry, Cygnus not only compares geometry, but also road names. An annoying side effect we noticed is that road names are often not exactly the same in OSM as they are in the external data we compare with. This does not mean that the external data is necessarily better. For example, OSM could say that the name of a road is “River Road”, and the external data source could say it is “River Rd”. This is not a meaningful difference, and we would want to exclude those in most cases. So we added a string distance based  threshold in Cygnus to filter out similar strings. It is set to a sensible default which, again, can be tweaked in the command line version we use internally, but not yet in the web version.

Another Cygnus improvement related to road names is to ignore name differences on certain types of ways: roundabouts and service roads. Roundabout ways in OSM do not have names by convention, unless the roundabout itself has a name, so they should generally not be added. Service roads technically can have names in OSM, but it is not common. In external data, they do sometimes have names, but if they do, it usually does not make sense to add them to OSM. Based on our experience, they often have descriptive names like ‘driveway’ or ‘access road’ in the source data.

Using Cygnus

You can use Cygnus yourself by going to and uploading your source data file. You need to do a fair amount of work to prepare the source data: translating the source attributes into valid OSM tags, and converting to OSM PBF. And always remember to consider carefully what you do with the result. Cygnus is not designed to be an automated import tool. Every suggested change should be manually reviewed.

Let us know how you have used, or would like to use Cygnus!


Cygnus – conflation at your fingertips!

This is a follow-up blogpost after the State Of the Map US 2017 conference held in Denver.

The process of conflation in GIS is defined as the act of merging two data layers to create one layer containing the features and attributes of both original layers.

Cygnus is a tool that compares external data with OSM, giving you a result file in JOSM XML format with all the changes. The comparison is made in a non-destructive way, so no OSM ways are ever deleted or degraded.


NOTE – The license compatibility between the local data file and OSM has to be taken into account before adding anything in OSM. Also, please follow the OSM import procedures if you are planning to add external data to OSM.

First of all you need to have a shapefile with local data in WGS84 spatial reference. This shapefile has to be filtered in different ways, depending on the tags you want to compare. For example, if you want to compare oneways, make sure to have a flow-direction/oneway/etc. attribute in the shapefile.


The first thing that has to be taken care of is to assure a proper attribute translation. I created a simple example for this exercise. I don’t want to get neck-deep in too many technical details so the main focus remains the process as a whole. I kept the attribute information for this example straightforward:

In order to create an OSM file from this data, I wrote a simple translation file that will be used together with ogr2osm.

Next, run the below command to obtain the OSM file.

python simple_streets.shp -t -o simple_output.osm

Finally, I converted the OSM file to PBF using osmosis, because Cygnus requires a PBF file as input.

Cygnus goes to work!

Now that you have gone through the pre-processing of the local data file, we can offer it to Cygnus for processing. Note that your upload needs to be small-ish – the spatial extent needs to be smaller than 50×50 km and the file needs to be 20MB or smaller in size.

The interface of the Cygnus service is very simple – there are just two pages:

  • the home page where you add new jobs
  • the job queue page where you can see your progress and download the result

If your input file was uploaded successfully, Cygnus will go to work. Your job will be added to the back of the queue. When it’s your turn, Cygnus will read your PBF file, and download the OSM data for the same extent, using Overpass API. It will then compare your upload with the existing OSM data and produce the output file that you can download from the job queue.

NOTE – Everyone’s jobs are listed here, so be careful not to touch other users’ stuff.

Process the output in JOSM

Once Cygnus gives us the output, we can open it in JOSM and inspect it. This is by far the most important, and time consumig, step. Even though Cygnus does a best effort to connect ways where needed, it acts conservatively so it will not snap ways together that do not belong together.

Here are a few ways that got properly connected to the existing highway=secondary:

But there are situations where the distance was too far so Cygnus did not snap:

In this case, you need to manually connect the ways if that is appropriate.

When you are finally satisfied with your manually post-processed conflation result, you can go ahead and merge it with the OSM data and upload it!


Telenav presence in State of the Map Latam 2017

It has been only two years since the Latin American OSM community decided it was necessary to have a regional event. The first event took place in Santiago the Chile and the second one in Sao Paulo, Brazil. At the end of the second edition there were a few places considering to organize the third edition of the event. When I found out Lima was chosen I was very happy. Lima is one of the cities you never get tired to visit and you always discover new things, specially new amazing food 😉

This year Telenav had the chance to be sponsor. Every State of the Map is a gathering of different efforts that combining together keeps improving the data of the largest geospatial database in the world. This sponsor supported the attendance of Juliana Hernández (Peasent mappers- Tools for learn, teach and strengthen cartography) and Daniel Quisbert (Installing and configuring Offline OSM in Linux).  Both conferences where highly appreciated by the attendees.  Also Mapbox and Telenav co-hosted the first anniversary of Geochicas, it was a lovely evening in an historic bar from Lima in the one new geochicas from Perú and Colombia joined the group. We expect more companies collaborating together to reduce the gender gap in OSM. There are many ideas for what Geochicas will be doing in 2018, please check the full information here.

The keynote was given by Philipp Kandal (A journey with OSM from the past into the Future) In the one Philipp shared his experiences over the years with OSM data and how probe data and using machine learning tools the map its being improved in a tremendous way.

From my side I had  the chance to show the mapping efforts in Canada and the preliminary mapping results from Ecuador, a pilot project we are currently developing to improve road data in one LATAM country. By using Satellite imagery and OpenStreetCam data we started improving the OSM data in roads since September this year. Find the slides here. So far the results we have are the following:

Stay tuned for the updated results in the following months!

This is just the beginning of future collaborations Telenav would like to keep doing with Latam communities by providing tools and guidance in previous projects we have done.

Thanks to the organization team and attendees of SOTM LATAM 2017, see you in other place in Latam in 2018!


Fire up the editors: ImproveOSM updated with many new things to fix in OSM

Our OSM team continually processes billions of anonymized GPS traces we receive through the Scout app and partners, in order to discover things potentially wrong or missing in OSM. We call this effort ImproveOSM, and it  is a big part of Telenav’s overall mission to keep making OSM even better.

Missing Roads in Northern Brazil. The denser the GPS point cloud, the more trips and the more likely you are helping people get around more accurately!

Our most recent update to ImproveOSM was a particularly big one. In the last month, we added:

  • 133 thousand missing roads tiles
    • Another 75 thousand tiles that are likely parking areas or tracks
    • Another 670 thousand (!) water tiles (see below)
  • 300 thousand suspected turn restrictions with over 50% high confidence

Using ImproveOSM data

Perhaps you have not looked at ImproveOSM data before. It is available through the ImproveOSM web site, which is based on the iD editor. The screenshots on this page are from that web site. If you know how to edit with iD, you will find it easy to work with ImproveOSM data and use it to edit OSM. We wrote a post that goes into more detail a little while ago.

If you prefer JOSM, we have created an ImproveOSM JOSM plugin as well. it works similar to the web site: you choose what ImproveOSM data you want to see (suspected missing roads, suspected wrong one-way roads, or suspected missing turn restrictions, or all of the above!) and the plugin will show you the ImproveOSM data as a separate layer. We also have a blog post about using the JOSM plugin.

Finally, a few interesting / funny examples of ImproveOSM data around the world.

ImproveOSM data points out that a new road alignment is now in use. Aerial imagery and OSM have not been updated yet. This is in northern Sweden.

Here, we stumble upon an undermapped town north of Surat, India. Of course, there are un- and undermapped areas everywhere in the world, but the ImproveOSM data shows that there are people driving around on these streets using a GPS enabled app or vehicle — people who would benefit from better OSM data in their everyday lives. It is not hard to find places like this around the world.

Finally, an animation showing clusters of ‘water’ tiles. This is a side effect of the partner data we process. Since it’s anonymized there is no way to say anything about why these traces exist. Useful for OSM? Perhaps.. Interesting? I think so!

Are you finding interesting, useful, funny or wrong data in ImproveOSM? Let us know! Happy Mapping!


Is OpenStreetMap Big Data ready?

This article was written by Adrian Bona as a draft for a talk at State of the Map US in Boulder, Colorado this past month. The talk did not make it into the program, but the technology lives on as a central part of our OpenStreetMap technology stack here at Telenav. We will continue to deliver weekly Parquet files of OSM data. Adrian has recently moved on from Telenav, but our OSM team is looking forward to hearing from you about this topic! — Martijn

Getting started with OpenStreetMap at large scale (the entire planet) can be painful. A few years ago we were a bit intrigued to see people waiting hours or even days to get a piece of OSM imported in PostgreSQL on huge machines. But we said OK … this is not Big Data.Meanwhile, we started to work on various geo-spatial analyses involving technologies from a Big Data stack, where OSM was used and we were again intrigued as the regular way to handle the OSM data was to run osmosis over the huge PBF planet file and dump some CSV files for various scenarios. Even if this works, it’s sub-optimal, and so we wrote an OSM converter to a big data friendly columnar format called Parquet.The converter is available at, this will make the valuable work of so many OSM contributors easily available for the Big Data world.

How fast?

Less than a minute for romania-latest.osm.pbf and ~3 hours (on a decent laptop with SSD) for the planet-latest.osm.pbf.

Getting started with Apache Spark and OpenStreetMap

The converter mentioned above takes one file and not only converts the data but also splits it in three files, one for each OSM entity type – each file basically represents a collection of structured data (a table). The schemas of the tables are the following:

 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- latitude: double
 |-- longitude: double

 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- nodes: array
 |    |-- element: struct
 |    |    |-- index: integer
 |    |    |-- nodeId: long

 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- members: array
 |    |-- element: struct
 |    |    |-- id: long
 |    |    |-- role: string
 |    |    |-- type: string

Now, loading the data in Apache Spark becomes extremely convenient:

val nodeDF ="romania-latest.osm.pbf.node.parquet")

val wayDF ="romania-latest.osm.pbf.way.parquet")

val relationDF ="romania-latest.osm.pbf.relation.parquet")

From this point on, the Spark world opens and we could either play around with DataFrames or use the beloved SQL that we all know. Lets consider the following task:

For the most active OSM contributors, highlight the distribution of their work over time.

The DataFrames API solution looks like:

val nodeDF = nodeDF
    .withColumn("created_at", ($"timestamp" / 1000).cast(TimestampType))

val top10Users = nodeDF.groupBy("user_sid")
    .map({ case Row(user_sid: String, _) => user_sid })
nodeDF.filter($"user_sid".in(top10Users: _*))
    .groupBy($"user_sid", year($"created_at").as("year"))

The Spark SQL solution looks like:

    year(created_at)) as year,
    count(*) as node_count
    user_sid in (
        select user_sid from (
                count(*) as c 
            group by 
            order by 
                c desc 
            limit 10
group by 
order by 

Both solutions are equivalent, and give the following results:

alt tag

Even if we touched only a tiny piece of OSM, there is nothing to stop us from analyzing and getting valuable insights from it, in scalable way.

If you are curious about more advanced interaction between OpenStreetMap and Apache Spark, take a look at this databricks notebook.

OpenStreetMap Parquet files for the entire planet?

Telenav is happy to announce weekly releases of OpenStreetMap Parquet files for the entire planet at


New ImproveOSM tiles are ready to be used!

New ImproveOSM missing road tiles are available! The new data is very helpful as they can help you to target the missing roads, add them to OSM and thus greatly improving the map.

Worldwide, there are 113048 new road tiles.  The countries with the highest number of tiles are: Russia – 38669 tiles, United Kingdom – 8890 tiles, Kazakhstan – 10993 tiles, India –  9418 tiles and the United States- 7560 tiles (see graph below). There are few new tiles in Detroit too so that you are welcome to give us a hand with them! You can find more information about our work in Detroit on our blog (