Innovation and flexibility in the OpenStreetMap project

July 31, 2015 by chris

One of the most attractive aspects of the OpenStreetMap project for contributors is probably that anyone can enter anything into the OpenStreetMap database as long as it is observable and verifiable on the ground. What makes this appealing is that a contributor does not have to subscribe to any agenda serving particular interests.

Incidentally, this very same principle usually makes cartographers and geographers with a conventional background quite skeptical about the project. They often think this can’t work and – as i will show in the following – they are not entirely wrong about that. Here i will take a critical look at this basic promise of openness and at the mechanisms that limit this principle in reality.

The freedom to map everything

As is frequently said, a freedom is only of use if it is actually exercised, so let’s look at whether OSM mappers actually make use of the freedom to map anything verifiable. At first glance this seems to be the case: according to taginfo there are more than 50000 keys and more than 70 million different tags. Even if you discount the many individual values for name tags etc. and take into account the countless typos, errors and plain nonsense data that exist in the database, there is still an overwhelming number of different and meaningful things mapped in OpenStreetMap. But that is not the end of the story.

OSM’s most popular tags from taginfo

You also need to look at things from a different angle, that is: what does the large mass of data in OpenStreetMap actually represent? Looking again at taginfo you can find out that more than 85% of the ways in the OSM database have one of the following 20 primary tags:

building=yes, highway=residential, highway=service, building=house, highway=track, waterway=stream, highway=unclassified, natural=water, highway=footway, highway=tertiary, highway=path, building=residential, natural=wood, landuse=forest, landuse=residential, highway=secondary, landuse=grass, highway=primary, landuse=farmland, building=garage
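
The kind of figure quoted above can be reproduced from per-tag counts. The following is only a minimal sketch: the function name `top_tag_share` and all counts are made up for illustration (they are not taginfo numbers), and it assumes each way is counted once under a single primary tag, which is a simplification.

```python
from collections import Counter


def top_tag_share(primary_tag_counts, total_ways, n=20):
    """Return the n most common primary tags and the fraction of all ways they cover.

    Assumes every way is counted under exactly one primary tag.
    """
    top = Counter(primary_tag_counts).most_common(n)
    covered = sum(count for _, count in top)
    return [tag for tag, _ in top], covered / total_ways


# Invented toy counts, NOT real taginfo data.
sample_counts = {
    "building=yes": 500,
    "highway=residential": 200,
    "highway=service": 100,
    "natural=water": 50,
    "leisure=pitch": 30,
    "barrier=fence": 20,
}
total = 1000  # hypothetical total, including 100 ways with other or no primary tags

tags, share = top_tag_share(sample_counts, total, n=3)
print(tags, f"{share:.0%}")
```

With real data the counts would come from taginfo, and ways carrying several primary tags would need to be handled so they are not counted twice.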

(You might ask why i looked at ways and not nodes here – the answer is nodes are much more difficult because the vast majority of the nodes have no tags at all and of the ~100 million tagged nodes about half have a source or created_by tag. This makes meaningful analysis more difficult. Overall the largest part of meaningfully tagged nodes are address nodes with addr:* tags.)

There is an old saying that the “street” in OpenStreetMap is very misleading since the project is not primarily about mapping streets. That is correct: calling it open street/building/address map would be much more accurate.

Of course the mentioned 85% do not represent the actual prevalence of such features in reality. But this is completely natural: people who are free to map anything they like do not indiscriminately map what they see, they map what they consider important. People map the buildings they live and work in and they map the roads they use to move around. But again this is not the whole story.

What motivates mappers

Mappers are not only motivated by their own genuine considerations of what is important to them, they also participate in OSM because they want to share their knowledge with others. And this sharing and communication happens through rendered maps. I am focusing on maps here although there are of course other uses of OSM data that play a role, especially routing and geocoding. But maps are still the most important application and, as you can see, the 20 tags above and many of the other widely used ones are actually widely used for rendering maps.

Maps – and this is widely known and frequently discussed within the OSM community – have a huge influence on the inner dynamics of the OpenStreetMap project. Most often this is discussed with regard to the OSM standard style that is featured on the OSM website and that, for most people involved in the OSM project, is the primary visualization of the database content. The two things that influenced me most to write this text were also directly related to this map style. The first was the suggestion by Andy Allan that new features should already be rendered in another OSM map before they are added to the standard map style. The second was that a change in mapping practice introduced about 2.5 years ago – the way the Antarctic inland ice is represented in the OSM database – is now finally scheduled to be included and properly rendered in the standard map style. Another thing that played into this was the recent discussion on the merits and ethics of remote mapping in OpenStreetMap.

None of these things are big events or questions of fundamental importance but they helped me better understand the inner workings and dynamics of the OSM project.

Community mapping in the OpenStreetMap project and rendering of OSM-based maps form a closed loop system frequently called the mapper feedback loop. Often this is considered primarily for the standard style, but the various popular OSM map styles are actually remarkably similar in what data they show, how they show it and at what zoom levels. Of course there are specialized maps and map overlays, but the base maps that depict the vast majority of the data (the 85% above plus at least another 10% from the most frequent tags) are very close. The standard style can be considered a kind of umbrella style for other general purpose OSM maps – very few things are shown in other popular maps that are not in the standard style. In light of this, Andy’s suggestion is a quite reasonable way to avoid this style getting too far ahead of the rest of the crowd.

Mappers use the rendered maps to verify their work and adjust their mapping priorities and tagging to accommodate these maps. Map designers adjust their styles to better show what people map. In most cases this works nicely, and in theory it gives the ability to react flexibly to changes in priorities and in the way things are represented in the database. But there is a real risk of this feedback loop becoming a closed loop decoupled from reality if either side (mapper or map designer) pays more attention to adjusting to the other party than to the reality they are trying to map or depict. The sad thing is that the mapper feedback loop only works by both mappers and map designers adjusting their work style to each other, although in a way it would be better if they ignored each other.

This problem is directly related to another one i would like to call the dilemma of community mapping: Mappers are usually motivated by the idea of sharing their knowledge about the world and learning from others sharing theirs, but if every mapper is focused on the same things nobody really learns anything new. Both these mechanisms together most likely mean that within the dynamics of the OpenStreetMap project there is theoretically a stable equilibrium where the OSM community would consist of people with very limited interests (think the above 20 tags plus maybe 50 more primary tags and something like a hundred supplementary ones) and map designers are happily competing to create the nicest looking representation of this limited set of data. This scenario is actually much closer to what is common in conventional cartography – just that the selection of features mapped would not be based on a systematic upfront choice but would be the result of a convergent evolutionary process.

Now i am not saying the OSM community is at this point, but in my opinion it has to be careful not to get there because ultimately the above scenario is not unattractive to many – both within the OSM community and outside among those just using the data. Note that in this scenario mappers are still free to map and tag whatever they want, but the chances that their work turns up in a widely used map are slim unless they map something of very broad interest that is also mapped by lots of other mappers. With the OSM project maturing this gets rarer.

Maintaining freedom

Specific suggestions on how to avoid this are difficult but being aware of this risk is certainly a big advantage. There are two things i would propose that could help a lot:

More diversity in map rendering: I already explained that existing maps are fairly homogeneous – not in superficial styling but in the underlying principles and choices. It would be good to break this up more and establish more maps that take a very different approach to the OSM database and render different things in different ways. There are a few examples of local map rendering projects that go in this direction, for example the topographic map by maxbe demonstrating prominent use of areas tagged place=region + region:type=mountain_area as well as preprocessing of mountain peaks and saddles for better display. Note that i am not talking about specialized overlays with POIs or routes here – these are important as well but they are already available for a large variety of purposes. I mean innovative new base maps integrating new OSM data that is not generally rendered in other maps and taking new approaches to interpreting the large volume data (like in the tag list above). The streamlined conservative base maps we have are designed with a very narrow cultural mindset and it would be refreshing to see alternatives that show the world from more diverse cultural perspectives.

New approaches to basic map rendering in maxbe’s topographic map

What especially pains me here is that none of the bigger companies working with OSM data seem to be making any efforts in that direction. They are successfully addressing the big and homogeneous markets of Central Europe and North America in the short term, but in terms of map design beyond the mere technical level there seems to be next to nothing from them in the direction of strategic research. Innovative developments in that area, if there are any, usually come from individuals or small firms.

Establishing a middle layer between mapping and rendering: Many of you will be thinking of vector tiles here but this is not what i mean. Vector tiles are a technical middle layer for streamlining data processing. What i mean is a semantic middle layer that facilitates a broader reinterpretation of what is mapped than what is generally possible and done in map styles already. Above i talked about the freedom to map and tag everything. This freedom generally applies to mapping new things that have not been previously mapped by others. For things that are already mapped elsewhere mappers are generally expected to use existing tags. But the problem is that with the differentiated mapping often done today this frequently does not work well. Take the different landuse and landcover classes for example – those were developed primarily from a European, often even more specifically British, perspective. While you probably can map landuse in the Middle East or in Central Asia using these classes, it is often not very intuitive for locals there to do so and this frequently leads to strange results.

When none of the common bins fit. wood? scrub? bare_rock?

For documenting reality in the OSM database in the way that is most precise and convenient for the mapper, it would make sense to allow mappers to differentiate landuse and landcover in ways that are important for them and not in a system that works for people far away in Europe. Currently this does not happen since there is both direct pressure from fellow mappers to use existing tags and indirect pressure to use them if you want your mapping to be visible in some form. Here a middle layer would serve to establish similarities between different local taggings, allowing style designers to apply common styling based on these similarities, but on a per-case basis at rendering time and not already during mapping as is enforced now. The rules for such a middle layer could well be created and maintained by the OSM community since writing them requires much less special knowledge and expertise than designing maps.
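
To make the idea of a semantic middle layer a bit more concrete, here is a minimal sketch of what such community-maintained rules might look like. Everything in it is an assumption made for illustration: the rendering class names, the `render_class` function and the local tags `landcover=saxaul_thicket` and `landcover=hamada` are invented and are not established OSM tagging.

```python
# Hypothetical rule table: (key, value) -> generic rendering class.
# A community could maintain such rules separately from both the data and the map styles.
RULES = {
    ("natural", "wood"): "woodland",
    ("landuse", "forest"): "woodland",
    ("natural", "scrub"): "shrubland",
    ("landcover", "saxaul_thicket"): "shrubland",  # invented local tag
    ("natural", "bare_rock"): "bare_ground",
    ("landcover", "hamada"): "bare_ground",  # invented local tag
}


def render_class(tags, rules=RULES, overrides=None):
    """Return the rendering class for a feature's tags, or None if no rule matches.

    `overrides` lets an individual map style reinterpret a tag differently,
    so the decision is made at rendering time and not during mapping.
    """
    merged = dict(rules)
    if overrides:
        merged.update(overrides)
    for key, value in tags.items():
        cls = merged.get((key, value))
        if cls is not None:
            return cls
    return None


# A locally meaningful tag is rendered like scrub by default ...
print(render_class({"landcover": "saxaul_thicket"}))
# ... but a particular style may decide to treat it as woodland instead.
print(render_class({"landcover": "saxaul_thicket"},
                   overrides={("landcover", "saxaul_thicket"): "woodland"}))
```

The point of the sketch is that mappers keep their own differentiated tags in the database, while the grouping into common classes stays a choice of each map style.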

Many will probably say this won’t work – a clear, universally valid tagging system is the key to a working OSM community. I won’t categorically deny that – there are serious advantages to having universally used tags, but i think it is also beyond doubt that there are many cases among the top 100 OSM tags where universal tagging does not work. For highways and administrative boundaries this is generally accepted and there are also other pages on the wiki documenting local differences in tagging – however, as far as i know this is currently not used in any maps to differentiate rendering.

Note that tag reinterpretation is just one task for such a middle layer. There are also differences in how things are mapped geometrically, for example features mapped as polygons vs. features mapped as lines. Combining both usually works badly in current map rendering systems and a middle layer could help here, although geometry processing of course comes with more serious performance implications than mere tag reinterpretation.

Imagico.de
