Social Engineering in OpenStreetMap


With the use, popularity and economic value of OpenStreetMap increasing significantly over the last few years, interest in and attempts at influencing the direction of the project and its participants have also increased a lot. Here i want to look a bit at how this works, sometimes in unexpected ways.

I use the term social engineering here in the sense of activities that aim to make people do (or not do) certain things not by educating them and enabling them to make better decisions according to their own priorities but by influencing their perception to serve certain goals without them being aware of this. Some might consider this to be defensible or even desirable if it serves an ulterior motive but i will take a strictly humanistic viewpoint here.

Note that social engineering does not necessarily require those who actually influence others to be aware of the reasons.

OpenStreetMap is well known among digital social projects and internet communities for having relatively few firm rules and for giving its members a lot of freedom in how to work within the project. This of course also provides a lot of room for social engineering. On the other hand the OpenStreetMap community is fairly diverse, at least in some aspects, and quite connected, so it is rather difficult to target a specific group of people without others also becoming aware of the activities. This means classical covert social engineering, where people are not aware they are being engineered, is not that dominant.

But there are a lot of activities in OpenStreetMap that can be considered more or less open social engineering, attempting to influence or organize mapping activities. Humanitarian mapping is one of the most iconic examples of this and there are also quite a number of widely used tools like MapRoulette that can be used to support such activities.

The number of people mapping in OpenStreetMap on a regular basis makes influencing them to focus on mapping certain things fairly attractive even on a relatively small scale. But this is relatively harmless because

  • it is fairly direct,
  • the influence and often even the interests behind it can usually be readily seen by the mapper,
  • if it goes over the top such activity can quite easily be shut down or moderated by the community.

In other words: Mappers engaging in a HOT project are not that deeply manipulated because they do not believe they participate in such activity for reasons that are fundamentally different from the actual reasons. They might not know exactly how much the people in the area they map in profit from what they do and what economic interests the organization planning the mapping activity has exactly but in broad strokes they still make an informed decision to participate.

Nonetheless such activities are not without problems in OpenStreetMap, especially since they can affect the social balance in the project. Local mappers mapping their environment, for example, can often feel bullied or patronized if organized mapping activities from abroad start in their area.

This is however not primarily what i want to discuss here. I want to focus on a more subtle form of social engineering that i would call social self engineering. A good example to show how this works in OpenStreetMap is what we call mapping for the renderer.

Mapping for the renderer in its simplest form occurs when people enter data in the OSM database not because it represents an observation on the ground but to achieve a certain result in a map. Examples include

  • strings being entered into name tags of features that are not names in an attempt to place labels.
  • place nodes being moved so a label appears at a more appealing position.
  • classifications of places or roads being inflated to make them appear earlier or more prominently in maps.
  • tags being omitted from features because their appearance in the map is considered ugly.

Compared to normal social engineering the roles are kind of reversed here. The one whose behavior is changed is the one who actually makes the decision (therefore self engineering) and the influence to do that comes from someone (the designer of the map) who is often not even aware that this might happen and is usually not really happy about this being the case.

This simple form of mapping for the renderer is widespread. Those who do this usually know they are doing something that is not quite right, but they are usually not fully aware of why they are motivated to do so and what consequences this has in terms of data quality. In most cases they simply consider this a kind of shortcut or procedural cheating. The specific problems of the whole field of interaction between map designers and mappers are, by the way, something i have discussed in more depth before.

There is another, less direct variant of mapping for the renderer (or more generally: mapping with specific consideration for a data user) that i would call preemptive mapping for the renderer. A good example of this is the popular is_in tag (and variants of it like is_in:country), which indicates the country (or other entity) that, for example, a certain town or other place is located in. I am not aware of any application that depends on this tag to work properly. Taginfo lists Nominatim as the only application actually using this. The very idea that it makes sense in a spatial database to manually tag the fact that a certain geometry is located within another geometry is preposterous. Still there are more than 2 million objects with this tag in the OSM database.

Why this happens has a lot to do with cargo cult. In fact quite a lot of tagging ideas and proposals developed and communicated in OSM can largely be classified as cargo cult based and this is one of the reasons why many mappers look down on tagging discussions. The very idea that any desire to document an observation on the ground in OSM needs to go through some universal classification system is inherently prone to wishful thinking. Sometimes a sophisticated structured tagging system is developed to make it attractive for developers to implement which luckily often ensures it is neither used by mappers nor data users. The idea of an importance tag that re-surfaces every few months somewhere falls into the same category. Out of the desire to have an objective and universal measure of importance for things people invent an importance tag and hope the mere existence of this tag will actually produce such a measure.

But not all such mapping ideas are unsuccessful. We also have quite a few tags that were invented because someone thought they would make things easier for data users and where mappers keep investing a lot of time to actually tag them – like the mentioned is_in. Or the idea to map things as polygons that could just as accurately be mapped as linear geometries or nodes – like here.

The problem with this is not only the waste of mapping resources, it also sometimes encourages data users to not invest in interpreting more sensible mapping methods. Preemptive mapping for the renderer – even if based on considerations that make some sense – always aims for technologically conservative data interpretation. This way it hampers actual innovation and investment in intelligent and sophisticated interpretation of mapper centered tagging and mapping methods. The is_in tag for example was invented back in the early days of OpenStreetMap when there were no boundary relations that could be used to automatically check where a place is located. So instead of inventing such a better suited solution for the problem someone took the technologically simple route and put the burden of this on the mapper. Luckily in this case this did not prevent the better solution of boundary relations and algorithmic point-in-polygon tests from being developed and established.
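To illustrate why a manually maintained is_in tag is redundant once boundary geometries exist, here is a minimal sketch of an algorithmic point-in-polygon test in Python. It uses the standard ray casting approach and makes several simplifying assumptions: boundaries are given as single closed rings of planar (x, y) coordinates without holes, and the names point_in_polygon, containing_areas and the sample boundaries are purely hypothetical. Real boundary relations in OSM are multipolygons in geographic coordinates, and this is not how Nominatim or any other specific tool implements the test.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y), treated as planar coordinates (assumption)


def point_in_polygon(point: Point, ring: List[Point]) -> bool:
    """Ray casting test: count how often a ray going right from the point
    crosses the ring. An odd number of crossings means the point is inside."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containing_areas(place: Point, boundaries: Dict[str, List[Point]]) -> List[str]:
    """Return the names of all boundaries that contain the place node."""
    return [name for name, ring in boundaries.items()
            if point_in_polygon(place, ring)]


# Hypothetical sample data: two adjacent square "countries".
boundaries = {
    "Country A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Country B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}

print(containing_areas((5, 5), boundaries))
print(containing_areas((15, 5), boundaries))
```

The point of the sketch is only that the information an is_in tag carries follows from the geometries themselves, so the work can be done once by the data user instead of repeatedly by every mapper.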

And while attempts from data users to directly influence mappers to create data that is easy for them to interpret are often quite easily spotted and rebutted, the preemptive variant coming from the side of the mapper is in practice often less obvious. The motives for a mapper to use or support a certain problematic tagging are also often complicated and unclear.

So if – as a mapper – you really want to support and encourage competent data use, you had better ignore any assumed interests of data users and map in the way that lets you, as a mapper, most efficiently represent your observations on the ground in data form.


  1. I would not put area:highway in the same category as is_in:country. It adds information that can not be produced from already existing data. I am not sure how it rates in terms of effort versus effect, but I personally added and then immediately used this kind of data (for an orienteering competition map).

    • But keep in mind when is_in* was invented it could also not be produced from other existing data and it was useful for data users. This does not make it a good idea though since it is not a good solution for the problem it is trying to address and it puts the burden of work on the mapper rather than the data users. You can see the same being widely the case for polygon mapping of roads.

      The difference is that is_in* is universally a bad idea while polygon mapping of roads can – when combined intelligently with other ideas – be a useful and efficient component in a mapper centric approach to accurate road mapping. Not for >90% of the road kilometers but in parts of the rest. The problem here is that the potential economic gains, the amount of work that can be spared on the side of data users and developers of mapping tools, are huge. This is why the very concept has and will continue to have a lot of support. Not because it is particularly suitable for solving the problem (giving the mapper a good way to map additional information on roads).

      My own prediction for road polygons is that they will continue to be popular and this will contribute to discouraging the development of alternatives. Ultimately we will likely see that at some point mappers will realize how awkward and inefficient this approach is in a lot of cases and people will develop methods to generate the polygon geometries algorithmically from a more mapper centric data model on the editor level and then retrofit this into the awkward but established polygon data model in the database. And we will look back and say “if we only had used a more intelligent approach originally things would be much easier”.

      • I imagine that mapping explicit polygons can be replaced by width for cases of segments with constant width, but is there anything better for other cases?

        • There are lots of options for parametric and implicit representation of geometric information on roads. Not all of them would be easy to implement in our current OSM data model. In particular the lack of options to assign information to ways on a per node basis (in a similar way as roles for relation members) creates some difficulty.

          If you look at a typical diamond interchange on a motorway – beyond the centerlines this can probably be very accurately described with just about a dozen parameters which is just a tiny fraction of the data (probably less than one percent) that you would need for a similarly accurate representation as a polygon.

          • You really think a typical diamond interchange could be described in less than one percent of the data as a polygon? I’m a little skeptical, do you have any comparison examples?

  2. Interesting thoughts but this is not about social engineering, a term heavily laden with intentional deception. First you state that humanitarian mappers and MapRoulette users willingly know what they are doing. So that’s not social engineering. Then you talk about many aspects of the tagging process, which are complicated and may not work in ways anyone is completely satisfied with, but that is not about deception. I think you are confusing social engineering with simply the design of any social process. As OSM is a social process, which people enter willingly, OSM itself is simply a design process, like humanitarian mapping, like tagging, no different from most things in our society where groups of people try to achieve things together.

    • I gave my definition of social engineering in the second paragraph. I am not sure if you disagree with that definition or if you think the practical cases i discuss do not fall under this definition.

      You are right that social processes inherently include attempts to influence the opinions and actions of others – but there is a difference between cases where this happens based on arguments and reasoning and situations where for example information is selectively presented or where information is withheld to create a certain impression. This is of course a gradual range with several dimensions and only at the extreme end do you have deliberate misinformation and deception. This certainly also happens in the OSM context but is fairly rare because of the mechanisms i discussed. If you want to draw a line here somewhere and name things on both sides of the line differently that would be fine with me but i do not feel we have suitable terms to make this distinction.

      And in the specific case of social self engineering it should be obvious that self deception is of course normally not intentional.
