Imagico.de

Introducing the TagDoc project

This post is also published on TagDoc – you can read it there as well if you prefer.

To many, even to some experienced OSM community members, the free tagging system of OpenStreetMap appears chaotic, unorganized and inefficient. And it is to some extent. However, this system is at the same time also highly functional and fulfills a central role within the project and the OSM community. The alternative to the free, de-centrally developed tagging system we have would be a centralized, authoritative system of attributes and rules, developed and imposed by some centralized bureaucracy, which would have to be a culturally fairly homogeneous group of people to function as such. In other words: The alternative to the open tagging system would be for OpenStreetMap to give up the idea of creating and maintaining a map by the people, for the people, based on local knowledge and egalitarian self-determined cooperation, and to adopt exactly the kind of methodology centralized mapping authorities use – whose dominance and flaws OpenStreetMap was created to overcome.

One big reason why, despite this, an increasing number of both data users and mappers are becoming more and more skeptical of free and open tagging, and frequently push for a more centralized and more authoritative approach to tagging decisions (for example with organized/automated edits or by introducing seemingly authoritative formulations into tagging proposal processes), seems to be the highly dissatisfying status of documentation of the free tagging system we have in OpenStreetMap.

The traditional platform for documenting tagging practice in OpenStreetMap is the OpenStreetMap Wiki. The basic idea is that mappers are free to use any tags they want to use, and they are encouraged to document how they use the tags on the OSM wiki. This has, over the years, led to a highly valuable body of documentation of tags and their use. But – with the tagging documentation on the wiki growing both in size and in importance for the OSM community – it has also become an attractive platform for people who are not just interested in documenting how they use the tags on an equal level with everyone else documenting their tag use, but for the pursuit of various subjective ideas on how the meaning of tags should be presented, irrespective of how the tags are de facto used in the database. As a result, nowadays there is no consensus within the OSM community if the tagging documentation on the wiki is to be descriptive (documenting how tags are practically used) or prescriptive (documenting how tags should be used according to some subjective opinion or some perceived authority). There is not even agreement that these two aspects should be separated. And while – as said – there is a lot of valuable information on the wiki about tags, there is at the same time also a lot of nonsense put there because it suits someone’s agenda. Additionally, the two are often indiscernible and you can only differentiate reliably between them if you have significant up-front knowledge about the tag in question – and in that case you don’t really need the wiki that much.

For data users and software developers, but also for mappers, this is a huge problem because they – due to the lack of alternatives – rely on the wiki as a source of information, a source which, however, is notoriously unreliable. And this is a self-reinforcing problem because the more people rely on the wiki, the more attractive it becomes as a vehicle for people with an interest in influencing data users, software developers or mappers to introduce ideas into the wiki documentation that do not factually document the de facto meaning of tags, but communicate some subjective ideas on how things should be mapped or how data should be interpreted.

Of course this, in principle, is not a new problem and for a long time people active in OpenStreetMap have discussed various ideas on how to address it. But so far nothing of substance has come out of that. I have also for a long time shied away from working on this – partly due to the sheer size of the task, partly however also because i see a lot of value in the idea of free tagging without centralized rules being combined with documentation of said tagging being developed through equally open cooperation. But more recently i came to realize that the direction the tagging documentation on the OSM wiki is heading, and the dysfunctional social dynamics you can widely observe there, are likely more damaging than supportive for the core idea of OpenStreetMap, especially in light of the fact that the tagging documentation on the OSM wiki is, both de facto and in the perception of people, without alternative. And it has become clear to me that what works for mapping itself – the egalitarian cooperation across language and culture barriers by communicating through the act of mapping itself – does not work likewise for developing a language based documentation of mapping practice.

The idea of TagDoc

What is the solution to this problem? I don’t really know. There might not even be a real solution and OpenStreetMap will need to deal with that. But given how little of substance has been tried to address the problem of reliable and competent documentation of the use of tags in OpenStreetMap, there are ways to substantially improve on that compared to the status quo. The difficulty is how to approach this problem in a way that has a good chance for success. What i have come up with as a concept, after quite a lot of contemplation, is the following:

  • Limiting the scope to documenting the de facto meaning of tags, that is describing how they are actually used in the OSM database. If there is a need for a prescriptive tagging documentation it is probably something that could be discussed but it is beyond doubt, i think, that any serious attempt at developing prescriptive tagging instructions (be that as text or in the form of tagging presets in editors for example) would have as a prerequisite solid knowledge of the de facto meaning of tags as they are used so far. Much of the problem of the current situation stems from the fact that this does not exist.
  • Writing with the aim to be useful and of value for the data user. The reason for that is twofold – First: Because i think data users are in most serious need of a trustworthy and competent documentation of tags in OpenStreetMap and where the dysfunctional nature of the tagging documentation on the OSM wiki creates the biggest issues. Second: While i hope that TagDoc is also of interest for mappers, my idea is that the agreement among mappers on the use of tags should continue to primarily develop through the cooperative mapping process itself (see also What Tags are). The influence of any single other factor on mappers can, if overly large, lead to imbalance and become a problem over time.
  • Written and curated by proven and independent experts, open to scrutiny by the whole community. I know that the idea of meritocracy has fallen out of fashion in significant parts of the OSM community because it is considered to be unjust. And, as indicated, i have myself for a long time valued the open collaboration as a principle for documenting tagging. Over the last years this has however proven not to be able to produce the qualified and reliable documentation needed, at least not in the current and foreseeable future social environment of OpenStreetMap. And the alternative – a committee of political appointees – does not, to put it mildly, have a very good track record in curating intellectual writing.

The plan

As written on the starting page, TagDoc in its current form, with me alone writing for it in my spare time, is not in any way sustainable. There are mainly two options for how this could be solved:

  • If people (in particular data users) are willing to support this idea and see value in the project as i sketched it above and are therefore willing to finance me working on it, i could extend the amount of time i can invest in this work. Of course this opens up the question of how far my financiers might influence the content of TagDoc and this way we might solve the problem of the unreliable information on tags from the OSM wiki being the only source of information available, but only with the other source of information being curated by financial interests to their liking. What i can say to that is that (a) i have, i think, shown in the past on plenty of occasions that with my public writing i tend to not pay much regard to what views are economically opportune, (b) that i would be transparent about where i receive funding for writing and editing for TagDoc from and (c) that any financier of my work on TagDoc would be aware up-front about the basic premises of the project as outlined above which are fundamentally in conflict with the idea of pushing certain subjective views on tagging. What could happen of course is that financiers influence what tags i write about and analyze with priority and in particular detail. That however is already the case right now – what tags i know most about and write about with priority is evidently not independent of what kind of OSM data i work on as part of my paid work. I am not offering this option because i really need additional paid work per se but because i see a strong need for this project in the OSM community and from data users in particular, and i would be willing to reduce other parts of my paid work in favor of this in case there is interest in financing that.
  • If other independent experts are interested in contributing to the project under the premises described above, i would be willing to open the project to other authors. So far i have not put the content of TagDoc under an open license but i would be open to such a step and this evidently would be kind of a prerequisite for turning it into a larger cooperative endeavor of several authors. This could work on various levels – from authors contributing analysis or documentation of a single tag to writing and editing whole thematic segments. Note that i would still want to remain the overall curator and proprietor of TagDoc at least until i can be sure that an alternative form of governance would sustainably protect and develop the premises of the project in a responsible way.

If you are a data user (or otherwise want to invest in OpenStreetMap beyond software development and paid mapping) and interested in contributing to financing writing for and curating this project, and thereby help make it more sustainable, or if you are an independent expert in the de facto meaning of tags in OpenStreetMap data with proven experience in analyzing tag use in the OSM database, English language writing skills and knowledge about the diversity of world wide geography and are interested in contributing to this project as an author, then you are welcome to get in touch with me about contributing.

About me, the proprietor and curator of TagDoc

So, some might ask, what qualifies me to run and curate a project like this?

The main incentive for me contemplating the problem of meaningful and reliable tagging documentation and ultimately starting TagDoc came from my work as a maintainer of OSM-Carto. Forming a qualified opinion on requests to add or change the rendering of certain features in the map under the goals of the project always requires a solid knowledge of the de facto use of the tags involved in OpenStreetMap. The same applies to developing features for my own OSM-Carto derived map style. Doing research on well more than a hundred such cases, involving a wide range of tags, helped me acquire a broad knowledge background in practical use of tags in OpenStreetMap, a lot of practical experience in analyzing how tags are used in OpenStreetMap world wide and what the quirks and inconsistencies in that use are, as well as a good sense of the quality problems of the tagging documentation on the OpenStreetMap Wiki. Combining that with the experience in using OSM data on a global scale that i have gained as part of my paid work, and with what i researched over the years regarding practical use of tags out of curiosity and as part of writing about OpenStreetMap in general on my blog and elsewhere, i probably have broader background knowledge about the de facto meaning of tags in OpenStreetMap than most involved in OpenStreetMap. But the key, ultimately, is combining that broad background about OpenStreetMap with solid knowledge of and experience with the geographic diversity of the planet. Most OpenStreetMap contributors in their on-the-ground mapping work acquire extensive knowledge of the geography of the area they map in but very few have a solid understanding of the full range of geographic diversity of Earth which OpenStreetMap aims to document.

Of course this is a valuable qualification for doing consulting work regarding use of OpenStreetMap data and it has helped me many times in giving competent advice to customers. But ultimately it is kind of dissatisfying to not be able to make this knowledge available systematically to everyone who would find it useful and value it because of the lack of an economic basis to do so. I know this is a problem i am not alone with. The economic ecosystem around OpenStreetMap is traditionally heavily biased towards technical work. Software development and paid mapping are the most valued and therefore the dominant paid activities around OpenStreetMap and intellectual work of all kinds is seriously under-appreciated. It would be important for the future of OpenStreetMap to change that and, having quite a bit of experience in doing consulting work at the edge between technical work (data processing) and intellectual work (map design) and being one of the most prolific public writers around OpenStreetMap, i think i am in a good position to lobby a bit for such change and to attempt demonstrating the importance and value of intellectual work using TagDoc as an example.

And finally one other important thing that i think qualifies me as a curator of tagging documentation is that i am largely independent of OpenStreetMap economically so i can provide an honest and open assessment without being constrained by my own or others’ economic interests. While i do paid work with OpenStreetMap data, most of my income these days is based on working with satellite data and its visualization and therefore does not depend on OpenStreetMap.

Conclusion

Ultimately all of this is just an offer from my side. I hope it finds the resonance and support it needs to become sustainable in the way i outlined and this way becomes a useful and valued source of knowledge about OpenStreetMap data and the tags used in it to record local knowledge about the geography of Earth. If it does, but especially also if it does not, i hope my endeavor to start TagDoc incentivizes others to think and talk about the importance of creating and maintaining a reliable and competent documentation of the de facto meaning of tags in OpenStreetMap, not unduly affected by subjective ideas of their ought-to-be meaning and the best way to develop and maintain such. This kind of project, like any other intellectual work, can only thrive and be excellent if it receives intellectual resonance and critical feedback.

11 Comments

  1. To me, the main issue with TagDoc is the same as with this blog: too many words! Wiki pages on the few registered tags seem to be even bigger than the relevant pages on the OSM wiki. With thousands of tags in use, it would be impossible to describe each of those with that level of detail. The idea reminds me of Nupedia: basically an expert-written curated Wikipedia. It did not work out.

    In 2011 I thought of a system with similar purpose, and envisioned it as a database just a tad more complex than an editor preset directory. It should’ve focused on de facto tagging, but with no bureaucracy and disputes. There’s even a draft for the schema (in Russian): https://wiki.openstreetmap.org/wiki/RU:Catalog/Zverik

    • I had thought about the question how much detail and how much verbosity is a good choice to start this project. And the tags i started with are – for the most part – tags of extraordinary significance and with a broad background that can be written extensively about. This is definitely not the case for all tags, not even for the top 100 or so. You can already see a gradient in the length of the documentation among the list of tags currently documented. In any case i distinctly tried to keep things relatively compact, in particular by not describing every regional particularity of tag use but only going across that in broad strokes. On the other hand i mean for this to function to share substantial knowledge about tagging in OpenStreetMap, not just to provide an executive summary optimized for a twitter compatible attention span. 😉

      The other thing i thought about in that regard is whether it is better to start aiming for breadth or for depth – in other words: Try to provide rudimentary documentation on many tags or do in-depth analysis of a few and add more step by step. I went with the latter approach because even writing rudimentary documentation requires in-depth knowledge if it is meant to be reliable and hold up to scrutiny. Hence it is not substantially less work to just write about the basics than to write a more substantial assessment.

      In light of this i would like to ask you back: Do you think the length of the tag pages on TagDoc is bloat – essentially words without substance? Or do you think they contain substantial information – but stuff that no one cares about enough for the project to gather sustainable interest?

      I am not sure if the comparison to Wikipedia/Nupedia is useful here – that an expert curated general digital encyclopedia does not work has to do with a lot of factors that do not apply here. TagDoc has a much more clearly defined scope and purpose. I have contemplated this well enough to be sure that it can work in the form i sketched provided it gets sufficient support.

      Anyway – the verbosity and style of the documentation is something i would be more than happy to discuss and adjust if there are convincing arguments for that. And what i have definitely on my todo list (subject to available time) is to make the analytic data and assessments/categorizations that are shown on the tag pages also available in an easy to handle machine readable form. That would – among other things – allow generating a more compact excerpt with very few words if there is need for that.

  2. Pingback: weeklyOSM 608 | weekly – semanario – hebdo – 週刊 – týdeník – Wochennotiz – 주간 – tygodnik

  3. Thank you Christoph for this proposal and summary about OSM wiki weaknesses.

    It’s true that many discussions have been held before about how to improve documentation maintenance.
    I respectfully disagree about the lack of concrete solutions so far. We have Data Items, we have taginfo, etc.
    The proposal process is also a documentation effort, despite being heavy.

    Significant improvements were made with taginfo providing TagLists, updated directly by scraping wiki pages, sparing users from writing and then maintaining many translated tables of tag values.
    And so on…

    According to you, can’t the Data Items be part of the landscape?
    I don’t get how de facto meaning can be documented… with less human contribution.

    Tagging documentation sounds like a matter of chaining tools with different tasks rather than finding the most functional one.
    The OSM community has brought forth those different tools, which are waiting to be adapted for the next challenges. That is better than starting from a blank page.
    Is anything stuck in the development of those tools?
    The proposed side card on the TagDoc wiki is pretty clear, with relevant properties. Is there anything that prevents us from improving the TagDescription or KeyDescription templates in the existing wiki?
    There is no problem with adding corresponding properties to Data Items either.

    To me, many things already exist. We need to move forward with those strengths and improve what needs to.

    Best regards

    • Thanks for the comment.

      Analysis of the shortcomings of the OSM wiki in general is a subject that is beyond the scope of this blog post. I presented my reasoning why i think the wiki is unable to provide a decent documentation of the de facto meaning of tags these days and why efforts to change that would most likely be in vain in my post above. I don’t want to discourage anyone from working on improving the wiki or the social interaction between people working on the wiki but anyone doing so should be clear that – as i have explained – given the influence the wiki has on mappers, data users and software developers alike you will always have counter interests (and extensive ‘manpower’ behind those interests in the form of people with available time to pursue them) that like to integrate their views on how tagging ought to be in OSM into anything on the wiki that discusses tagging.

      Three other things i would like to point out:

      • The problems on the wiki that have led to an increasing failure to provide a decent and competent documentation of the de facto meaning of tags over the years are – as discussed – social in nature. It is advisable not to succumb to the reflex of trying to address social problems with technical solutions. That will never work. The only possible source of competent documentation of the de facto meaning of tags is human knowledge and competence. And people with the knowledge and competence to produce that can neither be made nor attracted through technical means (but they can very well be discouraged through such of course).
      • The fact that i used mediawiki as the technical basis for TagDoc, partly because i hope it will make it easier for users and authors to work with it because it looks familiar in some way, should not lead you to underestimate the differences to the OSM wiki in just about any aspect you could look at. If TagDoc is successful that would likely lead to knowledge from TagDoc also influencing authors on the OSM wiki. But the idea that you could copy abstract ideas from TagDoc to the wiki without continuously depending on the intellectual work that goes into TagDoc for that purpose is unrealistic in my eyes and disregards that the essence of TagDoc is writing down the expert knowledge of its authors. But as i said i would welcome any other projects that aim to provide better documentation of the de facto meaning of tags in some form – including such that happen on the OSM wiki. I would not regard such as a threat or competition but as potentially valuable inspiration and intellectual resonance.
      • The fact that i included quite a lot of analytic data in the first tag documentations i wrote should also not lead anyone to conclude that this is the essence or even just that this is an important part of TagDoc. The main reason why i include this analytic data is because this is data i use when researching the use of the tags i write about and to verify, improve and deepen my knowledge on them. It would seem wasteful not to include this data then.
  4. How will this work in practice? How can you determine the de facto use for a tag placed on 1 million instances? Are you going to look at all of them? And how can you assess what they represent, if you do not know what is on the ground?
    In a free system used by tens of thousands of occasional mappers, you will always find misclassifications (“wrongly applied tags”). Where is the cutoff where you switch from “is an outlier” to “is an additional application for the tag”?

    • I think i demonstrated that in the examples of tag documentation i provided. Evidently you cannot document the specifics of every individual application of a certain tag. But this is also not what data users are in need of. What is needed is a qualified summary assessment of how tags are used. As said – experience with the world wide geography is an important prerequisite for providing that. What i practically do is look in detail at how the tags are used: through statistics as well as through sampling, and through comparison with reference data like imagery as well as my own knowledge of the geography.

      The distinction between mapping errors (a mapper documenting a factually incorrect perception of reality) and a deliberate extension of the scope of use of a tag is typically quite easy to spot. But it is ultimately not all that relevant for TagDoc because i don’t want to pass judgement on right and wrong and just describe the tag use as it happens. And the distinction between exotic outliers and systematic unusual applications of a tag is made based on numbers of use, how many mappers follow this interpretation and how geographically widespread this application is. You can see the documentation of natural=glacier for some examples of that.
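As a rough illustration only, the three criteria named above (number of uses, number of mappers, geographic spread) can be combined into a simple rule. This is a minimal sketch and not how TagDoc actually does its assessment, which relies on expert judgement: the UsagePattern class, the classify function, all threshold values and the sample numbers are hypothetical and made up for the example.

```python
from dataclasses import dataclass


@dataclass
class UsagePattern:
    """Summary statistics for one observed way of applying a tag (hypothetical structure)."""
    description: str
    feature_count: int  # number of features tagged this way
    mapper_count: int   # distinct mappers who applied the tag this way
    region_count: int   # distinct regions (e.g. countries) where this use occurs


def classify(pattern, min_features=100, min_mappers=10, min_regions=3):
    """Label a usage pattern as 'systematic' or 'outlier'.

    The thresholds are arbitrary illustrative defaults, not values from TagDoc.
    A pattern counts as systematic only if it passes all three criteria.
    """
    widespread = (
        pattern.feature_count >= min_features
        and pattern.mapper_count >= min_mappers
        and pattern.region_count >= min_regions
    )
    return "systematic" if widespread else "outlier"


# Made-up sample numbers, purely to show the rule in action.
patterns = [
    UsagePattern("pattern A", feature_count=5000, mapper_count=240, region_count=35),
    UsagePattern("pattern B", feature_count=800, mapper_count=2, region_count=1),
]
labels = {p.description: classify(p) for p in patterns}
```

In this sketch "pattern B" has many features but only two mappers in a single region, so it is treated as an outlier despite its count. Any real cutoff would need to be chosen per tag and checked against sampling and reference data.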

  5. It sounds like it might be interesting, but realistically I (and possibly many other people) am not going to wade through that much text. Could you summarise what you are trying to do in 100 words or fewer?

    • If you look at the TagDoc starting page – that has a short explanation what the project is about. That is about 250 words in total – but if you are only interested in what TagDoc is and not in the process of creating it then you can skip the last two paragraphs and end up with something barely more than a hundred words.

      But make no mistake: Reading that is not a replacement for reading the blog post. Your formulation (wade through that much text) suggests that most of what i write is just an obstacle in getting to the few words that are actually meaningful. That is not the case, not even for someone with your background knowledge of OSM. I don’t write lengthy texts because keeping my readers here longer is lucrative for me to sell ads. I write lengthy texts because i see and try to explain a broader background to the things i write about. Is this sometimes spending too many words on explaining things that are trivial to most readers? Certainly. But i equally get feedback that i do not spell out considerations clearly enough for the readers to follow. Public writing for a broad audience is hard.

      I am pretty sure that if you are interested in the subject of tagging documentation in OpenStreetMap and the social dynamics around it that reading this blog post in full is worth the time. What would however be even better is if you’d collect your own thoughts about the matter and write them down for others to read – be that just a hundred words, or 2500 (like this blog post) or more.

  6. (that was 275 words).

    Why don’t you summarise it by saying “Tagdoc is a site where I have described how certain OSM tags are used” (14 words)?

    Regards,
    Andy

    • If this is how you view the project that is fine. If you want another short summary from an outside perspective – WeeklyOSM wrote: Christoph Hormann presented his ‘TagDoc Project’, which is intended to offer an alternative to the OSM Wiki with the focus on the de facto use of tags. Neither of these in my eyes well represents what TagDoc is about but both transport an interpretation i find insightful as a window into how others view this project.

      In my eyes what I wrote on the TagDoc starting page is about the most compact form in which i still feel i can transport what i think is the essence of the project. Any further reduction or summary should be provided by others if needed.
