January 24, 2019
I have now prepared a sample gallery where you can find a few more samples showing how the style looks at various zoom levels in various settings.
January 14, 2019
It took me somewhat longer this time but i have prepared my traditional yearly report on open data satellite image acquisition numbers – for Sentinel-2 and Landsat – for the October 2017 to October 2018 time frame. See last year’s report for reference.
I will keep it relatively short this time. Here are the overall recording volume numbers and their development over time:
You can see that the Sentinel-2 acquisition volume was significantly increased in early 2018 (as i already mentioned in my mid-year report). This is the result of now recording all major land masses at a ten-day interval for each satellite. This can also be seen in the spatial distribution visualizations linked to below.
In recent months (i.e. the northern hemisphere winter), which are not yet included here but which you can find in the detailed recording patterns per orbital period and the daily recording numbers, the volume went down again because of the sun position recording limit. This means ESA is, so to speak, running Sentinel-2 under capacity – despite there being many options for making use of the additional free capacity by either:
The Landsat satellites are recording the same way as in previous years – with the reduced Antarctic recording plans for Landsat 8, started during the 2016-2017 season, leading to the double-minimum yearly curve you can observe. You can see this reflected in the spatial distribution visualizations shown below. For Landsat 8 the USGS has extended the night-time acquisitions at high latitudes in summer (i.e. evening images) – as i also mentioned before. Off-nadir recordings at the latitude limit were also performed, similar to the previous year.
| year | day      | night | day pixel coverage       |
|------|----------|-------|--------------------------|
| 2016 | LS8, LS7 | LS8   | LS8, S2A                 |
| 2017 | LS8, LS7 | LS8   | LS8, S2A, S2B, S2 (both) |
| 2018 | LS8, LS7 | LS8   | LS8, S2A, S2B, S2 (both) |
December 21, 2018
This is a followup on my previous piece on differentiated rendering of woodland. Woodland is a fairly well defined class of vegetation to render in maps. If you go beyond that, things get more complicated.
In OSM-Carto it has been traditional practice to use green fill colors for vegetation related features – but not in a strict fashion. There are green fills used for things that have nothing to do with vegetation, and there were traditionally also vegetation related features rendered in colors other than green.
In a way green colors cover the largest part of the available color space because green is the part of the color spectrum the human eye is most sensitive to. At the same time vegetation mapping in OpenStreetMap is traditionally very undifferentiated. While urban areas are mapped and rendered with fairly specific differentiation, there is only a very small set of widely used characterizations of natural vegetated areas. For agriculturally used areas this is a bit better and i previously discussed differentiating them.
Here is the set of green and – in the wider sense – other vegetation related fill colors that have so far been used in the alternative-colors style.
The fill colors i have used so far can be roughly categorized into
What i am concentrating on here are the physical types, in particular those that apply to natural or close-to-natural vegetation and the allotments color – which is, as you can see in the above illustration, kind of an anomaly.
I implemented the changes discussed here already some time ago but refrained from publishing them to allow OSM-Carto to pursue its own approach to color design independent of me.
The way natural vegetation is mapped in OpenStreetMap is based on coarse physical characteristics, so this is part of the physical vegetation types group as illustrated above. The wood and grass colors – which are dual use for both natural areas and human-maintained areas – enclose the spectrum of vegetation types. For wood i explained this in detail before and with grass the situation is quite similar. There are a number of common tags for areas with grass growing on them in OpenStreetMap – the most common are landuse=meadow, landuse=grass and natural=grassland. These tags are, however, not consistently used for distinctly different things – it is all a fairly wild mixture – so for data users all you can say is that, as a smallest common denominator, they usually indicate areas where grass grows. Accordingly they are all shown in the same color.
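The data-user consequence described above – three differently named tags that can only safely be interpreted as one class – can be sketched in a few lines. This is an illustrative sketch only, not actual OSM-Carto or alternative-colors code, and the hex value is a made-up stand-in:

```python
# The three common grass tags are deliberately resolved to one shared fill
# color since they cannot reliably be told apart in the data.
GRASS_TAGS = {
    ("landuse", "meadow"),
    ("landuse", "grass"),
    ("natural", "grassland"),
}

# Hypothetical green – the real style defines its own color values.
GRASS_FILL = "#cdebb0"

def fill_color(tags):
    """Return a fill color for a feature's tags, treating all common
    grass tags as one smallest-common-denominator class."""
    for key, value in tags.items():
        if (key, value) in GRASS_TAGS:
            return GRASS_FILL
    return None
```

In an actual rendering style this lookup would of course happen in the style language itself (SQL/CartoCSS in the case of OSM-Carto), but the principle is the same: one color for the whole fuzzy class.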
The orchard color sits between these endpoints (wood and grass) while the heath and scrub colors have so far been situated on different sides of this line. This can be seen illustrated in the image further down on the left – which is of course a simplified depiction since in reality color space is three dimensional.
This basic scheme had been used in the OpenStreetMap standard style for a long time. The main motivation was that the number of natural vegetation colors is rather small – only four colors for classes that together cover a huge portion of the earth’s surface. It therefore makes sense to make these four base classes well differentiable, covering the full spectrum of moderately saturated green tones – with orchard in addition in the middle and the other green fill colors around them.
However this also means that the large difference between heath and scrub color indicates a larger difference in meaning between the two than there actually is.
Here is a bit of background information on the vegetation types in question.
natural=scrub and natural=heath in OpenStreetMap essentially cover different height ranges of permanent, woody vegetation. Both of these in OSM include what is commonly understood by the terms scrub and heath but are semantically much broader.
The height ranges are not precisely defined, but for scrub this covers everything that is less tall than full-grown trees typically are in the area in question, down to the limit towards heath. The plant species of scrub can be either young or smallish specimens of the same trees that form full-grown woodland or distinct scrub species.
The boundary between natural=scrub and natural=heath is in OSM commonly drawn at a height of about one meter. But this is not a very clear limit either. What is mapped as natural=heath typically consists of distinct dwarf scrub species. And this includes low growing woody plants in all kinds of ecosystems and climate zones, including ones that are not commonly called heath.
In practical mapping, the consistency of scrubland and heathland mapping is somewhat lower than that of woodland mapping in OSM. The most widespread incorrect uses are
In other words: Picking a completely non-fitting tag as kind of a compromise between other tags that partially apply is not advisable. You should always choose the tag characterizing the dominating plant type in those cases. Unfortunately there is currently no established tagging system for secondary vegetation types.
Like woodland, both scrubland and heathland exist in deciduous and evergreen variants. But this is much less commonly tagged than for woodland, and for heath such tagging is almost non-existent.
So overall heath, scrub and wood form kind of a linear sequence and it makes sense to acknowledge this with the choice of colors as a linear sequence in color space as well. To allow reasonably good differentiation i brightened the heath color somewhat and placed the scrub color halfway towards wood. With some more tuning of the other colors around them, the colors worked out quite nicely.
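The “halfway towards wood” placement can be sketched as a simple linear interpolation between two colors. This is a minimal illustration in plain RGB with made-up hex values (not the actual style colors) – real tuning for a map style would more likely be done in a perceptual color space:

```python
def hex_to_rgb(c):
    """Parse '#rrggbb' into an (r, g, b) tuple of 0-255 integers."""
    c = c.lstrip("#")
    return tuple(int(c[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb):
    return "#" + "".join("%02x" % v for v in rgb)

def mix(c1, c2, t=0.5):
    """Linear interpolation between two hex colors (t=0 -> c1, t=1 -> c2)."""
    a, b = hex_to_rgb(c1), hex_to_rgb(c2)
    return rgb_to_hex(tuple(round(x + t * (y - x)) for x, y in zip(a, b)))

# Made-up stand-ins for the heath and wood colors:
heath = "#d6d99f"
wood = "#add19e"

# Scrub placed halfway between heath and wood:
scrub = mix(heath, wood, 0.5)
```

The same `mix` function with other values of `t` would also cover the general idea of a linear color sequence between the heath and wood endpoints.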
As a result the contrast between the scrub and heath colors, as well as towards the grass and wood colors, is reduced – but they are still reasonably easy to distinguish.
And in settings like this the color scheme better brings out the difference between agriculturally used areas and areas not in agricultural use.
Changing the base color of scrub also required adjusting the pattern symbol colors for the different leaf_cycle variants. Here is how this looks:
An advantage of this is that the neutral symbol color for features without leaf_cycle tagging looks a bit more balanced than previously.
As already mentioned above, the allotments color was kind of an anomaly in the alternative-colors style palette after i had changed the farmland color.
Semantically allotments is an interesting concept. Together with farmland, orchard and agricultural meadows it forms the group of broadly agricultural use areas, and it is – like farmland – more functionally than physically defined. Allotments are small scale heterogeneous farming areas where plants are grown for personal use or sometimes small scale sale. There is a broad range of practical implementations of this worldwide and a broad range of produce that is grown in them. And there is not always a well defined distinction from farmland and orchard. The two most common variants are the allotments of Western Europe, with typically very small individual plots and only footways in between, and the larger scale allotments of Russia and parts of Eastern Europe, where the individual plots are larger and there are usually access roads to them.
The color has to work in all of these settings and needs to suitably represent allotments in all these variants and environments. Obviously the choice of color is tightly connected to the change to scrub and heath – demonstrating again that a balanced color scheme requires looking at the whole picture and not just at individual colors.
Here are some examples of how this looks:
Here is the whole updated area color scheme of the alternative-colors style.
As usual you can try out the changes discussed here yourself – the style can be found on GitHub.
December 20, 2018
In Carl Sagan’s science fiction novel Contact – which most of the younger readers here have likely not read and which today probably seems a bit out of time with its cold war setting – there is a remarkable story about halfway through the book. It is not a narration of something that actually happened in the book but in a way a picture painted with words:
[…] And now along comes an invitation. As Xi said. Fancy, elegant. They have sent us an engraved card and an empty droshky. We are to send five villagers and the droshky will carry them to – who knows? – Warsaw. Or Moscow. Maybe even Paris. Of course some are tempted to go. There will always be people who are flattered by the invitation, or who think it is a way to escape our shabby village.
And what do you think will happen when we get there? Do you think the Grand Duke will have us to dinner? Will the President of the Academy ask us interesting questions about daily life in our filthy shtetl? Do you imagine the Russian Orthodox Metropolitan will engage us in learned discourse on comparative religion?
No, Arroway. We will gawk at the big city, and they will laugh at us behind their hands. They will exhibit us to the curious. The more backward we are the better they’ll feel, the more reassured they’ll be.
It’s a quota system. Every few centuries, five of us get to spend a weekend on Vega. Have pity on the provincials, and make sure they know who their betters are.
I do not intend to equate the picture drawn here with the SotM scholarships but it is something i am always reminded of when i read about and think about the SotM scholarships.
As i indicated in my general post on SotM earlier, the official communication on the SotM scholarship program is fairly sparse. There was a call for scholarship applications in January, combined with some explanation of how applicants are chosen, which in my opinion raises more questions than it answers – i will get to that later.
Applicants were – like in previous years – required to use Google services to submit their applications, which clashes with both the OSMF FOSS policy and data protection requirements.
This is all that was publicly known about the scholarships until the conference when in the conference booklet the list of scholars and the list of members of the selection committee was published. The list of scholars is now also on the wiki. For comparison we also have the list of scholars from 2017 but nothing from the previous years.
Before i go into more detail about the selection and the selection process: not providing information on who is involved in selection and rating when you ask people to apply for something – no matter if scholarships or talks – is pretty much incompatible with the idea of a transparent organization. Providing this information not only afterwards but already when calling for people to apply is a matter of simple courtesy IMO. This is in particular the case if the selection process for the committee is also intransparent and undocumented.
We know from Christine’s more recent report about SotM-WG work that there were over 200 scholarship applications – and this is still the only publicly available information on the applications.
We know who was ultimately selected – this was published in the conference program booklet and later also on the wiki. Since we have no further information on the applicants, evaluating the selection is difficult. I therefore will not discuss the merit and qualification of the selected scholars individually. There is simply no basis for that without knowing about the other applicants. But as you will see there are some hints regarding the selection process in there.
Half of the scholars were men, the other half were women, the geographic distribution was as follows:
You can clearly see in this the aim to accomplish, at first glance, an impression of overall geographic and men/women balance – as well as a broad age distribution (although this might be purely incidental). This is interesting because the documented process does not in any way describe ensemble selection; it purely describes the independent assessment of individual applications, which is extremely improbable to result in an ensemble like this – no matter how the more than 200 applications are distributed. So you can conclude that the individual application rating – despite being the only part of the process that is documented – is ultimately of rather limited influence on the overall selection, and that the ensemble optimization regarding some idea of balance (which definitely includes gender and location – but possibly also other criteria) is much more significant, yet at the same time completely undocumented and intransparent.
If you look at the geographic distribution of the scholars more closely, you can also see that the geographic balance of the ensemble is less broad than it might seem at first glance. Yes, the most populated continents are all present, but clearly missing are scholars from:
The Middle East, Central Asia and Northern Africa were by the way already missing in 2017.
Note that an English-language bias is a deliberate choice in the scholarship program – which however inevitably also leads to a certain cultural bias. Combining a deliberate language selectivity with an ensemble optimization for coarse-grained geographic diversity leads both to the occurrence of gaps like the ones mentioned – at a more fine-grained geographic scale than the one optimized for – and to a huge imbalance in the chances of applicants with similar inherent qualifications depending on where they are from. And that qualified applications from certain regions were rare among the more than 200 is not a valid explanation here, because the selection already starts with the overall presentation of the scholarship program.
The other aspect we can look at based on the small amount of information we have is the composition of the selection committee. This is listed in the program booklet and is as follows for 2018 – name with country and the various relevant affiliations:
Doing a bit of counting we have:
Now i know that there are many people who see no problem in this and think the various affiliations are just testimony to the qualifications of these people. I strongly disagree. Let’s start with former scholars. I think it is a great idea to recruit former scholars of the conference for selection in later years. Having been a scholar certainly can give a useful perspective on what qualifications are important for a scholarship SotM visit to be useful. A bit of care needs to be taken to avoid this leading to self-replication of certain patterns of selection bias, but this is manageable as long as former scholars are not dominant overall. However, this in my eyes absolutely requires a rule that anyone who has been on the scholarship selection committee is disqualified from applying for a scholarship in that year and in the future. Everything else is inappropriate in my opinion. It is all right if the host country has a relatively strong representation, but three members (of 13) from the UK IMO really stretches the limits of what can be considered a geographically and culturally diverse committee.
The most important thing in my eyes however is the independence of the scholarship selection from the sponsoring of the conference – both on the giving and the receiving side. Even the appearance that sponsors have any influence on the selection of scholars of the conference would in my eyes completely delegitimize and undermine the whole program. This should rule out inclusion of people who work for potential sponsors and people involved with sponsorship acquisition.
I also think having people involved in both the program committee and the scholarship selection is a bad idea. There is inevitably quite a lot of overlap between the candidates of the two, but at the same time the criteria for selection are necessarily different. A committee member who is evaluating a talk submission from a person whose scholarship application he/she has just reviewed (or the other way round) cannot simply erase the opinion formed on the person in question based on different selection criteria. It is inevitable in such a situation that the evaluation criteria get mixed, and that is undesirable for a fair selection.
Overall i have two main points of critique for the SotM scholarship program:
The first is the intransparency – the lack of documentation and the lack of auditability of the selection process. Combined with the clear indication that ensemble optimization, not individual qualification assessment, is the main basis for the selection, and with the various problems in the composition of the scholarship selection committee, this leads me to conclude that a significant reform of the process would be important independent of my second point.
My second point is a more fundamental critique of the whole idea of a scholarship program in its current form. This connects to the quote i started this blog post with. The question i am asking myself – and which i think everyone should ask – is what the purpose of the scholarship program actually is. Yes, superficially it is to allow people to visit the conference who otherwise, due to limited financial means, could not. But i don’t think it is sufficient to leave it at that. Why do we think this is a good thing to do – and more importantly: why is it better to spend the money this way than for other purposes?
The whole idea of shipping people from their shabby villages to the big and shiny global OSM conference is highly problematic. Bringing people to where “OpenStreetMap is happening” at the moment will mainly perpetuate the fact that OpenStreetMap is a project of a small privileged world in Europe and North America. If we really want to make OpenStreetMap more global, we need to invest in bringing the idea of OpenStreetMap out into the world without colonizing it with the cultural values we have put on top of the basic idea of OpenStreetMap – to create a map by the people for the people. A scholar visiting SotM will primarily learn how the privileged and rich do OpenStreetMap and will likely bring that idea of OpenStreetMap back home with them, which can be counterproductive for the local community developing its own OSM identity.
Yes, this picture is obviously a bit one-sided, but it is an important counterpoint to the narrative of altruistically allowing people to visit SotM who otherwise could not. The idea that the scholarships are primarily for the benefit of the scholars and the local communities they come from is nonsense. They are at least as much for us – so we can feel better in our comfortable lives in Europe and North America because we get a bit of superficial diversity without any substantial endangerment of the status quo, because we ship the people here, where they have to adjust to our culture and not the other way round, and we ship them back once they have served their purpose.
So whatever opinion you develop regarding SotM scholarships – don’t make the mistake of taking the simplified view that this is a simple altruistic endeavor to help people with limited financial means. I am not saying scholarships cannot make sense under any conditions, but i think so far no one has presented a well balanced and self-critical concept of how SotM scholarships can work in a morally sound way and of what a scholarship program needs to look like to satisfy this.
December 9, 2018
It is December again and that means – like last year – time for OSMF board elections. And like last year i urge all OSMF members to vote and to vote responsibly in the interest of the OSM community. If you are eligible to vote you will have received a mail with a voting link. If you have not although you think you are eligible you should contact the MWG.
Like last year there are two seats (of seven in total) up for election, but unlike last year none of the existing board members whose seats are up for election is re-running. So while last year we had essentially only one truly contested seat – with Paul re-running without significant opposition, so it was nearly certain he would be re-elected – this year the race is open for both seats.
Last year i mentioned the elections were pretty significant for the direction of the OSMF, and the results – while pretty narrow in the end – ultimately pointed in a direction of less influence for local hobby mappers and larger influence for external organizations, with the number of HOT voting members on the OSMF board rising from two to three and the number of OSMF board members doing paid work related to OSM or working for an organizational OSM stakeholder staying at a constant high of five of seven members.
As a result it was visible during the last year that the OSMF board was increasingly struggling to actually make decisions in the interest of the local hobby mapper community. I think you can observe an increasing divide between parts of the board and the mapper community. Some of the board members seem much more interested in working together with their peers on the level of organizational stakeholders and hardly engage in eye-level communication with the hobby mapper community any more – and as a result are not aware of the matters normal mappers all over the world care about and that are of importance for the future of the project.
At the same time it has become clear over the recent months that the previous trend of decreasing cultural and geographic diversity in the OSMF membership has continued – partly driven by initiatives of corporations urging their employees to sign up as OSMF members. Nearly half of the OSMF members eligible to vote are now from the United States, which has more than twice the number of OSMF members per active mapper (including SEO spammers) compared to the best represented countries in continental Europe. In a way you could say the OSMF membership structure seems to align itself to the composition of the board (which is kind of odd since you’d normally expect it to happen the other way round). There were some promising initiatives to recruit more hobby mapper members in non-English speaking countries this year (which i wrote about) but the effect of this was not able to reverse the overall trend.
With that as background, this year’s elections are less of a crossroads decision than last year’s, where the major direction was decided. That does not mean this year’s elections are not significant. And since this time two seats are up for a fully open election, there is also in principle the option to revise last year’s decision on the overall direction. I have heard some people say that all of the candidates this year are equally qualified. Depending on your idea of qualification that might be true, but there are huge differences in what the candidates represent – maybe even larger than last year.
From my perspective the elections this year will decide whether the divide between the OSMF board and the mapper community that has become increasingly visible during the last year will widen, maybe to the point of a full breakup, or whether the board is able to change direction and steer back towards the OSM community in a meaningful way. And the ability of the candidates to accomplish that seems to differ a lot. Note i am not talking about finding a compromise here, with the local hobby mappers bending to satisfy the interests of organizational stakeholders. This – with the board in many aspects, in particular on the matter of the organized editing policy, representing the organizational interests towards the mapper community and the working groups getting worn down in the middle – is essentially what we have seen last year and what has led to the widening divide i observe.
So OSMF members: Choose wisely. There is quite a lot of material available on the elections:
November 23, 2018
I have mentioned several times already that i wanted to write a blog post on verifiability in OpenStreetMap. The need for that, from my perspective, has grown over the last 1-2 years as it has become increasingly common in discussions in the OSM community for people to either flatly reject verifiability as a principle or to try to weasel around it with flimsy arguments.
OpenStreetMap was founded and became successful based on the idea of collecting local geographic knowledge of the world – and collecting this knowledge through local people participating in OpenStreetMap and sharing their local geographic knowledge in a common database. The fairly anarchic form in which this happens – with a lot of freedom for mappers in how to document their local geography and very few firm rules – turned out to be very successful in attracting people to participate and in allowing the worldwide geography to be represented in its diversity.
The one key rule that holds all of this together is the verifiability principle. Verifiability is the most important inner rule of OpenStreetMap (as opposed to outer rules like the legality of sources used and information entered) but it is also the most frequently misunderstood one. Verifiability is the way in which OSM ensures the cohesion of the OSM community, both in the present and in the future. Only through the verifiability principle can we ensure that new mappers coming to OSM will have a usable starting point for mapping their local environment – no matter where they are, what they want to map and what their personal and cultural background is.
Verifiability is not to be confused with accuracy of mapping. It is not the endpoint ideal of a scale from very inaccurate to very precise. Verifiability is a criterion for the nature of the statements we record in the database. Only verifiable statements can objectively be characterized as being accurate to a certain degree.
Verifiability is also not about on-the-ground mapping vs. armchair mapping. It does not require all data entered to actually be verified locally on the ground, it just requires the practical possibility of doing so. Assessing whether certain information is verifiable is much more difficult when doing armchair mapping than when mapping on the ground – armchair mapping is therefore a more difficult and more demanding task regarding the abilities and competence of the mapper – but you can in principle map a lot of verifiable information from images, sometimes better than when you are on the ground.
Verifiability is also not to be confused with Verificationism, which rejects non-verifiable statements as not meaningful. OpenStreetMap does not pass judgement on the value of non-verifiable data by excluding it from its scope. It just says that the project cannot include this kind of data in its database because it cannot be maintained under the project’s paradigm. The viability of OpenStreetMap as a project depends on it limiting its scope to verifiable statements. And in practical application (i.e. when resolving conflicts) it is also often better to regard verifiability as meaning falsifiability.
In a nutshell, verifiability means that for any statement in the OpenStreetMap database a different mapper needs to be able to objectively determine and demonstrate whether the statement is true or false, even without the same sources used by the original mapper. Those who – based on a philosophy of universal relativism – want to say that in most cases statements are neither clearly true nor false have not understood the fundamental idea behind verifiability; they confuse it with precision of mapping while assuming a priori that everything is verifiable. Verifiability is about the fundamental possibility of objectively assessing the truthfulness of a statement based on the observable geographic reality.
This contrasts with the Wikipedia project, which takes a very different approach to recording information. Wikipedia has its own verifiability principle, but it has a meaning completely different from that in OSM. Verifiability in Wikipedia means statements need to be socially accepted as true. This is determined based on a fairly traditional view of the reputation of sources. Such a system of reputation is obviously very culture specific, so Wikipedia tries to ensure social cohesion in its community by allowing different contradicting statements and beliefs to be recorded (in particular of course in different language projects but also to some extent within a single language). Still, conflicts between different beliefs and viewpoints, struggles for dominance between different political or social groups, and contradicting statements in the different language versions representing different culture specific views of the world are a common occurrence and a defining element of the project.
On a very fundamental level this difference between OpenStreetMap and Wikipedia kind of mirrors the difference between natural sciences and social sciences. This however does not mean that OpenStreetMap can only record physical geography features. A huge number of cultural geography elements are empirically verifiable.
Still, to many people with a social sciences or Wikipedia background the verifiability principle is very inconvenient. There is a broad desire among people to record statements in OpenStreetMap that are part of their perception of the geography even if they differ fundamentally from the perception of others and are not practically verifiable.
One of the ideas i often hear in this context is that verifiability is an old fashioned conservative relic that prevents progress – this is kind of ironic because the idea of verifiability directly stems from the values of enlightenment. Fittingly some of the specific non-verifiable mapping ideas communicated seem to have an underlying Counter-Enlightenment or Romantic philosophy.
In addition pressure to include non-verifiable data in the OpenStreetMap database also comes from people who see OpenStreetMap less as a collection of local knowledge and more as a collection of useful and suitably preprocessed cartographic data – ignoring the fact that the success of OpenStreetMap is largely due to specifically not taking this approach. The desire to include data perceived to be useful independent of its verifiability and origin is also pretty widespread in the OSM community. Such desires are usually fairly short sighted and self absorbed. Usefulness of information is by definition subjective (something is useful for a specific person in a specific situation) and relative (bow and arrow might be very useful as weapons but will likely become much less useful once you have access to a gun). An OpenStreetMap that replaces verifiability with usefulness would soon become obsolete because usefulness in contrast to verifiability is not a stable characteristic.
And what all the opponents of verifiability seem to ignore is that giving in to their desires would create huge problems for the social cohesion of the OpenStreetMap project and its ability to continue working towards its goal to create a crowd sourced database of the local knowledge of the world geography in all its diversity. The objectively observable geographic reality as the basis of all data in OpenStreetMap is the fundamental approach through which the project connects very different people from all over the world, many of whom could outside of OpenStreetMap hardly communicate with each other, to cooperate and share their local geographic knowledge. Without this as a connecting principle OpenStreetMap would not function and trying to adopt verifiability à la Wikipedia instead would not only import all of Wikipedia’s problems, in particular the constant struggle for opinion leadership, it would also ultimately not be suitable for the kind of information recorded in OpenStreetMap and the way mappers work in the project.
As already hinted above we practically already have a lot of non-verifiable data in OpenStreetMap. So far this mostly takes the form of an inner fork – there are mappers who actively map and maintain it but the vast majority of the mapper community practically ignores this. There are however also places where non-verifiable statements interfere with normal mapping in OSM – in particular by people trying to re-shape existing verifiable tags with additional non-verifiable meanings.
Non-verifiable data can broadly be split into two categories: non-verifiable tags and non-verifiable geometries. The most widespread type of non-verifiable geometries are abstract polygon drawings. The traditional approach in OSM to map two dimensional features that verifiably exist but have no verifiable extent is to map them with a node. The node location for a feature that can be localized in some way will usually converge to a verifiable position even if the variance of individual placements of such a node can be very high. But with the argument of practical usefulness or based on a dogmatic belief that every two dimensional entity should be mapped with a polygon in OSM quite a few mappers prefer to sketch a polygon in such cases without a verifiable basis for its geometry.
Among non-verifiable tags the most widespread are non-verifiable classifications. Something like: i view feature X to be of class A but i can’t really tell what A actually means in a general, abstract form so others would be able to verify my classification. One of the most widespread tags of this type is the tracktype tag which has been used since the very early days of OSM. The psychological background of this kind of tagging is usually that people want to develop a simple one-dimensional classification system for a complex multi-dimensional reality but are either not able or not willing to actually think this through into a consistent and practically verifiable definition.
The other type of non-verifiable tag that in particular more recently became quite popular is computable information. This means statements that can be derived from either other data in the OSM database or from outside data but that cannot be practically verified by mappers without performing the computation in question. Initiatives for adding such data are always based on the usefulness argument. And even though it is quite evident that adding such data to the OpenStreetMap database does not make much sense – both because of the verifiability principle and because of the problem of data maintenance – the practical desire to have certain computable information in the database can be very strong.
What would help to reduce this conflict in OpenStreetMap between those who value the verifiability principle and those who see this as an inconvenient obstacle to adding useful data would be to start a separate database project to record such non-verifiable add-on data for OpenStreetMap. But although this is technically quite feasible the need to build a separate volunteer community for this creates a significant hurdle. One of the motives for people pushing for non-verifiable data in OSM is to get the existing mapping community to create and maintain this data.
The ultimate question is of course if verifiability will prevail in the future of OpenStreetMap in the light of all of this. I don’t know. It depends on whether the mapper community stands behind this principle or not. What i do know, and i tried to explain above, is that OpenStreetMap has no long term future without the verifiability principle as a practically relevant rule (i.e. one that is not only there on paper but one the community actually adheres to in mapping). So it would be essential for OpenStreetMap’s future to communicate clearly to every new mapper that their being welcome in the project is contingent on acceptance and appreciation of the verifiability principle as one of the project’s core values. I think this has been neglected during the past years and needs to be corrected to ensure the future viability of the project.
November 11, 2018
At the last OSM hack weekend i worked on something that has been a sore spot of many map styles for quite some time. The problem is pattern images. I have written on the subject of designing area patterns for use in maps frequently in the past but this always was from a design perspective. From the technical side the story is the following:
In the beginning area fill patterns meant raster images because renderers did not support anything else. At some point Mapnik started offering support for SVG pattern images. This was and still is severely limited in terms of support of the SVG feature set which requires sanitizing SVG files to only use supported features. But the real problem is that Mapnik SVG rendering still seems fundamentally broken under certain conditions (see here and here). So the only safe way to get correct and consistent rendering of patterns in maps for screen use is to use PNG pattern images.
At the same time, when you want to render maps for printing, Mapnik does not scale PNG based patterns when you change the rendering resolution, leading to incorrect pattern scaling relative to other elements in the map. So for printing maps you want SVG pattern images.
To solve this dilemma i wrote a script that generates the PNG pattern images used for rendering (as well as the preview images shown below) from the SVG sources. This script is for CartoCSS+Mapnik map styles but it can of course also be adapted for other frameworks.
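A minimal sketch of what such a conversion step could look like – assuming rasterization with `rsvg-convert` and a hypothetical file layout and set of scale factors; the actual script may work quite differently:

```python
# Hypothetical sketch: rasterize sanitized SVG pattern sources to PNG
# at one or more scale factors (e.g. a 2x version for print output).
# Assumes the rsvg-convert tool is installed when the commands are run.
import subprocess
from pathlib import Path

def convert_cmd(svg, png, scale):
    """Build the rsvg-convert call rasterizing svg to png at the given scale."""
    return ["rsvg-convert", "--zoom", str(scale),
            "--output", str(png), str(svg)]

def process_patterns(src_dir, out_dir, scales=(1.0, 2.0)):
    """Collect conversion commands for every pattern SVG in src_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cmds = []
    for svg in sorted(Path(src_dir).glob("*.svg")):
        for scale in scales:
            suffix = "" if scale == 1.0 else f"_{scale:g}x"
            png = out / f"{svg.stem}{suffix}.png"
            cmds.append(convert_cmd(svg, png, scale))
    return cmds  # run each with subprocess.run(cmd, check=True)
```

Keeping the command construction separate from execution makes the conversion step easy to test and to adapt to a different rasterizer.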
By the way client side rendered maps with continuous zooming have their own specific troubles with using area patterns – which is one of the reasons why you rarely see patterns being used in those maps.
Most of the planning and script writing work i did at the hack weekend but preparing some of the patterns into a form suitable for automated processing took a bit more time. This was in particular a problem for the wetland patterns – which use multiple colors and are generated using raster processing – as well as for the rock pattern – which uses the jsdotpattern outline feature, drawing a casing around geometries with white strokes, and needed a lot of processing to generate a visually identical plain color geometry. There are also a number of legacy patterns, in particular various hatchings, that i did not yet look at.
Here is the current set of patterns in the alternative-colors style in the form of the previews generated by the script each tiled and cropped to 128×128 pixel size.
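The tiling and cropping used for these previews is simple enough to sketch – here with numpy, assuming the pattern tile is given as an `(h, w, channels)` array (the function name is mine, not from the actual script):

```python
import numpy as np

def preview(pattern, size=128):
    """Tile a pattern image given as an (h, w, channels) array and
    crop the result to size x size pixels."""
    h, w = pattern.shape[:2]
    reps_y = -(-size // h)  # ceiling division: enough repeats to cover
    reps_x = -(-size // w)
    tiled = np.tile(pattern, (reps_y, reps_x, 1))
    return tiled[:size, :size]

# A 50x50 RGB pattern yields a 128x128 preview.
print(preview(np.zeros((50, 50, 3))).shape)  # (128, 128, 3)
```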
November 3, 2018
Earlier this week i gave a talk in Dresden for the local section of the DGfK about OpenStreetMap cartography (in German). Since i got several requests for the slides of this talk i am publishing them here.
The talk is primarily about open community maps developed within the project in contrast to commercial OSM based maps but in preparation for the talk i also noticed that in terms of cartographic innovation commercial OSM maps are actually not really that significant, neither in the past nor at present. This is kind of disturbing considering that development of digital cartography outside of OpenStreetMap does of course not stand still. So while OpenStreetMap pioneered many techniques of digital automated rule based cartography in the past, at present it seems that the commercial OSM data users are satisfied with focusing on the technological side and show very little interest in actual cartographic progress.
And this despite the fact that there are actually quite a lot of aspects of the cartography of digital interactive maps which – beyond testing random ideas and playing around on a technical level – have hardly ever been analyzed and discussed.
November 1, 2018
On December 15 this year’s elections for the OSMF board are scheduled and in contrast to last year i want to cover this matter here earlier – with a call for all local craft mappers and other hobbyists active in the project, those who are the heart and soul of the project, to become members of the OSMF to be able to participate in the elections. This is only possible for those who join by November 15 (30 days before the elections).
I wrote this call only in German because my main motivation for it is that the OSMF is currently fairly bad – and getting worse – at representing the global OSM community and its interests, both in its members and in the board. Part of the reason for this lack of proper representation is the dominance of the English language in the political discourse around the OSMF while the overwhelming majority of mappers in the project are not native English speakers. So i am publishing this call in German – which is the only language i am able to write properly other than English – with the explicit encouragement to local mappers all around the world to write your own calls in your native language to get your fellow mappers to voice their interests in the OSMF by becoming members and participating in votes. You are free to use and translate my German explanations but you can of course also present your own ideas and reasons. If you publish such a call in a different language i invite you to let me know in the comments so i can add a link.
Local mappers – take ownership of your local map. And local mappers everywhere: Take ownership of the OSMF together to make sure the OSMF represents your interests and the spirit of the OpenStreetMap project.
Update: We now have a call in French – which is nice since French mappers are currently particularly underrepresented in the OSMF.
October 18, 2018
Back in July 2017 when the first Sentinel-3 Level-2 data was released i did not write a detailed report like i did for the Level-1 data because i wanted to wait for the rest of the products – which are significantly more interesting, at least for land areas – to give a more complete picture and allow for an overall assessment. It took more than another year for this to finally happen a few days ago – with a total time from satellite launch to the full data release of nearly 32 months, or more than 1/3 of the satellite lifetime.
Here is the updated timeline for Sentinel data product publications:
None the less here is a quick introduction to this newly available data. You should probably read what i wrote on the Level-1 data first (part 1, part 2 and part 3) – much of which applies in analogy to the Level-2 data as well.
Data access to the land products works essentially as already described for the Level 1 data. Currently this still works through a separate provisional download portal which requires no registration but it will likely move to the general Sentinel data hub where registration is required.
The water data products are however not available through the ESA infrastructure – they can be accessed through EUMETSAT. They use the same download software but in an older version with various flaws. As you can see the package footprint display is broken and you can also apparently not specify a spatial search area.
For the new synergy data products there are no preview images available at the moment and the previews of the other Level 2 products are in parts of rather limited use.
Both the ESA and the EUMETSAT interface can be scripted so you don’t have to use the UI.
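For example, a product search against the hub’s OpenSearch interface can be scripted roughly like this – the endpoint URL and the query keywords reflect my understanding of the standard Copernicus hub API and should be checked against the current documentation; downloading the results additionally requires the registration mentioned above:

```python
# Sketch: assemble an OpenSearch query for Sentinel-3 synergy products.
# The endpoint and the platformname/producttype keywords are assumptions
# based on the common Copernicus data hub interface.
from urllib.parse import urlencode

SEARCH_URL = "https://scihub.copernicus.eu/dhus/search"

def build_search(platform="Sentinel-3", producttype="SY_2_SYN___",
                 start=0, rows=25):
    """Build a paged product search URL (fetch it with any HTTP client)."""
    query = f"platformname:{platform} AND producttype:{producttype}"
    params = urlencode({"q": query, "start": start, "rows": rows})
    return f"{SEARCH_URL}?{params}"

print(build_search())
```

Paging with `start`/`rows` allows walking through large result sets and feeding the package identifiers into a batch download.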
Here is the full table of now publicly available Sentinel-3A data products. The first three products are the ones i already discussed. The last four are the newly released ones. These are so called SYNERGY products which combine both OLCI and SLSTR data.
The whole thing is complicated not only because of the two separate sensors but also because in addition you have separate products for land and water. This idea is something they apparently copied from how MODIS data products have been planned, with separate data centers being responsible for different fields of application and the corresponding products. But they really messed this up because while MODIS land products include inland waterbodies and near coastal and polar region water areas, the Sentinel-3 land products just flatly leave out anything not strictly land and the water products leave out anything not strictly water. That excludes in particular clouds as well as sea ice from both the land and the water products. Needless to say, what is land and what is water is not a very reliably defined distinction in the first place.
The whole idea of tightly masking any kind of area not considered applicable for the processing in question is pretty bad. The sensible thing to do is to flag these areas as such in a separate quality data set but to process them none the less. This way you leave the decision how to use the data open for the user. Such an approach is also established practice in higher level image data products elsewhere. Masking the data instead with an arbitrary and low quality mask imposes mandatory mediocrity onto all data users.
This whole product design concept communicates a highly questionable, arrogant and narrow minded culture and attitude that in many ways sharply contrasts with the basic ideas of the Sentinel program. While openly making available this data is supposed to enable and support new innovative applications those people who designed these products quite clearly think they know better than any of the potential data users how the data can be used.
There is no sensible reason to not include inland waterbodies and coastal and sea ice covered waters in the land data products, this does not even reduce the data volume in a meaningful way. The very idea that an image pixel can only be either water or land is in itself pretty absurd.
This is something to keep in mind when looking at the available data products – most of the land products have pixels decreed to be non-land zeroed out and the water products have the non-water pixels zeroed out. Some data in addition has cloud masking applied. There are a few exceptions from this division though, showing that this is not really a technical limitation but an arbitrary choice.
The first data product i am going to discuss is the Level 2 OLCI land product which is available in the full and the reduced resolution variant. Like the Level 1 products the full resolution data is distributed in 3 minute sections of the recordings along the orbital path while the reduced resolution data is packaged for a whole dayside part of the orbit based on the fairly narrow sun angle recording limits of the OLCI instrument. In the original product grid this means a size of usually 4865×4091 pixels for the full resolution data and about 1217×15000 pixels for a half orbit. Package sizes for the full resolution version vary quite strongly due to the nodata masking (see below) between less than 100MB and about 200MB. The reduced resolution packages are usually around 200MB in size.
The land surface reflectance product is actually not what the product description seems to promise because it contains reflectance values only for two of the OLCI spectral channels (Oa10 and Oa17, that is red and NIR) plus a few aggregated values calculated from these and other spectral channels. Here is a visualization of them:
As you can see both water areas and clouds are masked – except for the water vapor data where the water is not masked. In addition also sea ice and snow/land ice are not included.
So this data product is kind of a cheat – in fact it is a really strange combination of some vegetation and reflectance data (for vegetated land areas only) and atmosphere related data for land and water thrown together. As a specialized vegetation data product it can be pretty useful but as the only OLCI Level 2 data product it is a complete failure. I mean you have the OLCI instrument with 21 spectral channels and the only data above the raw Level 1 radiances you can get is this. So far the only real land surface reflectance data you can get is in the synergy products (see below) – which has its own limitations.
The water masking at least seems to be based on actual water classification and not on an arbitrary and inaccurate pre-defined land water mask as in the temperature and synergy products.
As a water product this is – as mentioned – only available from the EUMETSAT download infrastructure and is only provided for one year after recording there.
In contrast to the OLCI land surface reflectance product the water surface reflectance product is fairly solid and complete. It contains atmosphere and view direction compensated water surface reflectances for all the OLCI spectral channels. Packaging and grid sizes are the same as with the other OLCI products and like with the land products the package size varies – full resolution packages are up to 500-600MB while reduced resolution packages are typically about 300-400MB.
Here is an example of a visual color rendering from a full resolution package.
As you can see not only the land is masked but also clouds and sun glint on the right side. The masking is not very precise but rather fairly conservative. And as indicated in the introduction sea ice is not included either.
Quality of the data processing seems pretty decent. There are some problems with thin clouds and their shadows and the results degrade a bit towards the sun glint areas but overall this is quite competitive in quality.
This SLSTR based product contains an estimate of the land surface temperature based on the thermal infrared recordings and as such is available both for the day and the night side of the orbit. As a land product it is only available from the ESA download infrastructure. The Level 1 thermal infrared data is already quantified in temperature units (Kelvin) but it is the raw, unprocessed data. I showed illustrations based on the Level 1 data previously. Here is an example for comparison of how this differs from the Level 2 data – with both the land and water versions shown, in an Antarctic setting with very little open water at this time of year.
The Level 2 land surface temperature data is made available in a form quite similar to the Level 1 data with separate files for the geolocation data and other supplementary information. One particularity of the Level 2 temperature data (both land and water) is that the near real time version is provided in the 3 minute orbit segment tiling like the Level 1 data while the not time critical data is made available for the whole orbit in one package. In the coordinate system that is a long strip of data about 1500×40394 pixels which wraps once around the globe. A 3 minute package is typically about 60-70MB in size, a full orbit package about 1.8GB.
As you can see the land masking is not based on actual land/water detection but on a fixed ocean mask. And apart from the already mentioned sea ice this mask excludes the Antarctic ice shelves – so if you happen to be interested in temperatures there you are out of luck. And to demonstrate that the world does not implode when you calculate land surface temperatures on a water area – inland waterbodies including the Caspian Sea are included.
No cloud masking seems to be performed in this data product but there seems to be no data generated for very cold areas of the Antarctic interior as well as very hot areas above 345K (visible in the Iran image above).
This is in principle more or less the water version of the previous product but practically it is very different in a number of ways.
In contrast to the other data products all data is in one file – which is none the less zipped into the usual package. The geolocation data and other supplemental information is all in that file. This is not a big deal but the question is of course why this is inconsistent with the other products.
Also clouds are generally masked in the sea surface temperature data – though relatively sparsely. And as already indicated sea ice is not included either – though this does not seem to be masked based on actual sea ice detection in the data but from some unknown external data source.
Like with the land temperature data the near real time version comes in orbit segment tiling while the not time critical data is in whole orbit packages. Package size is on average smaller than for land data with about 20MB for a typical 3 minute package and about 600MB for a full orbit package.
So far the products described were those already released in 2017. The now newly released products are so called synergy products combining the SLSTR and OLCI data. The land surface reflectance synergy product is the only Level 2 data product for land containing most of the spectral channels so it is in a way the main Level 2 land data product. This is distributed in the normal 3 minute orbit sections with data from both the OLCI and the SLSTR reflective spectral channels being provided in the OLCI 300m grid (that is usually 4865×4091 pixels). These packages vary strongly in size between about 200MB and 1GB. Here is how this looks in the visible light spectral channels.
A number of things can be observed from this. Clouds are masked, and snow and ice seem to be prone to being misinterpreted as clouds and masked as well. Water areas are also not included (including inland water).
As said the SLSTR spectral bands are also included. And since these are re-processed into the OLCI grid they don’t suffer from the SLSTR duplicate pixel geolocation data errors i discussed previously.
As you can see however the position of the oblique view is now completely skewed – which is further emphasized by the water and cloud masking not matching the data. So i really can’t say i am in any way confident that these people have any idea what they are doing here w.r.t. the geolocation data.
There are also some other serious quality problems. The most obvious one is a usually fairly visible step in the pixel values at the edge of the oblique SLSTR view (see my first Sentinel-3 data review for details). This is because the atmosphere and view direction compensation differs depending on the data available. The claim is that using the oblique SLSTR data makes this compensation more accurate. Here the relevant quote from the product notice on that:
As the aerosol retrieval is supposed to be more accurate on “dual-view” area, a transition between “nadir only” and “dual view” area can be observed in some SYN L2 products. In a majority of products, this transition is visible through sharp differences in the Aerosol Optical thickness Values.
If that is the case the question is of course why they don’t compensate for the systematic offset leading to the visible edge.
Another serious problem is that the nodata mask (that is the set of pixels marked with a dedicated nodata value – in this case -10000) differs between spectral bands. Apparently not all of the nodata pixels are either water areas or clouds, some are also seemingly set as invalid because atmosphere compensation produced extreme values. This occurs in particular with very bright areas (snow) and very dark areas and this invalid value masking seems to be done separately for each spectral band.
Here an example from an area where this is particularly visible. First the version with all pixels with a nodata value in any of the spectral bands set to black
And for comparison the version with nodata pixels set to zero independently for every band and then assembled – so the black pixels are only those which are nodata pixels in all of the spectral bands.
The yellow spots are what seems to be oscillations in the atmosphere model resulting in invalid values in some of the spectral bands (in particular the blue ones), leading to ghost nodata spots where neither clouds nor water areas are the reason and surface colors are neither extremely bright nor dark. And even around these spots in the first variant you can see a yellow halo of severely distorted surface reflectance values, meaning that the masking does not actually exclude all pixels where the atmosphere compensation fails to work correctly while it does exclude areas where reasonably accurate values exist (like very bright snow areas).
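Expressed in code, the two variants differ only in how the per-band nodata masks are combined – a small numpy sketch using the -10000 nodata value mentioned above:

```python
import numpy as np

NODATA = -10000

def mask_any(bands):
    """True where ANY spectral band is nodata (the first, stricter variant)."""
    return np.any(bands == NODATA, axis=0)

def mask_all(bands):
    """True where ALL spectral bands are nodata (the second variant)."""
    return np.all(bands == NODATA, axis=0)

# Tiny demo: 2 bands of 2x2 pixels; one pixel is invalid in one band only.
bands = np.array([[[0.1, NODATA], [NODATA, 0.4]],
                  [[0.2, NODATA], [0.3, 0.5]]], dtype=float)
print(mask_any(bands))  # flags pixels (0,1) and (1,0)
print(mask_all(bands))  # flags only pixel (0,1)
```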
What you also can see from this sample area is that water masking is fairly arbitrary. The common mask for all spectral bands seems to be in this area both very inaccurate and with a systematic offset.
So even beyond the principal problem of the water masking there are also a number of serious other issues that make this data product much less useful than it could be.
The other newly released data products are vegetation products consisting of four aggregate bands combined from several OLCI and SLSTR bands meant for deriving vegetation indices. These are provided in a plate carrée projection with a nominal 1km resolution at the equator. They are apparently meant as continuity data emulating a historic data product and are therefore of very limited use as a standalone data product. The one day and ten day products are packaged in regional crops for different parts of the world. Here are a few examples showing simple NDVI calculations.
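The NDVI computation itself is just a normalized band ratio – here is a sketch of the calculation from red and near infrared reflectances (extraction of the bands from the product files is omitted and the band naming in the actual product differs):

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return (nir - red) / (nir + red)

# Dense vegetation reflects much more NIR than red -> NDVI near 0.8 here.
print(ndvi([0.05, 0.1], [0.45, 0.1]))
```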
Apart from the retrofit character of these products – with their compound spectral bands and reduced resolution only publication – the very idea of offering temporally aggregated data products makes a lot of sense: some of the most popular MODIS products are such aggregates, most of them either daily, 8-day, 16-day, monthly or yearly. But to produce a useful temporal aggregate product you first need a solid base product to start with of course.
You have seen what products are available and what issues they come with. Depending on your application these issues might be a problem or they might not. If you are looking for an alternative data source for applications where you currently use MODIS data you are likely out of luck because for most of the widely used MODIS data products there is no real functional replacement in these data products.
Is there a chance for future new products or improvements of existing products improving this situation? Maybe. But i kind of have the feeling this is a pretty severely stalled situation. The SLSTR geolocation problem i pointed out in my first review is still unfixed – and this would be a simple bugfix as far as i can tell, nothing with serious political implications. With many of the design issues with the products discussed here it seems these are not just simple minor neglects, these are systemic planning problems – probably largely the result of the complete lack of familiarity with the very idea of open data products. This is a problem that might take decades to overcome – paraphrasing the famous Max Planck quote: The culture of open data does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die (or retire), and a new generation grows up that is familiar with it.
October 13, 2018
During the last days i updated this blog to a new WordPress version and a new PHP version which resulted in some unplanned downtime yesterday because of some messed up testing due to delayed configuration changes. But everything should be working now again. If there are any problems please let me know in the comments.
Installing this update i noticed that over the last years i have shown more than 1700 images here – to illustrate, a screenshot of the media library:
October 10, 2018
I was preparing material for a talk i am going to give at the Integeo conference in Frankfurt next week and it reminded me of a topic i wanted to write about for some time here. The talk is going to be about the role OpenStreetMap and open geodata in general had and have for the development of digital cartography from a mere dematerialization of pre-digital workflows towards rule based cartography where the work of a cartographer is no more primarily the processing of concrete data but the development of rules defining the automated generation of a cartographic visualization from generic geodata. I previously presented this idea with a somewhat different focus back at the FOSSGIS conference in Bonn.
Thinking about this subject i remembered the realization i had some time ago that while the success of the OpenStreetMap project is usually attributed to the openness and the community production of the data, this is only half of the story. I will go out on a limb here and say that – although i can obviously not prove this – the success of OpenStreetMap is at least to the same extent the result of OSM taking the revolutionary approach of producing a completely generic database of geographic data. In the domain of cartography this was completely unheard of. And i am not even sure if this was a conscious choice of the project at the beginning or if it was just the luck of approaching the subject without the preconceptions most cartographers had at the time.
And today it is my perception that it is not so much the volume of data, its quality or its free nature that makes even more conservative people in the field of cartography realize the significance of OpenStreetMap but its ability to maintain and widen its position in a quickly changing world with very little change in the underlying base technology and with hardly any firm governance. There have been quite a few voices in the OSM community in the past few years criticizing technological stagnation within the project – a critique that is in parts not without basis. But one of the most amazing things about OSM is that despite such issues the project has been able to manage its growth over the past 14 years without fully re-building the foundations of the project every few years like almost any comparable more traditional project would have had to. And there is no reason to assume that this cannot continue for the foreseeable future based on the same fundamental principles – although i specifically refer only to the core principles of the project and not to everything that developed around it.
All good you could think and proudly lean back but that is not the whole story of course. Since OpenStreetMap at the beginning was relatively alone with its revolutionary approach to cartography it had to do most of the things on its own and out of necessity became a significant innovative force in cartographic data processing. Later the huge domain of Open Source geodata processing and open data formats and service standards developed parallel to OpenStreetMap with also a few tools having OSM data processing as a primary initial use case so OpenStreetMap continued in many ways to drive innovation in cartographic technology (although you need to also give some credit to Google here of course).
With institutional cartography starting to adopt the ideas of rule based cartographic design these tools and the possibilities they offer are not exclusive to OSM any more though. While 5-8 years ago you could usually spot an OSM based map from a distance simply due to the unique design aspects resulting from the underlying technologies this is no more the case today. Map producers frequently mix OSM and non-OSM data, for example based on regional cuts, without this being noticeable without a close look at the data.
In other words: OpenStreetMap has lost its complete dominance of the technological field of rule based digital cartography. This is not a bad thing at all, since OSM is not a technology project – it is a crowd sourced geographic data acquisition project, and in that domain its dominance is increasing, not decreasing. Still, this development has a significant impact on the project, because OSM no longer operates in the separate ecosystem it originally formed by being so very different from traditional cartography – an ecosystem in which the only visible competition came from the commercial non-traditional cartography projects (Google, Here etc.). Now this field has both widened and flattened. And in this widened field other data sources are in use: in particular regional data sources, but also global data sets generated with automated methods, crowd sourced data like that from Wikidata, and value added derivatives of OSM data. OSM competes with these on a fine grained level, without there being much technological separation any more due to different cartographic traditions.
As said, the risk OpenStreetMap faces as a result of this development is ultimately not to its position as an open geodata producer. The main risk in my eyes comes from the reflexes with which many people in the OSM community seem to react to this development, because they at least subconsciously perceive it as a threat. I see two main trends here:
I think these two trends – no matter if they are exclusively a reaction to the developments described above or if other factors contribute to them – are probably among the top challenges OpenStreetMap faces these days. As said, the project’s core ideas (generic, verifiable geo-data based on the local knowledge of its contributors) are solid and could likely carry the project for the foreseeable future – but only if the OSM community continues to put trust and support in these principles.
I will probably write separately in more detail about the anti-verifiability tendencies in OSM in a future post.
Another development related to this is that while in the OpenStreetMap ecosystem we have an almost universal dominance of open source software, the world of institutional cartography is strongly shaped by proprietary software as well. It is no coincidence that Esri a few months ago showed a map service based on proprietary software that clearly imitates the OSM standard style – which is kind of a symbol for rule based cartography in OpenStreetMap. It is clear that companies offering proprietary software will not stay away from rule based cartography. And with institutional customers they are not in a bad starting position here.
This is of course less of a problem directly for OpenStreetMap and more for the OSGeo world.