December 5, 2016
by chris

Sentinel-3 – a first look at the data, part 2

This post is a continuation of the first look report on the recently released Sentinel-3 level 1 data from the OLCI and SLSTR instruments. The first part provided general context and discussed the form the data is distributed in; this second part will look at the data itself in more detail.

The product data grid

I mentioned at the end of the first part that the form the imagery is distributed in is fairly strange. I will here try to explain in more detail what i mean. Here is what the OLCI data looks like for the visual range spectral bands. I combined the single package data shown previously with the next ‘scene’, generating a somewhat longer strip:

Sentinel-3 OLCI two image strip in product grid

What is strange about this is that this is not actually the view recorded by the satellite. For comparison here is a MODIS image – same day, same area but due to the different orbits slightly further east. This is from the MOD02 processing level, meaning radiometrically calibrated radiances without any geometric processing, i.e. this is what the satellite actually sees.

MODIS MOD02 image with geometry as recorded

You can observe the distortion at the edges of the swath – kind of like a fisheye photograph. This results primarily from the earth curvature – remember, these are wide angle cameras; in the case of MODIS the field of view covers more than 90 degrees. With OLCI you would expect something similar, with somewhat less distortion due to the narrower field of view and asymmetric due to the tilted view. We don’t get this because the data is already re-sampled to a different coordinate system. That coordinate system is still a view oriented system that maps the image samples onto an ellipsoid based on the position of the satellite. In the ESA documentation this is called the product grid or quasi-cartesian coordinate system.

This re-sampling is done without mixing samples just by moving and duplicating them. For OLCI this looks like this near the western edge of the recording swath:

Sentinel-3 OLCI pixel duplication near swath edge

For SLSTR things are more complicated due to a more complex scan geometry. Here the re-sampling also results in samples being dropped (so-called orphan pixels) when several original samples would occupy the same pixel in the product grid.
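To make the move-and-duplicate principle concrete, here is a toy one-dimensional sketch in Python – not the actual ESA resampling code, just the behaviour described above: samples are snapped to their nearest product grid cell without mixing, surplus samples become orphans and empty cells are filled by duplicating a neighbour.

```python
def to_product_grid(samples, grid_size):
    """Snap 1D samples onto a regular grid by moving/duplicating them.

    samples: list of (position, value) with position in [0, grid_size).
    Returns the filled grid plus the 'orphan' values that were dropped
    because their cell was already taken.
    """
    grid = [None] * grid_size
    orphans = []
    for position, value in samples:
        cell = round(position)
        if grid[cell] is None:
            grid[cell] = value
        else:
            orphans.append(value)  # dropped: cell already occupied
    # empty cells duplicate the value of the nearest occupied cell
    occupied = [j for j, v in enumerate(grid) if v is not None]
    for i in range(grid_size):
        if grid[i] is None:
            grid[i] = grid[min(occupied, key=lambda j: abs(j - i))]
    return grid, orphans

# widely spaced samples cause duplication, closely spaced ones orphans
grid, orphans = to_product_grid([(0.0, 'a'), (2.2, 'b'), (2.4, 'c'), (4.0, 'd')], 5)
print(grid, orphans)  # ['a', 'a', 'b', 'b', 'd'] ['c']
```

Where the source samples are spaced further apart than the grid cells – as near the swath edge shown above – values get duplicated; where the scan geometry packs several samples into one cell, all but the first are dropped as orphans.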

I have been trying to find an explanation in the documentation for why this is done and have also thought about possible reasons. The only real advantage is that images in this grid are free of large scale distortion, which is of course helpful when you work with the data in that coordinate system. But normally you will use a different, standardized coordinate system that is not tied to the specific satellite orbit, and then this re-sampling is at best a minor nuisance, at worst the source of serious problems.

Getting the data out

So what do you need to get the data out of this strange view specific product grid? For this you need the supplementary data – the geolocation files of the OLCI packages and geodetic_*.nc for SLSTR. These files contain what is called geolocation arrays in GDAL terminology. You can use them to reproject the data into a standard coordinate system of your choosing. However the approach GDAL takes for geolocation array based reprojection is based on the assumption that there is a continuous mapping between the grids. This is not the case here and this leads to some artefacts and suboptimal results. There is a distinct lack of good open source solutions for this particular task despite this being a fairly common requirement so i have no real recommendation here.

What you get when you do this is something like the following. This is in UTM projection.

Sentinel-3 OLCI image reprojected to UTM

To help understand what is going on here i also produced a version in which only those pixels in the final grid that contain a source data sample are colored. Here are two crops from this, one near the nadir point and one towards the western edge of the swath:

Point by point reprojection near nadir

Point by point reprojection near swath edge

This shows how sample density and therefore resolution differs between different parts of the image and how topography affects the sample distribution. You can also see that the image is combined from the views of several cameras and that the edges between them interrupt the sample pattern a bit. In terms of color values these edges between the image strips are usually not visible by the way; you can sometimes see them on very uniform surfaces or as discontinuities in clouds.
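The point-by-point illustrations above come down to a simple forward mapping: each source sample is pushed through its per-pixel geolocation into the target grid, and cells that receive no sample stay empty. The sketch below uses a plain linear lon/lat scaling in place of a real map projection (a real workflow would project to UTM with pyproj or GDAL) and is only meant to show why gaps appear where the sample density is low.

```python
def forward_map(values, lons, lats, west, north, cell):
    """Forward-map source samples into a regular target grid.

    values, lons, lats are equally long sequences (one entry per source
    sample).  Returns a dict {(col, row): value} holding only the cells
    that actually received a sample - the 'colored' pixels above.
    """
    target = {}
    for value, lon, lat in zip(values, lons, lats):
        col = int((lon - west) / cell)
        row = int((north - lat) / cell)
        target[(col, row)] = value  # later samples overwrite earlier ones
    return target

# three samples spaced wider than the cell size: two target cells
# in between stay empty, just like near the swath edge above
sparse = forward_map([1, 2, 3], [0.05, 0.15, 0.45], [9.95, 9.95, 9.95],
                     west=0.0, north=10.0, cell=0.1)
print(sorted(sparse))  # [(0, 0), (1, 0), (4, 0)] - cells (2,0), (3,0) are gaps
```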

Apart from longitude and latitude grids the files also contain an altitude grid. One can assume this is based on the same elevation data that is used to transform the view based product grid into the earth centered geographic coordinates. So let’s have a look at this. For the Himalaya-Tibet area this looks like SRTM with fairly poor quality void fills – pretty boring, but all right.

Altitude grid in the Himalaya region

But in polar regions this is more interesting.

Altitude grid at the Antarctic peninsula

Altitude grid in Novaya Zemlya

Not much more to say about this except: Come on, seriously?

So far we have looked at the OLCI data. The SLSTR images are quite similar but there are of course separate geolocation arrays for the nadir and oblique images. However it looks like the geolocation data here is faulty. What you get when you reproject the SLSTR data based on the geodetic_*.nc files is something like this:

Coordinate grid errors leading to incorrect resampling

The problem seems to manifest primarily in high relief areas so it is likely that users only interested in ocean data never notice it. There is also a separate low resolution ‘tie point’ based georeferencing data set and it is possible that this does not exhibit the same problem. I would also be happy to be proven wrong here of course – although this seems unlikely.

Original product grid (left), longitude component of the coordinate grid (center) and reprojected image (right)

The observed effect is much stronger in the oblique view imagery by the way.

SLSTR geolocation error in oblique view (left: product grid, right: reprojected)

Overall this severely limits evaluation of the SLSTR data. What can be said so far is that the data itself otherwise looks pretty decent. There seems to be a small misalignment (a few hundred meters) between the VIS/NIR and the SWIR data, visible in the following examples in the form of color fringes. It exists in both the nadir and the oblique view data sets. All of the SLSTR images in the product grid look kind of jaggy due to the way they are produced by moving and duplicating pixels, combined with the curved scan geometry of the sensor.

Sentinel-3 SLSTR example in false color infrared (left: nadir, right: oblique), click for full resolution

Sentinel-3 SLSTR false color example from the Alps (left: nadir, right: oblique)

The supplemental oblique view image is an interesting feature. Its primary purpose is most likely to facilitate better atmosphere characterization by comparing two views recorded with different paths through the atmosphere. Another reason is that, just like with the laterally tilted view, you avoid specular reflection of the sun itself. You might observe when comparing the two images that the oblique view is usually brighter. This has several reasons:

  • The path through the atmosphere is longer so more light is scattered.
  • Since it is a rear view it is recorded slightly later with a slightly higher sun position.
  • Since the rear view is pointed northwards on the day side it is looking at the sunlit flanks of mountains on the northern hemisphere – on the southern hemisphere it’s the other way round.

Other possible applications of the oblique view could be cloud detection and BRDF characterization of the surface. Note that since the earth surface is much further away from the satellite in this configuration, the resolution of the oblique images is overall significantly lower.

So far we had an overall look at the data and how it is structured, the main characteristics and how it can be used. In the third part of this review i will make some comparisons with other image sources. Due to the mentioned issues with the SLSTR geolocation grids this will likely focus on the OLCI data only.

Here are a few more sample images. The first one nicely illustrates the asymmetric view, with sun glint on the far right near its southern hemisphere summer maximum, in northeast Australia.

And here some more samples of various areas from around the world.


December 2, 2016
by chris

Sentinel-3 – a first look at the data, part 1

When i recently wrote about the start of public distribution of Sentinel-3 data i indicated that i was going to write about this in more detail once SLSTR data was also available. That happened several weeks ago already but it took somewhat longer to work through this than originally anticipated – although based on the experience with Sentinel-2 this is not really that astonishing.

This first part is going to be about the basics, the background of the Sentinel-3 instruments and the form the data is distributed in. More specific aspects of the data itself will follow in a second part.

On Sentinel-3

I already wrote a bit about the management side of Sentinel-3 when reporting on the data availability. Here are some more remarks from the technical side. While Sentinel-1 and Sentinel-2 are both satellite systems with a single instrument, Sentinel-3 features multiple different and independent sensors. Sentinel-3 is quite clearly a partial follow-up to the infamous Envisat project – something that is not too prominently advertised by ESA because it is a clear reminder of that corpse still lying around in the basement above our heads. Sentinel-1 could be understood as a successor for the ASAR instrument on Envisat while most other Envisat sensors are reborn on Sentinel-3. I will here only cover the OLCI and SLSTR instruments which are related to the MERIS and AATSR systems on Envisat.

OLCI and SLSTR are included in my recent satellite comparison chart – here an excerpt from that in comparison with other similar systems:

All of these are on satellites in a sun synchronous polar orbit and record images continuously, or at least for the day side, in resolutions between about 1000m and 250m. AVHRR is still running on four satellites but is to be replaced in the future with VIIRS (which currently runs on one prototype satellite). GCOM-C SGLI is still in planning and not yet flying. Apart from spectral and spatial resolution the most important characteristic of such satellites is the time of day they record. Equator crossing time for Sentinel-3 is 10:00 in the morning, which is closest to MODIS-Terra at about 10:30 (the same is planned for GCOM-C). Since VIIRS is – both on Suomi NPP and on future JPSS satellites – going to cover a noon (13:30) time frame, it could be that if MODIS-Terra ceases operations – which is not unlikely at more than 16 years of age – Sentinel-3 will be the only public source for morning time frame imagery in this resolution class.

So much for context. Now regarding the instruments, in short: OLCI is a visible light/NIR camera with 300m resolution in all spectral bands and SLSTR covers the longer wavelengths, starting from green, with 500m resolution (for visible light to SWIR) and 1000m resolution (for thermal IR). This somewhat peculiar division with overlapping spectral ranges comes from the MERIS/AATSR heritage.

The most unusual aspect about the spectral characteristics is that OLCI features quite a large number of fairly narrow spectral bands at a relatively high spatial resolution of 300m.

OLCI records a 1270km wide swath. This means it takes three days to achieve full global coverage. For comparison MODIS covers 2330km and takes two days for full global coverage while VIIRS with a 3000km swath achieves full daily coverage. With the second Sentinel-3 satellite planned for next year OLCI will offer a combined coverage frequency comparable to MODIS. That being said OLCI is nonetheless very different because it looks slightly sideways. To cover 2330km from 705km altitude MODIS has a viewing angle of 55 degrees to both sides. OLCI on Sentinel-3 looks about 22 degrees to the east and 46 degrees to the west. In other words: it faces away from the morning sun coming from the east to avoid sun glint. This also means the average local time of the imagery is somewhat earlier than what is indicated by the 10:00 equator crossing time.
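The quoted viewing angles can be roughly cross-checked with a flat-earth approximation: ground distance ≈ altitude × tan(view angle) to each side. The ~815km orbit altitude used below is an assumed value, not taken from the text, and the flat tangent underestimates the swath because earth curvature widens the ground footprint.

```python
from math import tan, radians

# Flat-earth lower bound for the OLCI swath width from the quoted
# view angles (22 degrees east, 46 degrees west).  The ~815 km
# Sentinel-3 orbit altitude is an assumption here.
altitude_km = 815
swath_km = altitude_km * (tan(radians(22)) + tan(radians(46)))
print(round(swath_km))  # ≈ 1173 – earth curvature brings the real value up towards the quoted 1270 km
```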

The SLSTR instrument is even more peculiar. It offers two separate views: one similar to and overlapping with the OLCI view and 1400km wide, the other narrower (only 740km wide) and tilted not to the west like the others but backwards, along the orbit of the satellite.

So what you get from OLCI and SLSTR together is three separate data sets:

  • The OLCI imagery, 1270km wide, view tilted 12.6 degrees to the west
  • The SLSTR nadir images, 1400km wide, likewise tilted to the west (so the ‘nadir’ is somewhat misleading) and fully overlapping with the OLCI coverage. The additional extent goes to the east so the SLSTR view is somewhat less tilted.
  • The SLSTR oblique/rear images, 740km wide, not tilted sideways

Here is a visual example of how this looks (with SLSTR in classic false color IR):

Getting the data

But we are getting ahead of ourselves here. The public Sentinel-3 data is available on the Sentinel-3 Pre-Operations Data Hub. What i am writing about here is the level 1 data currently released there. I am also not discussing the reduced resolution version of the OLCI data, just the full resolution OLCI and SLSTR data.

When you grab an OLCI data package from there it has a file name like this:

If you have read my Sentinel-2 review that probably looks painfully familiar. Excessively long, with three time stamps, just in a different order: the first two time stamps are the recording start and end, the third one is the processing date. The footprints shown in the download interface like here:

are pretty accurate – at least after they fixed the obvious flaws with footprints crossing the edges of the maps that existed in the first weeks. Don’t get confused when you frequently get every result twice (with different UUIDs but otherwise identical).

In the above file name i marked in color the components that are actually useful. That is the date of acquisition, the relative orbit number and the along-track coordinate. The way the data is packaged is quite similar to Landsat (which is somewhat ironic since Sentinel-2 data is packaged much less like Landsat). The relative orbit number is like the Landsat path – just in order of acquisition rather than spatial order – and the along-track coordinate is a bit like the Landsat row. Cuts along the track usually seem to be at roughly the same positions and are made so that individual images have roughly square dimensions – but not precisely, so numbers vary slightly from orbit to orbit. A number of other things are good to know when you look for data:

  • OLCI data is only recorded for the day side and only for areas with sun elevation above 10 degrees (or zenith angles of less than 80 degrees). This is quite restrictive, especially if you consider the early morning time frame recorded. Right now (end of November) that limit is at about 60 degrees northern latitude. MODIS records down to much lower sun positions (at least 86 degrees zenith angle AFAIK).
  • SLSTR is recorded continuously so you have descending and ascending orbit images. Only the thermal IR data is really useful for the night side of course.
  • The packages have no overlap in content.
  • Due to the view being tilted to the west in the descending orbit, i.e. to the right, images of both OLCI and SLSTR cover the north pole but not the south pole. The southern limit of OLCI coverage is at about 84.4 degrees south, SLSTR coverage ends at 85.8 degrees.
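The roughly 60 degrees latitude limit mentioned in the list above can be checked with basic solar geometry: at local solar noon the sun elevation is approximately 90° − latitude + declination (northern hemisphere). The late November declination of about −21° is an assumed value, and since OLCI actually records mid-morning rather than at noon this is only a rough estimate:

```python
# Latitude at which the noon sun elevation drops to the 10 degree
# OLCI recording cutoff, for an assumed late November solar
# declination of about -21 degrees.
declination_deg = -21.0
cutoff_elevation_deg = 10.0
max_latitude_deg = 90.0 - cutoff_elevation_deg + declination_deg
print(max_latitude_deg)  # 59.0 - matching the 'about 60 degrees' above
```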

In the example shown the next package along the orbit is

- identical except for the times and the along-track coordinate.

Package content

When you unpack the zip you get a directory with the same base name and a ‘.SEN3’ at the end. When we look inside we might be positively surprised since we find only 30 files. For SLSTR that increases to 111 but this is still quite manageable compared to Sentinel-2. Of course – like with Sentinel-2 – since the data is in compressed file formats, packaging everything in a zip package is still very inefficient.

Image data is in NetCDF format. GDAL can read NetCDF, and it can also read Sentinel-3 NetCDF, but at least for the moment this is limited to basic reading of the data; any higher level functionality you need to take care of by hand. I assume GDAL developers will at some point add support for the specific data structures of Sentinel-3 data but that might take some time.

What you have in the OLCI package is

  • One xml file xfdumanifest.xml with various metadata that might be useful, including for example also a footprint geometry.
  • 21 NetCDF files with names of the form Oa?? with ‘??’ being the channel number for the 21 OLCI spectral channels. These contain the actual image data.
  • Eight additional NetCDF files with various supplementary data.

It is great to have only 30 different files here so i am not complaining, but the question is of course: why not put all the data into one NetCDF file and save the need for a zip package? This is how MODIS data is distributed for example (in HDF format but otherwise the same idea). I am just wondering…

In the SLSTR packages there is a bit more stuff but overall it is quite similar:

  • One xml file xfdumanifest.xml with metadata.
  • 34 NetCDF files for the image data of the 9+2 spectral bands with names of the form S?_radiance_[abc][no].nc or [SF]?_BT_i[no].nc with ‘?’ being the channel number. The ‘n’ or ‘o’ at the end indicates the nadir or oblique view as described above. ‘a’, ‘b’ or ‘c’ indicates different redundant sensors for the SWIR bands or a combination of them.
  • For each of the image data files there is a quality file with ‘quality’ instead of ‘radiance’ or ‘BT’ in the file name.
  • The other files are additional NetCDF packages with supplementary data – like with OLCI but different file names. Most of them are in different versions for the n/o and a/b/c/i variants.

Now if you pick the right ones from the 21 OLCI image data files and assemble them into an RGB image, which is fairly straightforward with GDAL, you get – with some color balancing afterwards – something like this:

That might look pretty reasonable at first glance but it is actually quite strange that the data comes in this form. More on that and other details of the data will come in the second part of this review.
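As for the ‘color balancing afterwards’ – a simple per-band contrast stretch already goes a long way. The following is a pure Python stand-in for what one would apply to the radiance arrays read via GDAL; the 2/98 percentile choice is an arbitrary example, not anything prescribed by the data.

```python
def stretch_band(values, lo_pct=2, hi_pct=98):
    """Linear contrast stretch: map the lo..hi percentile range of a
    band to 0..255, clamping everything outside."""
    ordered = sorted(values)
    lo = ordered[int(len(ordered) * lo_pct / 100)]
    hi = ordered[min(int(len(ordered) * hi_pct / 100), len(ordered) - 1)]
    span = (hi - lo) or 1  # avoid division by zero on flat bands
    return [max(0, min(255, round(255 * (v - lo) / span))) for v in values]

band = list(range(100))  # stand-in for one band's radiance values
stretched = stretch_band(band)
print(stretched[2], stretched[98])  # 0 255
```

Applying this to each of the three chosen bands and stacking them gives the kind of image shown above.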

While you wait, here are a few more image examples:


November 22, 2016
by chris

Parting the waters

At the recent Hack Weekend in Karlsruhe i made some progress on a project i had been contemplating for quite some time, and with further work in the last weeks first results can now be presented.

The main motivation for this was the problem of representing waterbodies in the OpenStreetMap standard style at low zoom levels. For a long time the standard OSM map has been showing the coastlines at all zoom levels, based on data processed by Jochen and me, but other water areas only on zoom level 6 and higher. The reason for this limitation is that rendering them at the lower zoom levels in the same way as at the higher zoom levels would be very expensive in terms of computing resources.

Various solutions – or better: workarounds – have been proposed for this problem:

  • Using a different low detail data set for the low zoom levels – this is the usual lazy approach taken by many maps but it gives poor results regarding accuracy and cross-scale consistency. Commonly used data sets for this purpose like Natural Earth are often of exceptionally bad quality by today’s standards.
  • Applying aggressive polygon size filtering, i.e. only rendering the largest geometries – an approach that is not advisable because of the way water areas are mapped in OSM, and one that would also be highly biased and distorting.
  • Tagging large or important waterbodies in a different way, either as coastline or with a newly created tag – of course manipulating the database to address some technical shortcomings of the rendering system is not a very good idea.

Generally speaking my techniques for geometric generalization of map data already solve this problem, but the subjective choices involved make using such an approach a sensitive matter. And a decent generalization of inland waterbodies is not really possible without a structural analysis of the river network, which is a time-consuming endeavour that cannot easily be performed on a daily basis. So the approach had to be less expensive and also more conservative and neutral in its results. The solution i have now implemented has been on my mind for quite some time but until recently i never really found the time to fully work it out.

Example of waterbody rendering using the new technique

Looking back at things now makes me realize that what ultimately came out of this project is actually fairly peculiar from a technical perspective. This is likely useful for many people who render digital maps at coarse scales – but the fact that this approach makes sense also says something fairly profound about the way we currently render maps and the limits of these methods.

If this sounds confusing you can read up on the whole background story – there you can also find links to the (for the moment still somewhat experimental) processed data.

The implementation of the technique introduced there is also available.

November 11, 2016
by chris

Satellite comparison update

When writing about satellite images here i concentrate on open data imagery, but i take the launch of a new commercial earth observation satellite by DigitalGlobe earlier today as an opportunity to update my previously shown satellite comparison chart.

In addition to WorldView-4 i added the Terra Bella SkySat constellation. These satellites feature an interesting sensor concept and are meant to offer the currently unique ability to record video. There is however very little publicly known about operational plans for this system.

Also added is a column with the daily recording volume in square kilometers for the different systems. The capability most frequently advertised for satellites, in addition to the spatial resolution, is the revisit frequency, which indicates how often a satellite can record images for any single point on the earth surface. A revisit interval of one day however does not mean the satellite system can record daily coverage of the whole earth surface. The recording volume describes what area can actually be covered on a daily basis. There are two variants: the potential recording volume (indicated in red), which is often a more or less theoretical value, and the practical recording volume recorded in actual operations (in blue). In the case of the commercial satellites both are of course based on claims by the operators.

For the high resolution Open Data systems the numbers are determined based on average actual recordings. For Sentinel-2 this is a bit difficult so i based it on the average of 14 minutes of recording per orbit specified in recent mission status reports. For the low resolution continuously recording systems numbers are – for compatibility – based on half orbit recordings, although the longwave bands are of course recorded on the night side as well.

Generally the marketing departments of satellite operators often seem to have a lot of fun twisting such numbers to make their systems look better. One thing to keep in mind here as reference: the earth land surface is about 150 million square kilometers. Landsat 8 records this almost fully (except for areas beyond 82.66 degrees latitude) every 16 days – but just barely. To do this it records approximately 25 million square kilometers every day, or 400 million square kilometers in the 16 day global coverage interval. Now there are a couple of night side scenes included in that, as well as quite a bit of ocean surface due to the width of the recording swath and the constraint of recording fixed WRS2 tiles. But it still illustrates that you need to record much more than just the 150 million square kilometers to actually cover the land surfaces in full, primarily because of the inevitable overlaps resulting from the orbit and recording footprint geometries.
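Spelled out, the arithmetic from the paragraph above (all figures are the approximate values quoted there):

```python
# The Landsat 8 numbers from the paragraph above, spelled out.
land_surface_km2 = 150e6   # approximate earth land surface
per_day_km2 = 25e6         # approximate daily Landsat 8 recording
cycle_days = 16
cycle_km2 = per_day_km2 * cycle_days
print(cycle_km2 / 1e6)                          # 400.0 million km2 per coverage cycle
print(round(cycle_km2 / land_surface_km2, 1))   # ≈ 2.7 times the land surface area
```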


October 28, 2016
by chris

The African glaciers 2016

To those who are not so keen on satellite image related topics: i am sorry for the recent focus on that matter – there will be other subjects again in the future. But for now i have here another imagery related post – about the glaciers of Africa.

Readers familiar with glaciers probably know there are three areas on the African continent with glaciation – all in the inner tropics. Glaciers in the tropics are very special since there is not such a clear seasonal pattern of winter snow accumulation and summer melt as at higher latitudes. All African glaciers have shown a significant retreat during the last century to a tiny fraction of their original size and will likely vanish completely within the next fifty years.

The least extensive glaciation exists on Mount Kenya.

The Rwenzori Mountains at the border between Uganda and the Democratic Republic of the Congo feature somewhat more extensive glaciers due to the much wetter climate despite these mountains being the least tall of all three. Formerly glaciers were found on several peaks in the area but now they are mostly limited to the highest areas on Mount Stanley.

And finally the best known and tallest glaciated mountain in Africa is Mount Kilimanjaro. A hundred years ago most of the main caldera was still covered by ice while now there are only a few patches left. Due to the altitude, glacier retreat on Kilimanjaro has very little to do with climate warming and more with decreasing amounts of snowfall and increasing sun exposure from sunnier weather.

All three images based on Copernicus Sentinel-2 data.

October 26, 2016
by chris

I could have told you…

Short addition to the Sentinel-2 packaging and ESA API matter – three weeks ago i mentioned that

[...] Instead of browsing 200-300 packages per day you now have to deal with many thousands. This means the only feasible way to efficiently access Sentinel-2 data on a larger scale is now through automated tools.

And a few days ago:

Since ESA does not provide a bulk metadata download option you have to scrape the API for [obtaining bulk metadata].

Today ESA announced that they think the recent API instability is largely caused by users attempting to retrieve extremely large numbers of results from queries, and as a result they are limiting the maximum number of results returned by each query to 100. Three comments on that:

  • i could have told them that three weeks ago (maybe i should go into business as an oracle – although it really does not take a genius to predict this).
  • i doubt it will help much – it will create additional work for programmers around the world, who now have to implement automatic followup queries to deal with the 100 rows limit, but in the end you will still have to serve out all the results. After all hardly anyone does these queries for fun, and there are now more than 160k entries to query – growing by about 4000-5000 every day (even though they now lag a week behind with publication of Sentinel-2 imagery). Systems like this, which obviously do not scale to the level they are used at, fail at the weakest link. But fixing or protecting this point of failure does not mean the system will not fail at another place. If the whole download framework goes down just because of a few unexpected but perfectly valid queries, that is an indication of a much more fundamental design problem.
  • even if it is so last century now that we all live in the cloud – offering bulk metadata download with some kind of daily/hourly diffs for updates might not be such a bad idea.
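The followup-query pattern the 100 rows limit forces on client code boils down to a generic pagination loop. In the sketch below, fetch(offset, limit) stands in for one API request (for the OData API that would be a query with $skip/$top parameters); the function name and signature are illustrative, not part of any real client library:

```python
def fetch_all(fetch, page_size=100):
    """Collect all results from a paged query interface.

    fetch(offset, limit) must return a list of at most `limit` results
    starting at `offset`; a short page signals the end of the results.
    """
    results, offset = [], 0
    while True:
        page = fetch(offset, page_size)
        results.extend(page)
        if len(page) < page_size:  # short (or empty) page: we are done
            return results
        offset += page_size

# demo against a fake in-memory 'API' holding 250 entries
entries = list(range(250))
fetched = fetch_all(lambda off, lim: entries[off:off + lim], page_size=100)
print(len(fetched))  # 250, retrieved in three requests
```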

By the way (in case you wonder) – it was not me. When writing my scripts for the coverage analysis i was already working with a hundred rows limit; this was the documented limit for the OData API anyway so it seemed appropriate to also use it in general.


October 22, 2016
by chris

Sentinel-2 – the first year

Sentinel-2 is an earth observation satellite mission producing open data imagery, the first satellite of which was launched in June last year and for which the data started to become available in November last year. The Sentinel-2 design is quite similar to Landsat and with a slightly higher spatial resolution of 10m it produces the highest resolution open data satellite imagery available at the moment.

This is to be a review of the image coverage generated by Sentinel-2 during the first year of operation. But for reference i will start with Landsat.


Landsat 8 has now been operating for more than three years and as indicated before it is essentially recording a relatively bias-free global coverage of the land surfaces at a 16 day interval now. To illustrate, here is a plot of last year’s day side (descending orbit) acquisitions. This is generated by plotting the WRS2 footprints in a color representing the number of scenes available. This is fairly easy to do using the bulk metadata packages made available by the USGS.

The timeframe is not chosen arbitrarily: a mid October cut includes exactly one full northern hemisphere and one full southern hemisphere summer season.

As you can see, for low latitudes the map is quite uniformly colored, indicating 21-23 acquisitions on most major land areas, corresponding to the 16 day interval. At high latitudes not all possible slots are recorded but, as the denser lines already hint, this does not necessarily mean less frequent image coverage. To better show the actual image recording frequency here is a different visualization indicating the per-pixel image count for all pixels that include land areas.

Here you can see that the high latitudes are actually more frequently covered in most cases – which is not a bad choice since cloud incidence here is often higher than in the subtropics.

A few interesting observations can be made here. First you can see the off-nadir recordings in northern Greenland and the Antarctic extending the coverage and reducing the area not covered at all, which is drawn in blue. That area is the central Antarctic of course, but there are also two pixels further north that are blue:

  • Rockall – which is not really large enough to be recorded on Landsat imagery in any meaningful way.
  • Iony Island – which is a clear omission since other smaller islands elsewhere are specifically covered by Landsat.

Here this island on a Sentinel-2 image:

What Landsat does not see – Iony Island by Sentinel-2

Apart from these the lower latitudes are recorded at every opportunity so there is not much that can be improved here. At the higher latitudes there is however still quite a bit of focus on some areas and neglect of others. In particular the islands of the Kara Sea and the East Siberian Sea as well as Bear Island and Hopen south of Svalbard are significantly less often covered than other areas at the same latitude. This is of course in a way a matter of efficiency since recording a scene containing only a small island is less useful overall than recording one that fully or mostly contains land surfaces.

For comparison and completeness here are the same illustrations for the previous years:

year   day        night   day pixel coverage
2014   LS8, LS7   LS8     LS8
2015   LS8, LS7   LS8     LS8
2016   LS8, LS7   LS8     LS8


Now to the main subject, the first year of Sentinel-2. Public availability of images started in November but quite a few images were already recorded earlier and made available over time so i will use the same end cutoff as for Landsat (mid October) and an open beginning and call it a full year. This results in a bit of a bias towards the northern hemisphere since summer images for it are included for both 2015 and 2016.

Technically doing this for Sentinel-2 is much harder than for Landsat for multiple reasons:

  • Since ESA does not provide a bulk metadata download option you have to scrape the API for this.
  • The API – in addition to having been very unreliable recently – does not always return all the matching packages for a query, at least not the search API; the OData API is different.
  • Not all packages in the ESA database have footprint data.
  • Packages extending over the 180 degree meridian have invalid footprints.
  • As mentioned in my initial Sentinel-2 review the scene footprints of the early data sets were mostly bogus. ESA fixed that in July but many of the earlier packages have not yet been replaced.
  • Some early packages from 2015 have incorrect relative orbit numbers.
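To give an idea what scraping involves: it amounts to paginating OpenSearch-style queries until a page comes back empty. Here a minimal sketch – the endpoint and parameter names are assumptions modeled on the public Scientific Data Hub interface, not verified here:

```python
from urllib.parse import urlencode

# Hypothetical endpoint modeled on the public Scientific Data Hub
# OpenSearch interface; treat the URL and parameter names as assumptions.
SEARCH_URL = "https://scihub.copernicus.eu/dhus/search"

def page_url(start=0, rows=100, platform="Sentinel-2"):
    """Build the URL for one page of search results."""
    params = {
        "q": f"platformname:{platform}",
        "start": start,   # offset of the first result on this page
        "rows": rows,     # page size - a server-side maximum applies
        "format": "json",
    }
    return SEARCH_URL + "?" + urlencode(params)

# Scraping everything then means fetching page_url(0), page_url(100), ...
# with authentication until a page comes back empty - and, given the
# reliability problems mentioned above, retrying failed requests.
```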

All of this together means quite a lot of effort is required to generate a halfway accurate coverage analysis. The following still has quite a few limitations:

  • It only includes descending orbit images. Since Sentinel-2 does not have thermal infrared bands nighttime acquisitions are fairly rare anyway – so because of the complications i left them out.
  • Images with missing footprint data or wrong relative orbit numbers are excluded (a few hundred overall).
  • Images with old style bogus footprints were clipped to the recording swath limits but not in orbit direction. Coverage is therefore systematically overestimated, in particular in areas with relatively fine grained acquisition patterns. This can be seen well in the smaller green areas in America and Asia, which in reality are significantly smaller than indicated by the illustration here.

Still overall this should give a fairly precise image of the coverage patterns and recording priorities of Sentinel-2 during the first year of operations:

As is already widely known, Sentinel-2 operations do not aim for a globally uniform coverage. The focus is – at least for the last year – on Europe, Africa and Greenland. With a 10 day revisit interval Sentinel-2 could theoretically record about 35 images per year for any single spot on the earth surface – although volume-wise probably not for all the land areas at the same time. Understandably during the first year it did not reach this anywhere outside the orbit overlaps, although for the priority areas (Europe and Africa) it did match or even exceed the recording frequency of Landsat. Elsewhere however recordings were more sporadic and the smaller focus areas seem fairly arbitrary, probably meant to cater to certain vested interests. This is particularly striking in the Antarctic, where the interior is mostly uncovered but a smaller, mostly featureless area on the East Antarctic plateau has been repeatedly recorded for some reason.

Apart from the Antarctic and northernmost Greenland, completely uncovered areas are mostly smaller islands.

What the image does not tell is how well time slots are managed in the areas where images are not recorded at every opportunity, in particular with regard to cloud cover. With Sentinel-2 the area where this applies is much larger (i would estimate about two thirds of the land surfaces, compared to about one third with Landsat). My gut feeling is that the USGS is doing better here – which would not be surprising considering the much longer experience. It might be interesting to compare the cloud cover estimates of the images, although these are fairly unreliable and produced with different methods, so in the end probably not really that meaningful. And as mentioned before the larger image size of Sentinel-2 also means cloud cover based scheduling is more difficult here.


So what is to be expected for the next year? For Landsat not much is likely going to change. For Sentinel-2 the current statement is

Sentinel-2A is acquiring Europe, Africa and Greenland at 10 days revisit, while the rest of the world land masses defined in the MRD are mapped with a 20 days revisit time.

As indicated previously such statements, especially regarding the MRD, have to be taken with a grain of salt. The above would mean a significant increase in the recording frequency overall as well as a more uniform global coverage – the past year as illustrated above showed in parts more than a 1:2 difference between Europe and Africa on the one side and America and Asia on the other.


October 20, 2016
by chris

Sentinel-3 data – better late than never

In February i reported the launch of the Sentinel-3A satellite and now the data is finally becoming available to the public.

According to the published schedule we now get the OLCI level 1 data, the SLSTR level 1 data is going to follow in November and higher level products are indicated for some time next year.

Color rendering of one of the first public Sentinel-3 OLCI data sets

Some readers who follow earth observation satellite deployments and operations might be astonished since Sentinel-3 images have been shown quite frequently by various parties for months already. This is because since May data has already been made available to so called expert users. None of the satellite image experts i know of is apparently part of this illustrious circle – indicating the expertise to qualify for this is different from what is usually understood as being an expert. This data was called ‘sample data’ but unlike normal sample data this was produced on a regular basis for the whole time – a somewhat creative use of the term ‘sample’.

This whole procedure is quite remarkable since the regulatory requirement of the Copernicus program is clearly that all data from the Sentinel satellites will be made available to everyone without access restrictions beyond basic registration. By declaring the satellite – or more precisely the data processing system since the satellite passed its in-orbit commissioning review in July – not yet operational this requirement is apparently circumvented.

You might call this paranoia on my part but the signs that a lot of influential people around the Copernicus program are largely uncomfortable with the whole open data aspect are quite visible in many things. For example look at the dates for the different Sentinel satellites launched to date – there is a clear trend visible:

  • Sentinel-1A: Launch 3 Apr 2014, public data access since 9 May 2014 (1 month)
  • Sentinel-2A: Launch 23 Jun 2015, access since end November 2015 (5 months)
  • Sentinel-1B: Launch 25 Apr 2016, access since 26 Sep 2016 (5 months)
  • Sentinel-3A: Launch 16 Feb 2016, partial access since 20 Oct 2016 (8 months)
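The delays listed above can be checked directly from the dates (the Sentinel-2A access date of "end November" is approximated as November 30 here):

```python
from datetime import date

# Launch and public data access dates as listed above.
satellites = {
    "Sentinel-1A": (date(2014, 4, 3), date(2014, 5, 9)),
    "Sentinel-2A": (date(2015, 6, 23), date(2015, 11, 30)),  # "end November"
    "Sentinel-1B": (date(2016, 4, 25), date(2016, 9, 26)),
    "Sentinel-3A": (date(2016, 2, 16), date(2016, 10, 20)),
}

for name, (launch, access) in satellites.items():
    months = (access - launch).days / 30.44  # mean month length in days
    print(f"{name}: {months:.1f} months")
```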

Of course the first data released for Sentinel-1A was highly experimental and there were massive changes in the whole data distribution system afterwards. But that’s natural considering the lack of experience with public data distribution.

In defense of those in charge – the volume of data that needs to be made available for download is quite significant. For Sentinel-2 there are about 200-300 ‘scene’ packages per day (the old 300km packages, not the new single granule ones) which – with an assumed average size of 5GB – amounts to about 1-1.5TB per day. For Sentinel-3 estimates are 28.5GB (OLCI) and 44.5GB (SLSTR) per orbit, which amounts to more than 1TB per day for the basic level-1 data – not counting higher level products or the different near-real-time and long term versions. And the near-real-time product is supposed to be made available within three hours of recording. Given the difficulties that already showed regarding reliability of the data distribution with Sentinel-2 it is not really astonishing they are not too eager to open this fragile infrastructure to the public with even more data. But the resources that go into the public data distribution component of the Copernicus program of course reflect the importance this has for those in charge – and it is not that the requirement to scale to this level was not already clear several years ago.
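The quoted figures are easy to cross-check, assuming a typical sun-synchronous orbit period of roughly 100 minutes:

```python
# Cross-check of the daily volume figures quoted above.
orbits_per_day = 24 * 60 / 100.0        # ~14.4 orbits at a ~100 min period

# Sentinel-2: 200-300 scene packages per day at ~5 GB each
s2_low, s2_high = 200 * 5, 300 * 5      # 1000-1500 GB/day

# Sentinel-3 level 1: per-orbit estimates from the text
s3_per_orbit = 28.5 + 44.5              # GB, OLCI + SLSTR
s3_per_day = s3_per_orbit * orbits_per_day

print(f"Sentinel-2: {s2_low / 1000:.1f}-{s2_high / 1000:.1f} TB/day")
print(f"Sentinel-3: {s3_per_day / 1000:.2f} TB/day")
```

This indeed lands just above 1TB per day for the Sentinel-3 level-1 data alone.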

The situation might also be related to the fact that Sentinel-3 unlike Sentinel-1 and 2 is operated by EUMETSAT and not ESA (though ESA will distribute parts of the data). I wrote about the lack of an open data culture in ESA before but due to their scientific mission they at least superficially have to try to appear open.

EUMETSAT on the other hand is an intergovernmental organisation for running weather satellites for the European national weather services. Since it is inefficient for each of the countries in Europe to operate their own weather satellites they joined together. The concept of EUMETSAT is that countries joining it get free access to the satellite data for the national weather services for their own use. They are also allowed to license this data to third parties as licensing agents but are bound to the fees set by the EUMETSAT management. So essentially EUMETSAT has two purposes:

  • cost reduction by operating weather satellites together for the national services
  • monetizing the data produced by selling it to commercial users as a cartel

As a result, policies for European weather satellite data are among the most restrictive worldwide, because it is explicit policy of the operator to try to monetize any use of the data for non-governmental purposes – contrasting quite sharply with, for example, the Japanese Himawari 8, which has full free public access to the data.

You might notice the irony in such an organization now being put in charge of operating a satellite with mandatory full open data access. Sentinel-3 does not compete directly with geostationary weather satellites of course, but EUMETSAT is also operating other polar orbiting satellites.

Regarding the data itself – i have not looked at it in detail yet and will probably do a more thorough review later, though likely not before the SLSTR data is also available. Since it comes in a non-standard form (netCDF format without normal georeferencing information) it will be fairly complicated to use with normal tools. What can be said so far is however that what is available is fairly incomplete. Since OLCI covers visible and NIR wavelengths only, night acquisitions are not of much use, but they do not even have full coverage of the sunlit part – compare the footprints:

with the current Terra-MODIS coverage:

The narrower swath and larger gaps near the equator are normal and expected, but the fact that coverage ends fairly early towards the poles is not. Whether this is based on a sun elevation cutoff or some other criterion is unclear to me – just as it is unclear whether this is only a limitation of the data made available or represents the actual acquisitions. This will however likely limit data at high polar latitudes to a very small time frame of just 3-4 months per year.
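Whether a sun elevation cutoff is the criterion is speculation, but a rough solar elevation estimate (ignoring the equation of time and refraction, so good only to a degree or two) shows how such a cutoff would translate into a short usable season at high latitudes:

```python
from math import radians, degrees, asin, sin, cos, pi

def solar_elevation(lat_deg, day_of_year, hour_utc, lon_deg=0.0):
    """Rough solar elevation in degrees - ignores the equation of time
    and atmospheric refraction, accurate only to a degree or two."""
    # Approximate solar declination for the day of the year
    decl = radians(-23.44 * cos(2 * pi * (day_of_year + 10) / 365))
    # Hour angle: 15 degrees per hour from local solar noon
    hour_angle = radians(15 * (hour_utc + lon_deg / 15 - 12))
    lat = radians(lat_deg)
    return degrees(asin(sin(lat) * sin(decl) +
                        cos(lat) * cos(decl) * cos(hour_angle)))
```

At 80 degrees north, for example, the noon sun stays below the horizon around the winter solstice and only reaches modest elevations even at midsummer – so any elevation threshold quickly restricts polar coverage to a few months.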


October 3, 2016
by chris

Who is moving the cheese again?

In the field of open data satellite images there have been a number of changes recently that amount to moving the cheese. These changes relate to Landsat and Sentinel-2 data. Here is, in more detail, what changes plus a few comments on the context.


At the end of September the USGS started changing their primary distribution form for Landsat data to what they call Collections. This essentially means

  • introducing different processing versions in explicit form. So far reprocessing a scene simply replaced the existing package. You could identify the processing version in the scene metadata but not in the package name. Having the processing date in the package identifier makes this more transparent but it also forces data users to handle this.
  • introducing quality assessment levels. This essentially means some scenes are verified to conform with higher quality standards, apparently in particular concerning geometric accuracy. This is indicated in the package identifiers, and available scenes will be prominently classified into these classes in the download interfaces.
  • introducing some additional metadata.

This change is done gradually – right now they are moving to this for Landsat 5 and 7 data and are planning to start with Landsat 8 in November. The whole reprocessing will apparently take several months. The old distribution form will stay available during this period, including new scenes, so there is plenty of time to adjust. And the new distribution form is apparently mostly backwards compatible except for the different file names, so it should be fairly simple to deal with.


Now the Sentinel-2 changes are a whole other story.

First on September 19 ESA turned off the registration free distribution system for Sentinel-2 data. This means to access Sentinel-2 data you now have to register with ESA. Not a big deal, this is an automated registration. But not really that convenient if you casually want to try out the data.

Then in late September they moved data distribution from the previous scene based packaging to single granule packages. This change was pre-announced in early August. As i explained in my initial Sentinel-2 data review the original distribution form for Level 1C Sentinel-2 data (which is the only processing level distributed) was packages each containing a 300km section of the satellite’s recording swath – which is about 290km wide. These packages were – when images are recorded for the full 300km length – usually about 6 to 15 GB in size – depending on latitude since the 300km segment cuts were in latitude direction.

Apart from the larger size (due to both the higher spatial resolution and the larger footprint) and the different internal organization of the packages these were quite comparable to Landsat scenes. But apparently quite a lot of users found the large packages somewhat inconvenient so ESA is moving to distributing single granule packages now. Granule is ESA's term for the 100x100km tiles the data is structured into internally, which correspond to a modified version of the MGRS system. Each package now contains exactly one of these tiles, and the 300km length scenes, which commonly contained about 10-15 of these granules, are history. This might not seem such a large deal – just splitting the same files into several smaller packages, and ESA also announces it as such. But it has quite a few implications:

  • There is quite a bit of added redundancy between the packages since you now download a lot of the metadata and supplementary files 10-15 times, which you previously got only once.
  • Since ESA continues to generate their preview images with individually adjusted tone mapping you now have to deal with even larger and more fine grained arbitrary differences in the preview images. These were already quite difficult to use for assessing image quality and despite the larger size (the previews for a single granule are now the same size as they were for a whole scene) this got worse. On the bright side you can now – with some trickery – approximately geocode the preview images.
  • Although ESA distributes individual granules now they apparently do not consider it necessary to include in the package metadata which granule a package contains. You have a footprint polygon but no information on the MGRS tile or UTM zone.
  • For larger scale data users who do not just deal with individual granules or want to casually see some images for a specific location the ESA download interface (and similarly the various alternative browsing tools around which are based on the ESA structures) are now essentially unusable. Instead of browsing 200-300 packages per day you now have to deal with many thousands. This means the only feasible way to efficiently access Sentinel-2 data on a larger scale is now through automated tools.
  • ESA has already before fairly randomly reprocessed images, resulting in duplicate packages in the archive. I don’t know to what extent this leads to different image data. But with the same happening now on a much more fine grained level this issue is much more acute. Say you download a granule in the latest available processing version and then also need a neighboring granule, but see it is only available in an earlier processing. Will this lead to a difference in data at the edge between the granules?
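The UTM zone at least can be recovered from the footprint. A minimal sketch, applying the standard 6-degree zone formula to the footprint centroid longitude – this deliberately ignores the Norway/Svalbard exceptions and the polar UPS zones, and recovering the full MGRS tile letters would still need a proper MGRS conversion:

```python
def utm_zone(lon):
    """UTM zone number (1-60) for a longitude in degrees east,
    using the standard 6-degree zone formula. Ignores the
    Norway/Svalbard exceptions and the polar UPS zones."""
    return int((lon + 180.0) // 6) % 60 + 1
```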

Regarding the preview images – here an example for a preview of one of the old style packages:

old style Sentinel-2 preview image

And here the same area with newly processed data assembled from the previews of the new single granule packages:

assembly of previews of new style packages

Well – it is better than nothing, which is not really a compliment though…

But the story does not end here. Since this change was implemented access to the ESA download systems has been fairly erratic – who would have guessed that offering about 10 times the number of files for download and also serving metadata and query services for all of these puts additional strain on the infrastructure. Today they announced that unreliable and delayed data access will likely continue throughout October.

No matter what the reasons and motives for all of this are – the prospect that Sentinel-2 could turn into a reliable and dependable open data alternative for Landsat just became significantly less likely. Considering the amount of tax money that went into that this is more than just a bit sad.

I am generally inclined – in line with Hanlon’s razor – to put most of this on incompetence. The various obviously not well thought through aspects in the ESA data distribution and tools, like for example the preview image tone mapping, underline this. But the possibility that increasing difficulties of routine use of Sentinel data through the venues available to the general public is actually intentional on some level is not all that far fetched on an overall look.


September 24, 2016
by chris

More images for mapping

Just put up a few additions to the OSM images for mapping:

A number of aerial images from about a year ago from Operation Icebridge overflights of the Thule Airbase (where most of these flights are conducted from). In contrast to the previous images these are partly without snow.

There are more such images from late autumn from different parts of Greenland but they only cover small strips that are fairly insignificant in area compared to the whole country. With still plenty of things to map from the lower resolution imagery there is little use in adding them.

Next is a fairly large image of the Northern and Polar Ural mountains in Russia. This is very badly covered in Bing and Mapbox image layers. This image from Sentinel-2 was taken in August.

This should be helpful in mapping for example lakes, rivers, glaciers, ridges and cliffs – but note that although this is a late summer view not every white patch in it is a glacier. Also visible are roads, settlements and mining areas.

And finally there is a small image of Ushakov Island – not really much to see there but it can be used to update the coastline. Also you can see one of the meltwater lakes on the ice has drained.


September 17, 2016
by chris

Sun glint

In connection with assembly of satellite image mosaics i mentioned the effect of sun glint in satellite images in a recent post and want to elaborate a bit on this here.

Sun glint is a common phenomenon in satellite images – it essentially refers to the specular reflection of the sun on water surfaces. It is more or less the same as the sun reflection you see on a curved glass surface.

From the perspective of an earth observation satellite in sun synchronous orbit with a morning view (both Landsat and Sentinel-2 fall under this) sun glint looks like this:

This image shows a single day’s imagery of the Terra-MODIS camera from May this year. You can see the different orbits of the satellite, each crossing the equator at the same local time, and the small gaps between them. Since the satellite photographs a morning view the sun is in the east and therefore the sun reflection is slightly to the east of the middle of the image swath. Higher resolution satellites like Landsat and Sentinel-2 photograph a much narrower field of view, so sun glint is normally visible as a bright overlay on the water areas increasing in intensity from west to east. Here an example from Sentinel-2:

And here one from Landsat 8:

Since Sentinel-2 has a larger field of view than Landsat and a slightly later local imaging time (10:30 equator crossing time compared to 10:11), sun glint effects as well as their variability across the images are on average significantly stronger than with Landsat. The elongated form of the sun glint in flight direction is caused by the way satellites record images: by scanning lines at a right angle to the direction of flight. The angle between view direction and the earth surface varies along this line, but the view is not tilted backwards or forwards in flight direction. The glint therefore varies strongly at a right angle to the flight path, due to the view direction rotating across the field of view, and changes much more slowly in flight direction, due only to the varying orientation of the curved earth surface towards the sun.
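The underlying specular geometry is simple: glint is strongest where the view direction coincides with the mirror reflection of the sun on the water surface. A minimal sketch of that angle computation, assuming a flat horizontal surface (no waves):

```python
from math import radians, degrees, acos, cos, sin

def glint_angle(sun_zenith, sun_azimuth, view_zenith, view_azimuth):
    """Angle in degrees between the view direction and the mirror
    reflection of the sun off a flat horizontal water surface.
    0 means the sensor looks straight into the glint."""
    ts, tv = radians(sun_zenith), radians(view_zenith)
    dphi = radians(view_azimuth - sun_azimuth)
    # The mirror direction has the sun's zenith angle and the opposite
    # azimuth - hence the minus sign in front of the azimuth term.
    c = cos(ts) * cos(tv) - sin(ts) * sin(tv) * cos(dphi)
    return degrees(acos(max(-1.0, min(1.0, c))))
```

For a morning orbit with the sun in the east, view directions on the eastern side of the swath come closest to this mirror direction – matching the west-to-east brightening described above.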

Because of the way sun glint is affected by the view geometry as described, it is also strongly subject to striping in the images. In most recent earth observation satellites the image sensors are split into several modules, each of which looks either slightly forward or backward, so they are affected somewhat differently by sun glint. Here an example from Sentinel-2 showing how this typically looks.

In the first example from the Canary Islands by the way striping is not visible because this image is taken near the maximum of the sun glint and therefore the forward and backward looking sensor modules record more or less the same amount of glint.

Sun glint is generally strongest at the latitude where the sun is highest, which varies with the seasons, and it falls off north and south of this. During mid summer the effects of sun glint can be clearly seen up to about 50 degrees latitude with Landsat, somewhat further with Sentinel-2. Correspondingly, during winter sun glint free images can be recorded at lower latitudes as well.

Since sun glint is caused by a pretty basic geometric constellation you could assume that you could compensate for it but practically this is hardly possible because of several things:

  • Its strength and characteristics are highly dependent on how smooth the water surface is, i.e. waves. On the bright side this has the nice side effect that you can observe the sea state quite well on images with strong sun glint – as demonstrated by the Landsat example above.
  • Water areas are generally quite dark when viewed from above especially deep and clear water so the sun glint outshines the real reflection signal and even if you could properly estimate and compensate for the amount of sun glint you would still have the actual signal buried in the noise from the specular reflection.

Generally sun glint is usually considered an undesirable effect although not really a flaw or quality deficit like clouds. Practically it is one of the main reasons why you rarely see satellite image products that include water coverage of larger water areas at lower latitudes since it is difficult to uniformly render water in areas where sun glint is present.

You might wonder how the Green Marble renders the ocean without visible sun glint. This is because the wide field of view of the MODIS instrument contains sufficient data far enough from the directions of sun glint and this data is used to determine the color in such areas. But this is not without problems either – as you probably know from looking at water surfaces from the shore reflectivity increases when you look at the surface at a flatter angle – so you might get less sun glint this way but on the other hand get a larger amount of reflected skylight.

What could help in dealing with sun glint in imagery is if satellites were able to record polarization information in the received light – specular reflection is selective regarding polarization after all. But common earth observation satellites do not do this at the moment.

Some further info and literature on sun glint can be found here.


September 15, 2016
by chris

OpenStreetMap at its worst

To get this out upfront: this post is not about data quality – what i am going to talk about here are the mechanisms by which OpenStreetMap functions and how these principles can break down.

In a nutshell – how OpenStreetMap works is that everyone can contribute to the database and thereby create something valuable others can use, in maps and other applications using OSM data. The real key however is that when you do so others contributing in the same area will then build upon your work, supplementing it with additional detail, updating information, correcting inaccuracies etc. This is what makes OSM attractive to contributors – you know your contribution matters, often even many years after you make it because you can be sure subsequent contributions will be supported by the basis you created. And it is attractive for data users because through this mechanism the result is usually significantly more valuable than the sum of all the individual contributions.

Those who know me might already imagine that when i talk about these principles breaking down i am talking about Canada, in particular about the Canadian North, the area you might best recognize from the distorted appearance in the Mercator projection:

The Canadian Arctic in OpenStreetMap

This region is one of the most sparsely populated areas of the northern hemisphere – Nunavut and the Northwest Territories have a combined population of less than 100000 the vast majority of which live south of the area discussed here which has likely less than 10000 inhabitants.

This makes it a fairly difficult area for mapping in OpenStreetMap. This is what this area looks like in terms of OSM node density – in a different, less distorting map projection:

I separated the data into three categories:

  • legacy imports of coastline and larger waterbodies, mostly PGS, made about 8 years ago and not touched since then are shown in blue – overall 1.3 million nodes.
  • unmodified Canvec data imports are shown in red – about 5.2 million nodes.
  • everything else, meaning hand mapping as well as any imported nodes that have been modified afterwards are shown in green – about 500 thousand nodes.
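The categorization itself can be sketched as a simple per-node rule. The account name and source tag used below are hypothetical stand-ins – in practice identifying imports means matching the actual import user accounts and changeset/source tags used in Canada:

```python
# Hypothetical marker values - real import identification needs the
# actual list of import account names and changeset/source tags.
LEGACY_IMPORT_USERS = {"pgs_coastline_import"}

def classify_node(version, user, source):
    """Assign a node to one of the three categories described above."""
    if version >= 2:
        return "manual"      # edited after creation counts as manual
    if source and "canvec" in source.lower():
        return "canvec"      # unmodified Canvec import
    if user in LEGACY_IMPORT_USERS:
        return "legacy"      # untouched coastline/waterbody import
    return "manual"
```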

Now if you ignore the red you could get the impression this looks reasonably healthy considering the remoteness of the area. If you look at the age of the data:

you can see most manual mapping activity is fairly recent and limited to smaller areas. The Canvec import stuff is shown in gray since node age and data age are not the same for imported data of course – Canvec imports were made mostly during the last five years. Now you cannot compare this to a densely populated area in Europe of course – there is very little local mapping on the ground and nearly all of the data, both imported and manually mapped, is produced remotely. But let’s compare it to Greenland – an area with quite comparable population density, accessibility and geography:

When magnified both maps have the same scale by the way. Compared to northern Canada Greenland has much earlier and more extensive manual mapping activities. There are likewise legacy imports of coastline data, in particular at the west coast. Overall the data volume is comparable if you disregard the Canvec imports, Greenland is about 2 million nodes in total, legacy imports and manual mapping together is about 1.8 million nodes in the shown part of Canada.

So what causes the difference? The fairly obvious explanation is the Canvec imports. Except for the legacy coastline imports from many years ago there have been no imports of data in Greenland. If you look at the maps above and the data you can see that while the manual editing activity supplements and replaces the legacy imports quite freely, there is hardly any interaction between manual mapping and Canvec imports. About 200k of the 500k manual nodes have been edited after initial creation (are version >=2); most of these are manually refined legacy import coastlines. Significantly less than one percent of the Canvec import nodes have been edited afterwards, and most of the manual editing activity you can see in Canvec import areas is simple mechanical cleanup. If you have ever tried doing manual mapping in an area where Canvec data has been imported you know why this is rare – i did so once in the far north and this is not something you really want to do. Canvec imports are essentially a foreign body in OSM as far as normal editing activities are concerned, which then try to operate around them.

Remember, above i wrote that OpenStreetMap works by contributions supporting and forming a basis for further subsequent contributions. Canvec data imports do not work this way, especially not in the high Arctic. Nor do they work the other way round, i.e. by integrating and making use of previous manual contributions. If anything, such imports bury previously mapped stuff under tons of data of questionable quality. And the prospect of this happening is not exactly an incentive for mappers to contribute, especially not if they can also do so a few hundred kilometers further east where no such problems exist.

Now i wrote initially this is not about data quality but still i want to deal with one of the key arguments of Canvec import proponents: that the Canvec data is of good quality and much better than what can in most cases be manually mapped from available imagery sources. This is wrong. Canvec data in this area is in most parts somewhat more detailed than available image sources but it is worse in about every other aspect:

  • it is less up to date which is of particular importance in the Arctic due to glacier retreat and climate change. Most of the source data Canvec is based on in this area is at least ten years old, significant parts are much older (like 1980s).
  • it is often factually incorrect, partly due to incorrect original mapping, partly due to incorrect attribute conversion.

To everyone who does not believe that i would highly recommend looking at the recent images from the OSM images for mapping in the area and comparing them to the Canvec data.

Due to these problems the imported data does not even give valuable hints to mappers unfamiliar with the area how to map things – on the contrary it suggests incorrect tagging in many cases.

Another argument frequently brought up is that having additional data in the OSM database is an advantage on its own. In reality this is hardly the case – if data users find the Canvec data useful it is generally much easier to take it directly from the source, where it is available in uniform quality and with all original attributes for the whole country. And if you consume data on this scale the slight possible advantage of having it in the OSM format you are already used to is usually not significant.

Long story short – the only way the Canadian OSM community could in the long term make sure the Canadian North is a valuable part of the OSM database and an area where mappers feel it is rewarding to contribute would be to put a stop to Canvec imports in the area and make an effort to remove the previously imported data. Otherwise the Canadian Arctic will likely continue to fall behind the rest of the world in terms of community building as well as data usefulness – not despite the imports but because of the imports.

Some will probably read into this that i am generally against data imports in OSM, but i am not – the key question for such imports however has to be whether they support further mapping in OSM in the area of the import or not, and in this case the answer is quite certainly no.

Now one thing i asked myself in the matter is whether this is actually a deeper cultural difference between Europe and America, the old world and the new world. Being from Europe i am probably not unbiased on this – despite extensive experience in mapping in the Arctic. It is possible that what i wrote about mapper motivation and incentives applies to the typical European mapper but not the North American one. Since much of the manual mapping in Northern Canada is done by people from abroad, even what can be observed from a neutral standpoint could be distorted in that direction. OpenStreetMap is built upon the principle of the primacy of local mappers – they decide on their own how things are mapped in their area and whether data is imported. But is someone sitting in Toronto, Montreal or Vancouver really more of a local mapper on Devon Island or Ellesmere Island than someone from Britain, Germany or Russia?