Imagico.de

blog


March 20, 2017
by chris
2 Comments

FOSSGIS 2017 and talk announcement

On Wednesday this week the annual FOSSGIS conference is going to start in Passau and I look forward to being there, meeting people and talking about free software and geodata. On Friday morning I am going to present a talk on free satellite images, giving an introduction to what is available today in terms of open data satellite imagery for everyone to use.

As a teaser, here are a few images I am going to show as examples. These can be used under the CC-BY-SA license, and you can also print them in large format if you want to. Be careful – these are fairly large image files, more than 6000 pixels in size.

Verkhoyansk Range, eastern Russia 07-09-2016 by Landsat 8 – full size image

Patagonia, Chile 15-03-2017 by Landsat 8 – full size image

Mt. Katmai/Novarupta, Alaska 13-09-2016 by Sentinel-2 – full size image

As a bit of a dig, here are some links to what the usual suspects offer in these areas – Patagonia, Siberia, Alaska

Those who want to come by spontaneously can do so – Entry is free for those active in Open Source software and free geodata.

Update: Video of the talk is now available on YouTube. A PDF of the slides will also be linked from the program later.


March 17, 2017
by chris
0 comments

Lost and Found

After cleaning up remaining broken multipolygons in OpenStreetMap in the Antarctic as part of an ongoing effort to fix broken geometries – by the way I am happy to report Antarctica is now the first continent in OpenStreetMap without serious multipolygon errors or old style tagging of multipolygons – i also did a traditional cleanup round around the poles to remove bogus data.

The phenomenon of accumulating garbage in the OSM database is best known from Null Island. But sometimes the data also turns up around the poles and accumulates there because people rarely ever look there and clean up. Areas beyond the Mercator map limit at slightly above 85 degrees latitude do not turn up in most QA tools and editors, so they are kind of invisible to normal OSM activities. I have not done such a cleanup for some time and apparently others have not either, so here are a few highlights of things that got lost there.
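The limit of slightly more than 85 degrees is not arbitrary. Assuming the common convention of a square world map in spherical Mercator (an assumption here, the post does not spell out which limit the individual tools use), the cutoff is the latitude where the projected y coordinate reaches π. A minimal sketch to check this:

```python
import math

def mercator_y(lat_deg):
    """Spherical Mercator y coordinate (unit sphere) for a latitude in degrees."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))

def square_map_limit_deg():
    """Latitude at which y equals pi, i.e. where a square world map ends."""
    return math.degrees(math.atan(math.sinh(math.pi)))

limit = square_map_limit_deg()
print(f"square Mercator map ends at +/- {limit:.4f} degrees latitude")
```

Anything mapped poleward of that latitude simply has no place on such a map, which is why it stays invisible in tools built around it.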

Interestingly if you look at the changesets which created those – most of them were created with iD – although to my knowledge iD works exclusively in Mercator projection. So you apparently can generate data near the poles with iD but you have no way to edit it after you have done so. Here a few changesets as examples:

33412666, 39331559, 42690278, 41522831, 39111625, 16380595, 37911018

Now the area around the south pole is pretty empty again. It is difficult to map here because, although JOSM can by now be set to use a polar projection (EPSG:3031 for the Antarctic if you want to try), none of the usual image sources covers this area. If you want to do some mapping around the south pole, here is an Icebridge image from a few months back. You can download and use it in JOSM using the ImportImage plugin after setting the projection to EPSG:3031. What does not work in this case is getting the existing OSM data via the API – you need to use Overpass/XAPI for that.
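For the Overpass route, here is a rough sketch of how such a request could be put together. The 89.9 degree cutoff and the function names are made up for illustration, and the endpoint is just one public Overpass instance; this only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# One public Overpass instance; other instances accept the same query.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def polar_query(north_lat=-89.9):
    """Overpass QL for all nodes, ways and relations south of north_lat.

    The cutoff latitude is a hypothetical value chosen for illustration.
    """
    bbox = f"-90,-180,{north_lat},180"  # order: south, west, north, east
    return (
        "[out:xml][timeout:120];"
        f"(node({bbox});way({bbox});relation({bbox}););"
        "out meta;>;out meta qt;"
    )

def polar_request_url(north_lat=-89.9):
    """Full GET URL that returns the data as OSM XML."""
    return OVERPASS_URL + "?" + urlencode({"data": polar_query(north_lat)})

print(polar_request_url())
```

The XML returned for that URL can be saved as an .osm file and opened in JOSM alongside the image.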

To identify the different things you can see on the image – here is a plan of the area and an annotated oblique image.

Note the Icebridge image is from very early in the summer season so there is relatively little visible except the permanent structures – traces of past activities have been largely covered by snow and wind during winter.

Another thing to keep in mind: everything there is located on ice that moves by several meters every year so it is important when you map things to specify the date of the location information – in case of this image October 2016.


March 4, 2017
by chris
0 comments

EO-1 unending

When I wrote my eulogy on EO-1 I closed by mentioning that final decommissioning was scheduled for late February, but after having defied the odds for more than 15 years it is only fitting for this satellite that this was not the final word on the matter. The new date is now announced to be around March 20 – let's see how this is going to work out. In the meantime enjoy more spectacular early morning views.


March 3, 2017
by chris
0 comments

The truth about true color images

I have been meaning to write a piece about this for some time already and a post on the OpenStreetMap user diaries reminded me about that. Recent changes in the distribution form of Copernicus Sentinel-2 images – which I wrote about in relation to other aspects previously – also introduced something I did not write about yet – the full resolution True-Colour Image. I did not discuss this because it was not of much concern for me. I disliked the additional download volume but otherwise this would not cause any problems. Later I however realized that for many beginners in using satellite imagery this will probably have a much higher impact because they will often be inclined to actually use it and might even view the data exclusively through this image.

The True-Colour Image is essentially a large, full resolution version of the preview images you can find in the ESA download application and can also query through the API. I made critical remarks about the rendering of these previews before and those essentially apply to the full resolution version the same way. The developer who planned and implemented generation of these images quite clearly did not know much about either satellite imagery or color representation in computers in general on the current technical level.

What today’s satellites – including Sentinel-2 – produce as raw imagery is pretty high quality data, not only in terms of spatial resolution but especially also in terms of dynamic range and low noise levels. Even if you just look at the true color channels, i.e. red, green and blue, this data cannot be fully reproduced on a computer screen. You need to compress the dynamic range available into the range supported by computer displays. Doing this is not a simple task. It requires knowledge of color representation, image processing and color physiology, and ideally it takes into account what you want to use the image for. Still, the way this is done with the Sentinel-2 True-Colour Images is about the worst possible way this can be done. Not only does this immensely reduce the usefulness of these images for the user, it also significantly sells short the quality of the underlying data.

Here an example from Patagonia near the southern tip of South America:

For comparison a custom rendering produced by me from the raw data:

Clearly visible are the clipped highlights in the first image which makes it hard to distinguish clouds from snow and the fairly structure-less shadows where you can hardly see anything. Both of these are not problems of the data but of the processing applied – as evident in the custom rendering.

Now you might say that the difference is obvious when you compare a static processing with one specifically adjusted for the setting, but this is not the point here – you can do much better even with a globally uniform rendering applied to all images identically. And using the poor rendering of the color composite images in the Sentinel-2 packages you lose a lot of valuable information that is actually in the data. Or to look at it from a different perspective – apart from the higher spatial resolution you could produce this kind of rendering also from a 1980s Landsat 5 image.
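To illustrate the difference between clipping and compressing, here is a toy sketch. It is neither the processing used for the True-Colour Images nor the one used for the custom rendering above; the white point of 0.35, the curve shape and the sample reflectance values are all assumptions picked for illustration.

```python
import math

def linear_clip(reflectance, white=0.35):
    """Scale linearly to 8 bit; everything brighter than `white` becomes 255."""
    scaled = max(0.0, min(1.0, reflectance / white))
    return round(255 * scaled)

def soft_curve(reflectance, knee=0.35):
    """Smooth compressive curve that keeps gradation in the highlights."""
    r = max(0.0, reflectance)
    return round(255 * (1.0 - math.exp(-r / knee)))

# Hypothetical reflectance values for a shadow, a cloud and fresh snow.
for name, value in [("shadow", 0.02), ("cloud", 0.7), ("snow", 0.9)]:
    print(name, linear_clip(value), soft_curve(value))
```

With the hard clip, cloud and snow end up at the same maximum value and cannot be told apart any more, while a fixed, globally uniform curve still separates them.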

The color fringes around the clouds are not a processing artefact by the way but due to the way the satellite records images.


February 9, 2017
by chris
0 comments

North Atlantic island images

Over the winter I processed various new satellite image mosaics making use of new data from the 2016 summer and I am pleased to introduce some of these here. As usual you can find more detailed information on services.imagico.de and you can contact me in case you are interested in using one or more of these images for your own applications.

Svalbard

I already introduced a Svalbard image in 2015 which was and still is the highest quality image of this kind available regarding uniformity, color consistency and lack of clouds. But there always is room for improvement of course. Here is a new image of the same area based on Sentinel-2 data from 2016. The most obvious improvement is the higher resolution of course but this is not the only difference.

Sentinel-2 mosaic of Svalbard

The new image is nearly all from Sentinel-2 data from just one summer season, which is somewhat astonishing considering the previous image used three years of Landsat 8 data and still required some Landsat 7 images in addition. The reason for this lies in the somewhat peculiar operation concept of Sentinel-2. While Landsat operations try to get a nearly uniform 16 day coverage of all land areas and therefore often skip recording opportunities at high latitudes where they are not required for the 16 day interval, Sentinel-2 did not use such a rule last year in Europe, leading to a very high recording frequency over Svalbard during summer. See also the coverage maps I showed some time ago. This produced a lot of fairly worthless images since there are lots of clouds in Svalbard, especially in the summer. But it also produced a higher number of images and more complete coverage during the few good weather windows in late summer last year.

This is something I have mixed feelings about. Of course it is nice for the Svalbard area, but if you consider how this recording capacity could be used otherwise in areas that are currently only recorded with low priority, in particular Asia and South America, this is ultimately a fairly questionable strategy on a global level. But this is of course a political decision at ESA and there is very little chance that those making it are receptive to global and long term considerations – people do not get into such a position by putting these things first.

Also included in the Svalbard mosaic as a separate image is Bear Island which i did not have in the 2015 image because there was not enough good quality data here at that time.

Sentinel-2 mosaic of Svalbard – Bear Island

Iceland

The other large mosaic is of Iceland. Iceland is among the hardest areas in the northern hemisphere outside the tropics in terms of clouds in satellite images. When I produced 3d views of Iceland previously I heavily relied on Landsat 5 images, which I did not use in this mosaic due to their age and low resolution. There is still quite a bit of room for improvement, especially regarding a tighter late summer time frame with a minimum in seasonal snow, but it is the first time it was possible to produce an image of this island within my quality standards.

Landsat Mosaic of Iceland

Jan Mayen and the Faroe Islands

And then there are two more small mosaics of the other islands in the North Atlantic, Jan Mayen and the Faroe Islands. Jan Mayen is based mostly on Sentinel-2 data while the Faroe Islands are mostly produced from Landsat images.

Sentinel-2 mosaic of Jan Mayen

Landsat mosaic of the Faroe Islands

You can click on the images above to go to the detailed description on services.imagico.de.


January 20, 2017
by chris
0 comments

Antarctic summer midnight sunrise

As a followup to my recent EO-1 post, here is another set of unusual images from this satellite from a few months ago.

These kind of show the start of the polar day in the 2016/2017 summer in the Antarctic. At these latitudes (about 77.5 degrees south) most of the year is either dark all day (polar night) or with permanent light (polar day). These images are from the short timespan in between when you actually have a sunrise and sunset. To be precise these actually all depict sunset with respect to the daily move of the sun but as a sequence they illustrate the end of the transit from polar night to polar day.
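When that transition happens at a given latitude can be estimated roughly from the solar declination. The following is a back-of-the-envelope sketch using a simple cosine approximation for the declination and ignoring atmospheric refraction and the size of the solar disc, so the real switch to permanent daylight will be offset from this estimate by a few days.

```python
import math

def solar_declination_deg(day_of_year):
    """Rough solar declination in degrees (simple cosine approximation)."""
    return -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365.0)

def lowest_sun_elevation_deg(lat_deg, day_of_year):
    """Elevation of the sun's centre at its lowest point of the day."""
    return abs(lat_deg + solar_declination_deg(day_of_year)) - 90.0

def first_day_without_sunset(lat_deg, start_day=200):
    """First day of year (from start_day on) where the sun stays above the horizon."""
    for day in range(start_day, 366):
        if lowest_sun_elevation_deg(lat_deg, day) > 0.0:
            return day
    return None

day = first_day_without_sunset(-77.5)
print("first day of year without geometric sunset at 77.5 S:", day)
```

The result falls within the range of the image dates shown below, which fits these being among the last sunsets before the polar day.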

All these images show Mount Erebus on Ross Island with the Hut Point Peninsula where research stations from New Zealand and the United States are located.

2016-10-20

2016-10-24

2016-10-28

2016-11-01


January 5, 2017
by chris
0 comments

Additions to images for mapping

Just added a number of new images to the OSM images for mapping – here are a few examples:

First is a Sentinel-2 image of the Central Alps in late September last year. This area is fully covered in high resolution images from other sources but many of them are at least partly not well suited for mapping due to snow or clouds. This image should be useful to update glacier extents in the area. There are also several other images of particular use for glacier mapping like the African glaciers which i featured here recently.

And there is an image of the Kerch Strait between the Sea of Azov and the Black Sea with the new bridge under construction there:

The newest image is of the Pacific side of the Panama Canal – an area which was cloud covered in the older Panama Canal image. This image was taken by the EO-1 satellite just a few days back.

The image was also taken at fairly low tidal water levels so the tidal flats at the coast are well visible.


December 18, 2016
by chris
0 comments

Early in the morning – the last days of EO-1

As most people know satellites generally have a limited life time. Space is a harsh environment, even for machines specifically designed to operate there. Satellites sometimes also fail because of construction and operation mistakes. But the most universal reason why satellites have a limited life span is because they run out of fuel.

If a satellite runs out of fuel its orbit altitude decays and it burns up in the atmosphere. Satellites in low earth orbit still fly in the upper parts of the earth atmosphere, which are extremely thin but still produce some drag, causing any satellite to gradually slow down and as a result lower its orbit. The International Space Station for example needs to raise its orbit several times per year because of that. Failure to do so would result in the ISS burning up in the atmosphere within 1-2 years.

For earth observation satellites however this is not what happens when they run out of fuel – at least not initially. These satellites usually fly at a significantly higher altitude than the ISS and even without propulsion they usually remain flying for at least 30-50 years, sometimes significantly longer. How long this takes depends on the orbit altitude, the cross section of the satellite that produces drag relative to its mass and solar activity (which influences upper atmosphere density). The Envisat satellite I mentioned recently for example is expected to remain flying and not burn up in the atmosphere for about 150 years.

What happens with an earth observation satellite when fuel runs out is that it cannot maintain its sun synchronicity any more. And this happens much faster than orbital decay. The sun synchronous orbit of an earth observation satellite means its orbital plane rotates at the same rate as the earth moves around the sun, i.e. once per year, so it flies with constant orientation of the orbit towards the sun. This happens because of the slightly non-spherical shape of the earth and by careful selection of the orbital parameters to make use of that. But this situation is unstable: there is no natural mechanism that maintains sun synchronicity, so the satellite has to make adjustments using its engine to maintain it.
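How the orbital parameters have to be selected can be sketched with the standard first order formula for the drift of the orbital plane caused by the earth's oblateness (the J2 term). For a sun synchronous orbit the plane has to turn by one full revolution per year. The sketch assumes a circular orbit, and the 705 km altitude used as example is the nominal Landsat altitude, which is not given in the text above.

```python
import math

MU_KM3_S2 = 398600.4418       # earth gravitational parameter
EARTH_RADIUS_KM = 6378.137    # equatorial radius
J2 = 1.08263e-3               # oblateness coefficient
YEAR_S = 365.2422 * 86400.0   # length of the year in seconds

def sun_sync_inclination_deg(altitude_km):
    """Inclination at which a circular orbit precesses once per year."""
    a = EARTH_RADIUS_KM + altitude_km
    mean_motion = math.sqrt(MU_KM3_S2 / a**3)        # rad/s
    required_rate = 2 * math.pi / YEAR_S             # rad/s, eastward
    # First order J2 drift: dOmega/dt = -1.5 * J2 * (Re/a)^2 * n * cos(i)
    cos_i = -required_rate / (1.5 * J2 * (EARTH_RADIUS_KM / a)**2 * mean_motion)
    return math.degrees(math.acos(cos_i))

print(f"{sun_sync_inclination_deg(705.0):.2f} degrees")
```

The required inclination depends on the altitude, which is why sun synchronicity is lost once the satellite can no longer correct its orbit.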

Landsat 7 is expected to run out of fuel next year. Here is a diagram from a USGS presentation illustrating what happens then.

What is shown on the y-axis is the local equator crossing time. As you can see this will move to earlier times quite rapidly and with increasing rate. During the time shown the orbit altitude will likely not change by more than a few kilometers.

There was another satellite in the same orbit as Landsat 7 that ran out of fuel in 2011: Earth Observing-1 or EO-1. I have shown images from EO-1 here on occasion in the past; its recordings are all available as open data just like Landsat imagery. EO-1 was a technology test platform evaluating new technologies for future earth observation satellites, some of which have been realized on a larger scale in Landsat 8. EO-1 was launched in 2000, about a year after Landsat 7, and was originally planned to operate for one year. It is still running today, which probably makes it the satellite that has most excessively exceeded its design life in history – an undead among satellites you could say. It was also – with a 10m resolution panchromatic band – the highest resolution open data satellite until the launch of Sentinel-2.

Since EO-1 ran out of fuel more than five years ago it now has an equator crossing time early in the morning creating a fairly unique kind of images not available otherwise. Here an example of Mount Everest and the Rongbuk Glacier:

EO-1

Landsat 8

The EO-1 image on the left is from a day earlier but also more than two hours earlier (about 02:19 UTC compared to 04:42 UTC for Landsat). This view window gives fairly nice lighting conditions – as photographers know mid day light can often be relatively flat and boring while morning and evening situations are more likely to give interesting photo opportunities. Also relief is more articulated under such conditions. Here a few more examples, all of them from the second half of 2016.

Sierra Nevada

Appalachian Mountains

Tordrillo Mountains

Grand Canyon

Teton Range

Canyonlands

An early morning time window also means that at higher latitudes you get a better second late evening window during summer. You have that with Landsat too but much more limited and only available at very high latitudes. Here are two examples from EO-1 from this year (from Kamchatka and Iceland).

Kamchatka

Iceland

The EO-1 ALI instrument from which all of the images here are derived pioneered many of the features we now have in Landsat 8 – like the shortwave blue band and the panchromatic band not extending into NIR. Its noise characteristics are not as good as with Landsat 8 – not surprising since it is 10 years older. In particular there is also some quite visible banding in the noise as can be seen in some of the images here. But it is still much better than Landsat 7. And the spectral characteristics (which are included in my satellite comparison chart) are in fact significantly better for true color visualizations than both Landsat 8 and Sentinel-2 due to broader red and green bands. You could actually say in this regard it represents the pinnacle in open data earth observation systems so far. I hope Landsat 10 will tie in with EO-1 in terms of visible band definitions but so far there does not seem to be a particular priority in that direction. It is hard to explain but working with EO-1 ALI colors is generally a real joy while tuning Landsat 8 or Sentinel-2 colors to get consistent, realistic and aesthetically pleasing results is often much more difficult.

Hawaii

Kilimanjaro

Northern Patagonian Ice Field

Coropuna, Peru

EO-1 is now scheduled to be deactivated in February – after nearly 17 years of operations. Despite being relatively lightweight (only about 500kg) it will remain flying at slowly decreasing altitude for many decades – see the following diagram from the report on decommissioning plans.


December 13, 2016
by chris
0 comments

Sentinel-3 OLCI vs. MODIS/VIIRS – the overall tally

Based on my initial look at the Sentinel-3 data (see part 1, part 2 and part 3 for the details) here are some comparative assessments relative to the existing MODIS/VIIRS systems. As indicated in the title this is for OLCI, i.e. visible light and NIR only, since the evaluation of SLSTR is incomplete due to errors in the geolocation data.

This is preliminary of course – based on the currently available data and my limited experience with that data. It covers a number of different fields i consider relevant for choosing a satellite data source. Which of the criteria listed are more important and which less of course depends on the use case.

| # | Topic | Sentinel-3 OLCI | MODIS | VIIRS |
|---|---|---|---|---|
| 1 | Data point quality | + | o | o |
| 2 | Data depth | + | o | o |
| 3 | Resolution | o | o | - |
| 4 | Image archive | - | ++ | o |
| 5 | Revisit frequency | -/o | + | ++ |
| 6 | Temporal Coverage | early Morning | Morning + Afternoon | Afternoon |
| 7 | Spatial Coverage | - | + | + |
| 8 | Data packaging/usability | + | + | o |
| 9 | Ease of data access | o | + | + |
| 10 | Higher level products | | + | + |
| 11 | License | + | ++ | ++ |

More detailed notes about the individual categories:

  1. Data point quality refers to the quality of the individual data points (i.e. pixels) in the images, i.e. noise levels and radiometric accuracy. In terms of noise OLCI likely has an advantage over the >15 year old MODIS although as explained this will not really make much practical difference. Also both MODIS and VIIRS show a banding effect in images due to the way their scan system works which does not occur with OLCI.
  2. Data depth means the scope of information available for every point. Sentinel-3 OLCI leads here due to the additional bands. The narrow bands, with their somewhat suboptimal positions for visualization purposes, reduce this advantage, but OLCI still leads compared to MODIS and VIIRS, which are not ideal either.
  3. Resolution – as discussed this is more or less a tie between OLCI and MODIS. VIIRS is clearly behind.
  4. Image archive refers to the size of the archive of past recordings that is available for use. Here MODIS has a huge lead of course.
  5. Revisit frequency means how frequently images are recorded at a given location. Apparently even with two satellites OLCI will not be better than MODIS with just a single satellite here. VIIRS leads with full daily coverage with even a single satellite.
  6. Temporal Coverage – is not really a rating category although you could say MODIS is the most versatile here with both morning and afternoon satellites. Since for Sentinel-3 the second satellite will cover the same time slot this will not change with Sentinel-3B.
  7. Spatial Coverage – this refers to what part of the Earth surface can be and is practically covered. Due to lack of south pole coverage Sentinel-3 is behind here. You also need to consider the relatively tight sun elevation cutoff of 10 degrees – limiting spatial or temporal coverage depending on how you look at it.
  8. Data packaging/usability – Both Sentinel-3 and MODIS have pros and cons here, no clear winner. VIIRS is somewhat behind due to relatively poor documentation and complex packaging and processing levels.
  9. Ease of data access – The Sentinel data access is – due to the mandatory use of the API and the need to register (except for the current so called pre-operation phase) – less convenient than the others.
  10. Higher level products – Currently Sentinel-3 data is only available as Level 1 data while the other sensors offer a broad selection of higher level products.
  11. License – all are open data but the vague attribution requirements for Sentinel-3 data are a possible issue for some applications.

And here for wrapping up three more views from Sentinel-3 OLCI – you can use these under CC-BY-SA conditions if you want:


December 12, 2016
by chris
0 comments

Reduced waterbody data on openstreetmapdata.com

I am happy to announce that reduced waterbody data for low zoom level rendering of OpenStreetMap based maps is now available on a regular basis on openstreetmapdata.com. This is produced using the methods I recently introduced here. You can find it on the waterbodies datasets page. Make sure you read the dataset description and the process description, which contain important advice on how to use this.

These files can be used by anyone under the conditions of the OpenStreetMap license. Users should however consider supporting our service in financial or other form. Additions to the offered data like this as well as future availability of the free services in general depend on this support.

One additional thing to be aware of when using these files is that OpenStreetMap data always contains a significant amount of broken geometries, which is sometimes visible in these files in the form of missing features. Jochen maintains a page keeping track of broken polygons in OSM in general but we also generate a list with the errors affecting the waterbody rendering specifically. Many of these are just small water areas no more significant than any other broken geometry in OSM, but there are also quite a few really big polygons causing trouble. Anyone who is interested in improving data quality in OSM might consider working on this list. It is updated daily whenever we also update the waterbody data.


December 11, 2016
by chris
0 comments

Sentinel-3 – a first look at the data, part 3

This post is the third and likely last part of a report on the recently released Sentinel-3 level 1 data from the OLCI and SLSTR instruments. The first part provided general context, the second part discussed the data distribution form and how this data can be processed, and this third part will look more in depth into the quality of the data itself by comparing it to other open data satellite imagery available. Due to the problems with the SLSTR geolocation information explained in the second part this will only cover the OLCI instrument.

On data quality – OLCI

It is generally difficult to judge radiometric accuracy without reliable reference data. Noise characteristics of the visible bands appear to be pretty good – since recordings are limited to relatively high sun angles there is not really a good possibility to properly assess the dynamic range of course.

We remember the spectral bands chart from the first part – in the OLCI instrument there are essentially 10 spectral bands in the visible range, all of them very narrow. The usual convention will likely be to interpret band 3, 6 and 8 as blue, green and red. Bands 1 and 2 are near the blue edge of the visible range, band 7 is also well in the red range and can be used as a red band as well or in combination with band 8. Band 9 and 10 are near the long wavelength end of the visible range. Finally band 4 and 5 are in the blue-green area. The most problematic aspect for visualization purposes is the green band and its location right where red and green sensitivities of the human eye are about the same. Overall such narrow spectral sensitivities are not ideal for accurate color representation but it could be worse.

Resolution wise OLCI offers a 300m resolution at nadir in all bands. For comparison: MODIS and VIIRS provide different spatial resolutions in the red band and the green and blue bands. For MODIS that is 250m for red and 500m for green/blue. For VIIRS we get 375m for red and 750m for green/blue. Both also provide the higher resolution for an NIR band (and with VIIRS also for SWIR and thermal bands) so many analytic applications can also make use of higher resolution data.

The nominal resolutions tell only half of the story though. The wider the field of view of the satellite the stronger the dropoff in resolution is towards the edges. This means VIIRS has an additional disadvantage relative to MODIS if you consider the whole image average. The asymmetric view of Sentinel-3 OLCI means despite the more narrow recording swath the maximum viewing angle and therefore the resolution drop-off are comparable to that of MODIS.
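The size of this dropoff can be estimated from simple geometry. The sketch below computes how much the ground footprint of a pixel grows in the across-track direction for a given viewing angle on a spherical earth. It assumes a constant angular pixel size and ignores any on-board pixel aggregation or resampling, so it is a generic illustration and not a model of any of the specific instruments; the 705 km altitude and 55 degree angle in the example are assumed values for a wide swath scanner.

```python
import math

EARTH_RADIUS_KM = 6371.0

def pixel_growth_across_track(view_angle_deg, altitude_km):
    """Across-track pixel footprint relative to nadir for a given view angle."""
    theta = math.radians(view_angle_deg)
    k = (EARTH_RADIUS_KM + altitude_km) / EARTH_RADIUS_KM
    s = k * math.sin(theta)
    if s >= 1.0:
        raise ValueError("line of sight does not hit the earth surface")
    # Ground distance is R * (asin(k * sin(theta)) - theta); this is its derivative.
    ground_km_per_rad = EARTH_RADIUS_KM * (k * math.cos(theta) / math.sqrt(1.0 - s * s) - 1.0)
    return ground_km_per_rad / altitude_km

for angle in (0, 20, 40, 55):
    print(angle, round(pixel_growth_across_track(angle, 705.0), 2))
```

The growth is slow near nadir and becomes steep towards large viewing angles, which is why the maximum viewing angle matters more than the swath width as such.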

Comparing with other satellites

So far this is all theory so here are some comparisons of images from the major open data satellite systems currently operating that provide visible light color images. It is not easy to find image sets where all of these record the same area on the same day, preferably all in the middle of the recording swath. I picked two areas and compromised by using MODIS images from one day earlier than the rest of the images which are all from the same day. The two areas are Lake Tchad and Northern Patagonia:

Images are based on calibrated radiances for all satellites. Normally i would have made this comparison based on reflectance values but apparently the necessary data to calculate reflectance values is currently not available for OLCI. This means differences in recording time and therefore illumination is not compensated for so the images differ also because of that. For reference here the recording times (in UTC) for all images shown.

| Sensor | Time Tchad | Time Patagonia |
|---|---|---|
| VIIRS | 12:30 | 19:00 |
| MODIS (Terra) | 09:45 | 14:55 |
| Sentinel-3 | 08:56 | 14:14 |
| Sentinel-2 | 09:38 | 14:54 |
| Landsat 8 | 09:25 | 14:36 |
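For context on why the recording time matters when working with radiances: the usual conversion from radiance to top of atmosphere reflectance divides by the incoming sunlight, which depends on the sun elevation at recording time. A minimal sketch of that standard formula follows; the parameter names and the example numbers are illustrative and not taken from any of the data sets compared here.

```python
import math

def toa_reflectance(radiance, solar_irradiance, sun_zenith_deg, earth_sun_dist_au=1.0):
    """Top of atmosphere reflectance from at-sensor spectral radiance.

    radiance          at-sensor spectral radiance, W / (m^2 sr um)
    solar_irradiance  band-averaged solar irradiance, W / (m^2 um)
    sun_zenith_deg    solar zenith angle at the pixel, degrees
    earth_sun_dist_au earth-sun distance in astronomical units
    """
    cos_sz = math.cos(math.radians(sun_zenith_deg))
    if cos_sz <= 0.0:
        raise ValueError("sun is at or below the horizon")
    return math.pi * radiance * earth_sun_dist_au**2 / (solar_irradiance * cos_sz)

# The same surface radiance gives a higher reflectance for a lower sun.
print(toa_reflectance(100.0, 1500.0, 30.0))
print(toa_reflectance(100.0, 1500.0, 60.0))
```

Without this normalization, images recorded earlier in the morning look darker simply because less light reaches the surface.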

As you can see Sentinel-3 is generally the earliest due to the westward view while VIIRS with a noon viewing window is much later than the rest. Here is what the recording footprints look like; you can clearly see the different image sizes:

And here small area crops for all of these for comparison:

VIIRS Tchad

MODIS Tchad

OLCI Tchad

Sentinel-2 Tchad

Landsat 8 Tchad

VIIRS Patagonia

MODIS Patagonia

OLCI Patagonia

Sentinel-2 Patagonia

Landsat 8 Patagonia

Keep in mind both the differences in recording time and the different date for the MODIS image affect the results. What can still be observed however is:

  • that MODIS and Landsat 8 are fairly close in color calibration.
  • Sentinel-2 seems off relative to that – which i reported earlier. No sure way to say one of them is correct and the other is wrong though.
  • Sentinel-3 OLCI seems somewhere in between those in terms of base brightness but you need to take into account the earlier timeframe here. It does not show the same tint towards blue as the Sentinel-2 data despite the fact that the shortwave blue OLCI band 3 will likely feature significantly stronger atmosphere influence. To me this kind of indicates Sentinel-2 is the outlier here while the rest of the crowd is relatively well synchronized.
  • VIIRS is hard to compare due to the much later recording time but is probably also calibrated in combination with MODIS and Landsat by the USGS.
  • positional accuracy is generally not that great for the lower resolution images compared to the high resolution (Sentinel-2, Landsat) which are both very close so form a suitable reference.
  • resolution of Sentinel-3 OLCI and MODIS is fairly close. This is difficult to compare both due to the variability across the field of view and since MODIS provides higher resolution in the red channel than in the others – there are different possible approaches to use this to produce a high resolution full color image. Which is better also depends on how homogeneous colors are in the location you are looking at. I think OLCI will usually have a slight advantage here overall for color images although for analytics involving mainly red and NIR like NDVI calculation MODIS probably offers measurably higher resolution. In any case the difference is fairly small compared to the difference between MODIS and VIIRS – which is not extremely big either near nadir but overall more significant due to the resolution falloff.
  • in the Patagonia sample area you can also see the advantage of the tilted view of Sentinel-3. MODIS, Sentinel-2 and Landsat all show sun glint on the lakes while OLCI does not. VIIRS also lacks sun glint in this area since the area is significantly off-center in the image to the east while the sun is slightly to the west.

Note for VIIRS and OLCI i mixed spectral bands for better approximation of visual colors.
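As a reminder of why red and NIR resolution is what counts for the NDVI mentioned in the list above, this is the usual definition of the index as a short sketch. The input values in the example are made up.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and NIR reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for completely dark pixels
    return (nir - red) / total

# Hypothetical reflectances for dense vegetation and for bare ground.
print(ndvi(0.05, 0.45))
print(ndvi(0.25, 0.30))
```

Only the red and NIR bands enter the calculation, so a sensor with higher resolution in just these two bands can deliver the index at that resolution.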

Outlook

So far we only have the level 1 data. There are plans for higher level products to be made available in 2017 but these are fairly unspecific regarding the underlying methods. It is for example unclear if there will be a surface reflectance product of competitive quality. If we take MODIS as reference here – higher level MODIS products usually come with significant issues and limitations but their easy, uncomplicated and reliable availability makes them an attractive option if you are able to deal with these issues.

West African coast by Sentinel-3 OLCI

December 6, 2016
by chris
0 comments

Mobile cheese

I know this has been a long time coming but today ESA rolled out another change in their Sentinel-2 data distribution form. While the previous change was moving from multiple tile scenes to single tile packages – which i discussed previously and which resulted in significant performance problems as predicted – this new change keeps the content of packages but changes the naming – both of the whole package and the internal file structure.

If you have read my Sentinel-2 data review you might remember that one of the first things i complained about there were the excessively blown up file names full of redundant data and information irrelevant to identify the files. I mentioned this because it is a nuisance when you work with the files but ultimately it is not a big deal – you just rename things into whatever form suits you when you ingest the data into your systems and then don’t have to worry about this any more. The idea of changing this now, after more than a year of public data distribution, is odd at best. Even more peculiar is the primary reason given for the change:

The product naming (including the naming of folders and files inside the product structure) is compacted to overcome the 256 characters limitation on pathnames imposed by Windows platforms

Let me translate that: We now, after more than a year of public data distribution, change our data distribution form in a way that is not backwards compatible to cater to users of a historic computer platform that is no longer sold or even maintained by its creator and is so outdated that we did not even consider it and its limitations when we initially planned this 3-4 years ago.

Of course you could also simply say: 256 characters ought to be enough for anybody

This is what the change looks like: The old structure had package names like this:

S2A_OPER_PRD_MSIL1C_PDMC_20151230T202002_R008_V20151230T105153_20151230T105153.zip

and within that were data files like this:

S2A_OPER_PRD_MSIL1C_PDMC_20151230T202002_R008_V20151230T105153_20151230T105153.SAFE/GRANULE/S2A_OPER_MSI_L1C_TL_SGS__20151230T162342_A002722_T31TFJ_N02.01/IMG_DATA/S2A_OPER_MSI_L1C_TL_SGS__20151230T162342_A002722_T31TFJ_B01.jp2

Now you get something like:

S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE.zip

and within:

S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE/GRANULE/L1C_T36JTT_A006424_20160914T081456/IMG_DATA/T36JTT_20160914T074612_B01.jp2

This shows just the main data files. The metadata and QA stuff is changed as well; many file names are now generic, meaning they are identical for all packages – a bit like with the Sentinel-3 data, just that Sentinel-3 uses lower case file names while Sentinel-2 uses upper case file names.
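To put the path length argument into numbers, here is a small sketch that measures the two example paths shown above against the 256 character limit from the ESA notice. The helper name `headroom` is made up for this illustration, and the paths are relative to wherever the package is unpacked, so the parent directory still has to fit into what is left:

```python
OLD_PATH = (
    "S2A_OPER_PRD_MSIL1C_PDMC_20151230T202002_R008_V20151230T105153_20151230T105153.SAFE"
    "/GRANULE/S2A_OPER_MSI_L1C_TL_SGS__20151230T162342_A002722_T31TFJ_N02.01"
    "/IMG_DATA/S2A_OPER_MSI_L1C_TL_SGS__20151230T162342_A002722_T31TFJ_B01.jp2"
)
NEW_PATH = (
    "S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE"
    "/GRANULE/L1C_T36JTT_A006424_20160914T081456"
    "/IMG_DATA/T36JTT_20160914T074612_B01.jp2"
)

# Limit as stated in the ESA notice quoted above.
PATH_LIMIT = 256


def headroom(path, limit=PATH_LIMIT):
    """Characters left for the directory the package is unpacked into."""
    return limit - len(path)


for label, path in (("old", OLD_PATH), ("new", NEW_PATH)):
    print(label, len(path), "characters,", headroom(path), "left for the parent directory")
```

The old layout already uses up most of the limit before any parent directory is added, while the new one leaves considerably more room.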

There are also some quite sensible aspects in the change. For example the MGRS Tile ID is now in the package name. And the timestamps in the package name are in a different order, previously the processing time stamp was first while now the recording time stamp is. This for example means when you sort the file names you get them in recording order rather than processing order which makes more sense.
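As a rough sketch of what the new naming allows, the following splits a new style package name into its parts and checks that plain name sorting matches recording order. The pattern is derived only from the single example name above, the field names are labels chosen for this sketch, and reading the `N....` and `R...` fields as processing baseline and relative orbit is an assumption; the second time stamp is deliberately left uninterpreted:

```python
import re
from datetime import datetime

# Pattern derived from the one example name in this post, not from the spec.
NAME_RE = re.compile(
    r"^(?P<mission>S2[A-Z])_(?P<product>MSIL1C)_"
    r"(?P<recorded>\d{8}T\d{6})_N(?P<baseline>\d{4})_R(?P<orbit>\d{3})_"
    r"T(?P<tile>[0-9A-Z]{5})_(?P<second_stamp>\d{8}T\d{6})\.SAFE(?:\.zip)?$"
)


def parse_package_name(name):
    """Split a new style package name into its fields (hypothetical helper)."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a new-style package name: {name}")
    fields = match.groupdict()
    fields["recorded"] = datetime.strptime(fields["recorded"], "%Y%m%dT%H%M%S")
    return fields


names = [
    # Hypothetical later recording of the same tile, for illustration only.
    "S2A_MSIL1C_20160924T074612_N0204_R135_T36JTT_20160924T081456.SAFE.zip",
    # The example package name from above.
    "S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE.zip",
]

by_name = sorted(names)
by_recording = sorted(names, key=lambda n: parse_package_name(n)["recorded"])
```

Note that name order and recording order only agree as long as the leading mission and product fields are the same for all packages in the list.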

The data distribution system continues to be very unreliable by the way so if you want to take this opportunity to download and look at some Sentinel-2 data you likely need quite a bit of patience.

Addition: The depth of obfuscation in the file format specifications is really impressive by the way. Looking for the actual meaning of the second time stamp in the package file name leads you to three different specifications. In the one that is currently distributed the second time stamp is apparently the datastrip sensing time but there are two other format variants where this is either

  • the package creation date or
  • the newest datastrip sensing time incremented by one second.

You can quite vividly picture what has happened here. Originally the creation date was meant to be used – this is at first mentioned everywhere in the specs. And then someone noticed that when processing the data in parallel the creation date is not necessarily unique…