Changes
On July 12, 2021 at 4:02:59 PM UTC, Administrator:
-
Updated description of Cloud Optimized Raster Encoding (CORE) format from
The specification and corresponding reproducible dataset is provided in support of a submitted paper in Geomatics. __DISCLAIMER__: This is an early "public request for comments" (pre-alpha release) version. Interested parties are warmly invited to join development, to comment, discuss, find bugs, etc. __Acknowledgement:__ The CORE format was proudly inspired by the Cloud Optimized GeoTIFF (COG) format (https://www.cogeo.org/), by considering how to leverage the ability of clients issuing HTTP GET range requests for a time-series of remote sensing and aerial imagery (instead of just one image). CORE is not a competitor for COG. __License:__ The Cloud Optimized Raster Encoding (CORE) specifications are released to the public domain under a Creative Commons 4.0 CC0 "No Rights Reserved" international license. You can reuse the information contained herein in any way you want, for any purposes and without restrictions. ----------------------- __Summary:__ The Cloud Optimized Raster Encoding (CORE) format is being developed for the efficient storage and management of gridded data by applying video encoding algorithms. It is mainly designed for the exchange and preservation of large time series data in environmental data repositories, while in the same time enabling more efficient workflows on the cloud. It can be applied to any large number of similar (in pixel size and image dimensions) raster data layers. CORE is not designed to replace COG but to work together with COG for a collection of many layers (e.g. by offering a fast preview of layers when switching between layers of a time series). __WARNING__: Currently only applicable to RGB/Byte imagery. The final CORE specifications may probably be very different from what is written herein or CORE may not ever become productive due to a myriad of reasons (see also 'Major issues to be solved'). With this early public sharing of the format we explicitly support the Open Science agenda, which implies __"shifting from the standard practices of publishing research results in scientific publications towards sharing and using all available knowledge at an earlier stage in the research process"__ (quote from: European Commission, Directorate General for Research and Innovation, 2016. Open innovation, open science, open to the world). __CORE Specifications:__ 1) a MP4 or WebM video digital multimedia container format (or any future video container playable as HTML video in major browsers) 2) a free to use or open video compression codec such as H.264, VP9, or AV1 (or any future video codec that is open sourced or free to use for end users) Note: H.264 is currently recommended because of the wide usage with support in all major browsers, fast encoding due to acceleration in hardware (which is currently not the case for AV1 or VP9) and the fact that MPEG LA has allowed the free use for streaming video that is free to the end users. However, please note that H.264 is restricted by patents and its use in proprietary or commercial software requires the payment of royalties to MPEG LA (https://www.mpegla.com/programs/avc-h-264/). However, when AV1 matures and accelerated hardware encoding becomes available, AV1 is expected to offer 30% to 50% smaller file size in comparison with H.264, while retaining the same quality (see https://trac.ffmpeg.org/wiki/Encode/AV1). 3) the encoding frame rate should be of one frame per second (fps) with each layer segmented in internal tiles, similar to COG, ordered by the main use case when accessing the data: either layer contiguous or tile contiguous; Note: The internal tile arrangement should support an easy navigation inside the CORE video format, depending on the use case. 4) a CORE file is optimised for streaming with the moov atom at the beginning of the file (e.g. with -movflags faststart) and optional additional optimisations depending on the codec used (e.g. -tune fastdecode -tune zerolatency for H.264) 5) metadata tags inside the moov atom for describing and using geographic image data (that are preferably compatible with the OGC GeoTIFF standard https://www.ogc.org/standards/geotiff or any future standard accepted by the geospatial community) as well as list of original file names corresponding to each CORE layer 6) it needs to encode similar source rasters (such as time series of rasters with the same extent and resolution, or different tiles of the same product; each input raster should be having the pixel size 7) it provides a mechanism for addressing and requesting overviews (lower resolution data) for a fast display in web browser depending on the map scale (currently external overviews) __Major issues to be solved:__ - Overviews and internal tiles encoding (similar to COG, chaining lower resolution videos in the same MP4 container for fast access to overviews first, maximal number of frames per second for supporting internal tiling inside one video second); for the moment overviews are kept separate. - Metadata encoding (how to best encode spatial extent, layer names, and so on, for each of the layer inside the series, which may have a different geographical extent, etc...; Known issues: adding too many tags with FFmpeg - which are not part of the standard MP4 moov atom - seem to be ignored; metadata tags have a limited string length; ) - Applicability beyond RGB/Byte datasets; defining a standard way of converting cell values from Int16/UInt16/UInt32/Int32/Float32/Float64/ data types into multi-band Byte values (and reconstructing them back to the original data type within acceptable thresholds) __Example__ __Notice__: The provided CORE (.mp4) examples contain modified Copernicus Sentinel data [2018-2021]. For generating the CORE examples provided, 50 original Sentinel 2 (S-2) TCI data images from an area located inside Switzerland were downloaded from www.copernicus.eu, and then transformed into CORE format using ffmpeg with H.264 encoding using the x264 library (https://www.videolan.org/developers/x264.html). For full reproducibility, we provide the original data set and results, as well scripts for data encoding and extraction (see resources).
toThe specification and corresponding reproducible dataset is provided in support of a submitted paper in Geomatics. __DISCLAIMER__: This is an early "public request for comments" (pre-alpha release) version. Interested parties are warmly invited to join development, to comment, discuss, find bugs, etc. __Acknowledgement:__ The CORE format was proudly inspired by the Cloud Optimized GeoTIFF (COG) format (https://www.cogeo.org/), by considering how to leverage the ability of clients issuing HTTP GET range requests for a time-series of remote sensing and aerial imagery (instead of just one image). CORE is not a competitor for COG. __License:__ The Cloud Optimized Raster Encoding (CORE) specifications are released to the public domain under a Creative Commons 4.0 CC0 "No Rights Reserved" international license. You can reuse the information contained herein in any way you want, for any purposes and without restrictions. ----------------------- __Summary:__ The Cloud Optimized Raster Encoding (CORE) format is being developed for the efficient storage and management of gridded data by applying video encoding algorithms. It is mainly designed for the exchange and preservation of large time series data in environmental data repositories, while in the same time enabling more efficient workflows on the cloud. It can be applied to any large number of similar (in pixel size and image dimensions) raster data layers. CORE is not designed to replace COG but to work together with COG for a collection of many layers (e.g. by offering a fast preview of layers when switching between layers of a time series). __WARNING__: Currently only applicable to RGB/Byte imagery. The final CORE specifications may probably be very different from what is written herein or CORE may not ever become productive due to a myriad of reasons (see also 'Major issues to be solved'). With this early public sharing of the format we explicitly support the Open Science agenda, which implies __"shifting from the standard practices of publishing research results in scientific publications towards sharing and using all available knowledge at an earlier stage in the research process"__ (quote from: European Commission, Directorate General for Research and Innovation, 2016. Open innovation, open science, open to the world). __CORE Specifications:__ 1) a MP4 or WebM video digital multimedia container format (or any future video container playable as HTML video in major browsers) 2) a free to use or open video compression codec such as H.264, VP9, or AV1 (or any future video codec that is open sourced or free to use for end users) Note: H.264 is currently recommended because of the wide usage with support in all major browsers, fast encoding due to acceleration in hardware (which is currently not the case for AV1 or VP9) and the fact that MPEG LA has allowed the free use for streaming video that is free to the end users. However, please note that H.264 is restricted by patents and its use in proprietary or commercial software requires the payment of royalties to MPEG LA (https://www.mpegla.com/programs/avc-h-264/). However, when AV1 matures and accelerated hardware encoding becomes available, AV1 is expected to offer 30% to 50% smaller file size in comparison with H.264, while retaining the same quality (see https://trac.ffmpeg.org/wiki/Encode/AV1). 3) the encoding frame rate should be of one frame per second (fps) with each layer segmented in internal tiles, similar to COG, ordered by the main use case when accessing the data: either layer contiguous or tile contiguous; Note: The internal tile arrangement should support an easy navigation inside the CORE video format, depending on the use case. 4) a CORE file is optimised for streaming with the moov atom at the beginning of the file (e.g. with -movflags faststart) and optional additional optimisations depending on the codec used (e.g. -tune fastdecode -tune zerolatency for H.264) 5) metadata tags inside the moov atom for describing and using geographic image data (that are preferably compatible with the OGC GeoTIFF standard https://www.ogc.org/standards/geotiff or any future standard accepted by the geospatial community) as well as list of original file names corresponding to each CORE layer 6) it needs to encode similar source rasters (such as time series of rasters with the same extent and resolution, or different tiles of the same product; each input raster should be having the same image and pixel size) 7) it provides a mechanism for addressing and requesting overviews (lower resolution data) for a fast display in web browser depending on the map scale (currently external overviews) __Major issues to be solved:__ - Overviews and internal tiles encoding (similar to COG, chaining lower resolution videos in the same MP4 container for fast access to overviews first, maximal number of frames per second for supporting internal tiling inside one video second); for the moment overviews are kept separate. - Metadata encoding (how to best encode spatial extent, layer names, and so on, for each of the layer inside the series, which may have a different geographical extent, etc...; Known issues: adding too many tags with FFmpeg - which are not part of the standard MP4 moov atom - seem to be ignored; metadata tags have a limited string length; ) - Applicability beyond RGB/Byte datasets; defining a standard way of converting cell values from Int16/UInt16/UInt32/Int32/Float32/Float64/ data types into multi-band Byte values (and reconstructing them back to the original data type within acceptable thresholds) __Example__ __Notice__: The provided CORE (.mp4) examples contain modified Copernicus Sentinel data [2018-2021]. For generating the CORE examples provided, 50 original Sentinel 2 (S-2) TCI data images from an area located inside Switzerland were downloaded from www.copernicus.eu, and then transformed into CORE format using ffmpeg with H.264 encoding using the x264 library (https://www.videolan.org/developers/x264.html). For full reproducibility, we provide the original data set and results, as well scripts for data encoding and extraction (see resources).