===============
Format profiles
===============

:mod:`cerbere` was designed to ease data management tasks. It can be used to
convert data files from one format to another, or to reformat data to another
formatting convention matching the requirements of a specific project, with
minimal coding effort. For some examples of such operations, refer also to
:ref:`format_profile`.

When saving the content of a :mod:`~cerbere.dataset` class object, the output
file is formatted following the default settings and conventions implemented in
the :func:`~cerbere.dataset.dataset.Dataset.save` method of this class.

The format can also be refined and customized through an external format
profile file that can be passed when saving a dataset. It provides the
directives to properly format a dataset, using some convention or default
settings. In particular, it can define:

* the list of global metadata attributes (and default value)
* the list of field metadata attributes (and default value) such as units,
  standard name, comment, description, reference,...
* the encoding parameters used when writing the data on disk, such as (for a
  NetCDF writer) scale factor, add offset, number of significant digits,
  compression,...

For instance, let's format data to the GHRSST format (as defined in the GDS 2.1
document). We define these requirements in a profile file as follows:

.. code-block:: yaml

    ---
    # Defines the list and default values of the global attributes of a Cerbere new feature
    attributes:
      # Description
      id:
      naming_authority: org.ghrsst
      title:
      summary:
      cdm_data_type:
      keywords: Oceans > Ocean Temperature > Sea Surface Temperature
      acknowledgement: "Please acknowledge the use of these data with the following statement: these data were produced by the Centre de Recherche et d'Exploitation Satellitaire (CERSAT), at IFREMER, Plouzane (France)"
      processing_level:
      metadata_link:
      comment:
      file_quality_level:

      # Observation
      platform:
      platform_type:
      instrument:
      instrument_type:
      band:

      # Conventions
      Conventions: CF 1.7, ACDD 1.3, ISO 8601
      Metadata_Conventions: Climate and Forecast (CF) 1.7, Attribute Convention for Data Discovery (ACDD) 1.3
      standard_name_vocabulary: NetCDF Climate and Forecast (CF) Metadata Convention
      keywords_vocabulary: NASA Global Change Master Directory (GCMD) Science Keywords
      format_version: GDSv1.2
      gds_version_id:
      platform_vocabulary: CEOS mission table
      instrument_vocabulary: CEOS instrument table

      # Authorship
      institution: Institut Francais de Recherche et d'Exploitation de la Mer (Ifremer) Centre de Recherche et d'Exploitation satellitaire (CERSAT)
      institution_abbreviation: Ifremer/CERSAT
      project: Group for High Resolution Sea Surface Temperature (GHRSST)
      program: CMEMS
      license: GHRSST protocol describes data use as free and open.
      publisher_name: CERSAT
      publisher_url: http://cersat.ifremer.fr
      publisher_email: cersat@ifremer.fr
      publisher_institution: Ifremer
      publisher_type: institution
      creator_name: CERSAT
      creator_url: http://cersat.ifremer.fr
      creator_email: cersat@ifremer.fr
      creator_type: institution
      creator_institution: Ifremer
      contributor_name:
      contributor_role:
      references:

      # Traceability
      processing_software: Telemachus 1.0
      product_version: 3.0
      netcdf_version_id:
      uuid:
      history:
      source:
      source_version:
      date_created:
      date_modified:
      date_issued:
      date_metadata_modified:

      # BBox
      geospatial_lat_min:
      geospatial_lat_max:
      geospatial_lat_units: degrees
      geospatial_lon_min:
      geospatial_lon_max:
      geospatial_lon_units: degrees
      geospatial_bounds:
      geospatial_bounds_crs: WGS84

      # Resolution
      spatial_resolution:
      geospatial_lat_resolution:
      geospatial_lon_resolution:

      # Temporal
      time_coverage_start:
      time_coverage_end:
      time_coverage_resolution:

    fields:
      lat:
        standard_name: latitude
        units: degrees_north
        valid_range: -90, 90
        comment: geographical coordinates, WGS84 projection
        coordinates: lon lat
      lon:
        standard_name: longitude
        units: degrees_east
        valid_range: -180., 180
        comment: geographical coordinates, WGS84 projection
      time:
        long_name: reference time of sst file
        standard_name: time
      sea_surface_temperature:
        long_name: sea surface foundation temperature
        standard_name: sea_surface_foundation_temperature
        units: kelvin
        valid_range: -2., 50.
      sst_dtime:
        long_name: time difference from reference time
        units: seconds
        valid_range: -86400, 86400
        comment: time plus sst_dtime gives each measurement time
      solar_zenith_angle:
        long_name: solar zenith angle
        units: angular_degree
        valid_range: 0, 180
        comment: the solar zenith angle at the time of the SST observations
      sses_bias:
        long_name: SSES bias estimate
        units: kelvin
        valid_range: -2.54, 2.54
        comment: Bias estimate derived using the techniques described at http://www.ghrsst.org/SSES-Description-of-schemes.html
      sses_standard_deviation:
        long_name: SSES standard deviation
        valid_range: 0., 2.54
        comment: Standard deviation estimate derived using the techniques described at http://www.ghrsst.org/SSES-Description-of-schemes.html
      quality_level:
        long_name: quality level of SST pixel
        valid_range: 0, 5
        flag_meanings: no_data bad_data worst_quality low_quality acceptable_quality best_quality
        flag_values: 0, 1, 2, 3, 4, 5
        comment: These are the overall quality indicators and are used for all GHRSST SSTs
      or_latitude:
        units: degrees_north
        valid_range: -80., 80
        long_name: original latitude of the SST value
        standard_name: latitude
      or_longitude:
        units: degrees_east
        valid_range: -180., 180.
        long_name: original longitude of the SST value
        standard_name: longitude
      or_number_of_pixels:
        long_name: original number of pixels from the L2Ps contributing to the SST value
        valid_range: -32767, 32767
      satellite_zenith_angle:
        long_name: satellite zenith angle
        units: angular_degree
        comment: the satellite zenith angle at the time of the SST observations
        valid_min: 0
        valid_max: 90
      adjusted_sea_surface_temperature:
        long_name: adjusted collated sea surface temperature
        standard_name: sea_surface_subskin_temperature
        units: kelvin
        comment: bias correction using a multi-sensor reference field
        valid_min: -300
        valid_max: 4500

    encoding:
      lat:
        dtype: float32
        least_significant_digit: 3
      lon:
        dtype: float32
        least_significant_digit: 3
      sea_surface_temperature:
        dtype: int16
        _FillValue: -32768
        scale_factor: 0.01
        add_offset: 273.15
      sst_dtime:
        _FillValue: -2147483648
        add_offset: 0
        scale_factor: 1
        dtype: int32
      solar_zenith_angle:
        _FillValue: -128
        add_offset: 90.
        scale_factor: 1.
      quality_level:
        _FillValue: -128
        dtype: byte
      sses_bias:
        _FillValue: -128
        dtype: byte
        add_offset: 0.
        scale_factor: 0.02
      sses_standard_deviation:
        _FillValue: -128
        dtype: byte
        add_offset: 2.54
        scale_factor: 0.02
      or_latitude:
        dtype: int16
        _FillValue: -32768
        add_offset: 0.
        scale_factor: 0.01
        units: degrees_north
      or_longitude:
        dtype: int16
        _FillValue: -32768
        add_offset: 0.
        scale_factor: 0.01
      or_number_of_pixels:
        dtype: byte
        _FillValue: -32768
        add_offset: 0
        scale_factor: 1
      satellite_zenith_angle:
        dtype: byte
        _FillValue: -128
        add_offset: 0.
        scale_factor: 1.
      adjusted_sea_surface_temperature:
        dtype: int16
        _FillValue: -32768
        add_offset: 273.15
        scale_factor: 0.01

This profile file can be passed to the NetCDF dedicated dataset object,
provided by the :class:`cerbere.dataset.ncdataset.NCDataset` class, when saving
the object content to disk:

.. code-block:: python

    # create a dataset object
    from cerbere.dataset.ncdataset import NCDataset
    dst = NCDataset()

    # save it in a NetCDF file, using above profile and NCDataset class
    dst.save('test.nc')

Note that the attributes already defined in the dataset object are not
overridden by the default values in the profile file. Attributes not defined in
the dataset or feature object to be saved will fall back to their default value
defined in the format profile.

.. code-block:: python

    # create a NCDataset dataset object and fill in some attributes

    # save it, using above profile
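
The fallback rule described above can be illustrated with plain Python
dictionaries. This is only a sketch of the rule, not the code used by
:mod:`cerbere`: the helper ``apply_profile_defaults`` and the dataset attribute
values are hypothetical, and only the profile defaults are taken from the
example profile.

.. code-block:: python

    def apply_profile_defaults(dataset_attrs, profile_attrs):
        """Return the attributes to write (hypothetical helper, not cerbere API).

        Values already set on the dataset win over the profile defaults.
        """
        merged = dict(profile_attrs)   # start from the profile defaults
        merged.update(dataset_attrs)   # dataset values take precedence
        return merged

    # a few defaults taken from the example profile
    profile_attrs = {
        'naming_authority': 'org.ghrsst',
        'publisher_name': 'CERSAT',
        'title': None,  # listed in the profile without a default value
    }

    # attributes already set on the dataset (made-up values)
    dataset_attrs = {'title': 'my SST product', 'publisher_name': 'My Lab'}

    attrs = apply_profile_defaults(dataset_attrs, profile_attrs)

Here ``title`` and ``publisher_name`` keep the values set on the dataset, while
``naming_authority`` falls back to the profile default.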