Writing a mapper

This section describes how to write a mapper that gives read access to a new format. If you also want to use this mapper to save data, see the complementary information at the end of the section.

Creating a new mapper module

Writing a mapper consists of writing a set of methods that help cerbere understand and access the content of a file. The mapper must implement the interface defined by the :class:AbstractMapper class, and therefore inherit from it.

Create a new module with the following basic structure:

"""
.. module::cerbere.mapper.<your mapper module name>

Mapper class for <the format and/or product type handled by this mapper>

:license: Released under GPL v3 license, see :ref:`license`.

.. sectionauthor:: <your name>
.. codeauthor:: <your name>
"""

# import parent class
from cerbere.mapper import abstractmapper


class <Your mapper class name>(abstractmapper.AbstractMapper):
    """Mapper class to read <the format and/or product type handled by this mapper> files"""

    def __init__(self, url=None, mode=abstractmapper.READ_ONLY, **kwargs):
        """Initialize a <the format and/or product type handled by this mapper> file mapper"""
        super(<Your mapper class name>, self).__init__(url=url, mode=mode, **kwargs)
        return

The following methods of the :class:AbstractMapper class have to be overridden:
  • __init__
  • open
  • close
  • get_matching_dimname
  • get_standard_dimname
  • get_geolocation_field
  • get_fieldnames
  • get_dimensions
  • get_dimsize
  • read_field
  • read_values
  • read_field_attributes
  • read_global_attributes
  • read_global_attribute
  • get_bbox
  • get_spatial_resolution_in_deg
  • get_start_time
  • get_end_time
  • read_fillvalue

open()

The open() method basically opens a file and returns the handler to this file. The following operations must be performed:

  • open the file (usually calling some file opening function)
  • initialize the _handler attribute with the file handler

def open(self, view=None, datamodel=None, datamodel_geolocation_dims=None):
    """Open the file (or any other type of storage)

    Args:
        view (dict, optional): a dictionary where keys are dimension names
            and values are slices. A view can be set on a file, meaning
            that only the subset defined by this view will be accessible.
            This view is expressed as any subset (see :func:`get_values`).
            For example:

            view = {'time':slice(0,0), 'lat':slice(200,300),
            'lon':slice(200,300)}

        datamodel (str): type of feature read or written. Internal argument
            only used by the classes from :mod:`~cerbere.datamodel`
            package. Can be 'Grid', 'Swath', etc...

        datamodel_geolocation_dims (list, optional): list of the name of the
            geolocation dimensions defining the data model to be read in
            the file. Optional argument, only used by the datamodel
            classes, in case the mapper class can store different types of
            data models.

    Returns:
        a handler on the opened file
    """

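The two steps listed above can be sketched as follows. This is a minimal, self-contained illustration that does not import cerbere: the ToyMapper class, its _url and _view attributes, and the use of the built-in open() on a text file are assumptions made for the example. Only the _handler attribute and the signature of open() come from this section.

```python
import os
import tempfile


class ToyMapper(object):
    """Hypothetical stand-in for a mapper class (not the cerbere API)."""

    def __init__(self, url=None):
        self._url = url        # path of the file to map (assumed attribute)
        self._handler = None   # file handler, set by open()
        self._view = None      # optional subset kept for later reads (assumed)

    def open(self, view=None, datamodel=None, datamodel_geolocation_dims=None):
        # 1. open the file with whatever function the format requires
        handler = open(self._url, 'r')
        # 2. keep the handler so that the other methods can use it
        self._handler = handler
        self._view = view
        return handler

    def close(self):
        # release the handler if the file was opened
        if self._handler is not None:
            self._handler.close()
            self._handler = None


# usage: write a small temporary file, then open, read and close it
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('dummy content')
    tmp_path = tmp.name

mapper = ToyMapper(url=tmp_path)
handler = mapper.open(view={'lat': slice(200, 300)})
content = handler.read()
mapper.close()
os.remove(tmp_path)
```

In a real mapper the opening call would be the one provided by the library that reads the format, and the view would be applied when reading values.
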
close()

def close(self):
    """Close handler on storage"""
    raise NotImplementedError

get_matching_dimname()

This method returns the equivalent name in the file format of a standard dimension name (‘x’, ‘y’, ‘z’, ‘lat’, ‘lon’, ‘time’, ‘row’, ‘cell’, ...).

This applies only to self-described formats (such as NetCDF or HDF). For files that are not self-described, return the standard name itself.

For an example of a self-described format, see:

For an example of a binary format, see:

def get_matching_dimname(self, dimname):
    """Return the equivalent name in the native format for a standard
    dimension.

    This is a translation of the standard names to native ones. It is used
    for internal purpose only and should not be called directly.

    The standard dimension names are:

    * x, y, time for :class:`~cerbere.datamodel.grid.Grid`
    * row, cell, time for :class:`~cerbere.datamodel.swath.Swath` or
      :class:`~cerbere.datamodel.image.Image`

    To be derived when creating an inherited data mapper class. This is
    mandatory for geolocation dimensions which must be standard.

    Args:
        dimname (str): standard dimension name.

    Returns:
        str: the native name for the dimension. Return `dimname` if
            the input dimension has no native equivalent.

    See Also:
        see :func:`get_standard_dimname` for the reverse operation
    """

get_standard_dimname()

This method returns the equivalent standard name of a dimension in the file format. This is the reverse operation of get_matching_dimname().

This applies only to self-described formats (such as NetCDF or HDF). For files that are not self-described, return the standard name itself.

For an example of a self-described format, see:

For an example of a binary format, see:

def get_standard_dimname(self, dimname):
    """
    Returns the equivalent standard dimension name for a
    dimension in the native format.

    This is a translation of the native names to standard ones. It is used
    for internal purpose and should not be called directly.

    To be derived when creating an inherited data mapper class. This is
    mandatory for geolocation dimensions which must be standard.

    Args:
        dimname (string): native dimension name

    Return:
        str: the (translated) standard name for the dimension. Return
        `dimname` if the input dimension has no standard name.

    See Also:
        see :func:`get_matching_dimname` for the reverse operation
    """

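Both translations can be backed by a single lookup table, which keeps get_matching_dimname() and get_standard_dimname() consistent with each other. The sketch below is self-contained and does not import cerbere. The native names 'nj', 'ni' and 'nt' and the two tables are hypothetical, and the functions are written without self for brevity; in a mapper they would be methods.

```python
# standard name -> native name (hypothetical native names for a swath file)
NATIVE_DIMS = {'row': 'nj', 'cell': 'ni', 'time': 'nt'}
# native name -> standard name, derived from the table above
STANDARD_DIMS = {native: std for std, native in NATIVE_DIMS.items()}


def get_matching_dimname(dimname):
    """Standard to native; names without a match are returned unchanged."""
    return NATIVE_DIMS.get(dimname, dimname)


def get_standard_dimname(dimname):
    """Native to standard; names without a match are returned unchanged."""
    return STANDARD_DIMS.get(dimname, dimname)


native = get_matching_dimname('row')
roundtrip = get_standard_dimname(native)
untouched = get_standard_dimname('channel')
```

Deriving the reverse table from the forward one avoids the two methods drifting apart when a dimension is added.
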
get_geolocation_field()

This method returns the name, in the file format, of the field matching a standard geolocation (or coordinate) field (lat, lon, time, z).

This applies only to self-described formats (such as NetCDF or HDF). For files that are not self-described, return the standard name itself.

For an example of a self-described format, see:

For an example of a binary format, see:

def get_geolocation_field(self, fieldname):
    """Return the equivalent field name in the file format for a standard
    geolocation field (lat, lon, time, z).

    Used for internal purpose and should not be called directly.

    Args:
        fieldname (str): name of the standard geolocation field (lat, lon,
            time or z)

    Return:
        str: name of the corresponding field in the native file format.
            Returns None if no matching is found
    """

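A possible implementation is a plain dictionary lookup that yields None when the file has no matching field. The sketch below is self-contained; the native names 'latitude' and 'longitude' and the GEOLOCATION_FIELDS table are hypothetical, and the function would be a method in a real mapper.

```python
# standard geolocation field -> native field name (hypothetical native names)
GEOLOCATION_FIELDS = {'lat': 'latitude', 'lon': 'longitude', 'time': 'time'}


def get_geolocation_field(fieldname):
    """Return the native name of a standard geolocation field, or None."""
    return GEOLOCATION_FIELDS.get(fieldname)


lat_name = get_geolocation_field('lat')
z_name = get_geolocation_field('z')  # this toy file has no vertical coordinate
```
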
get_fieldnames()

This method returns the list of fields in the file. The coordinate fields are excluded from this list.

def get_fieldnames(self):
    """Returns the list of geophysical fields stored for the feature.

    The geolocation field names are excluded from this list.

    Returns:
        list<string>: list of field names
    """

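Excluding the coordinate fields usually amounts to filtering the variables found in the file. The following self-contained sketch assumes a hypothetical list of variable names and a hypothetical set of native geolocation names; neither comes from cerbere.

```python
# names of all the variables found in the file (hypothetical content)
ALL_VARIABLES = ['latitude', 'longitude', 'time',
                 'sea_surface_temperature', 'wind_speed']
# native names of the geolocation (coordinate) fields (hypothetical)
GEOLOCATION_VARIABLES = {'latitude', 'longitude', 'time'}


def get_fieldnames():
    """Return the variables of the file, geolocation fields excluded."""
    return [name for name in ALL_VARIABLES
            if name not in GEOLOCATION_VARIABLES]


fieldnames = get_fieldnames()
```
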
get_dimensions()

This method returns the dimensions of a particular field (if a field name is passed as argument) or of the whole file, as a tuple. The dimension names must be returned using the standard names.

def get_dimensions(self, fieldname=None):
    """Return the dimension's standard names of a file or a field in the
    file.

    Args:
        fieldname (str): the name of the field from which to get the
            dimensions. For a geolocation field, use the cerbere standard
            name (time, lat, lon), though native field name will work too.

    Returns:
        tuple<str>: the standard dimensions of the field or file.
    """

get_dimsize()

Returns the length of a dimension as an integer.

def get_dimsize(self, dimname):
    """Return the size of a dimension.

    Args:
        dimname (str): name of the dimension.

    Returns:
        int: size of the dimension.
    """

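get_dimensions() and get_dimsize() can share the same description of the file structure. The sketch below is self-contained and does not import cerbere; the tables, the native names 'nt', 'nj' and 'ni', and the choice to accept both standard and native names in get_dimsize() are assumptions made for the example.

```python
from collections import OrderedDict

# native dimension name -> size, in file order (hypothetical content)
FILE_DIMS = OrderedDict([('nt', 1), ('nj', 400), ('ni', 600)])
# native field name -> native dimension names (hypothetical content)
FIELD_DIMS = {
    'sea_surface_temperature': ('nt', 'nj', 'ni'),
    'latitude': ('nj', 'ni'),
}
# native -> standard dimension names for a grid
STANDARD_DIMS = {'nt': 'time', 'nj': 'y', 'ni': 'x'}


def get_dimensions(fieldname=None):
    """Standard dimension names of a field, or of the file if no field is given."""
    native = FILE_DIMS.keys() if fieldname is None else FIELD_DIMS[fieldname]
    return tuple(STANDARD_DIMS.get(dim, dim) for dim in native)


def get_dimsize(dimname):
    """Size of a dimension given by its standard or native name."""
    to_native = {std: native for native, std in STANDARD_DIMS.items()}
    return FILE_DIMS[to_native.get(dimname, dimname)]
```

With these tables, get_dimensions('latitude') yields the standard names of the two spatial dimensions, and get_dimsize() gives the same answer for 'x' and for 'ni'.
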
read_field()

This method returns the description (metadata) of a field as a Field object.

def read_field(self, fieldname):
    """
    Return the :class:`cerbere.field.Field` object corresponding to
    the requested fieldname.

    The :class:`cerbere.field.Field` class contains all the metadata
    describing a field (equivalent to a variable in netCDF).

    Args:
        fieldname (str): name of the field

    Returns:
        :class:`cerbere.field.Field`: the corresponding field object
    """

read_field_attributes()

def read_field_attributes(self, fieldname):
    """Return the specific attributes of a field.

    Args:
        fieldname (str): name of the field.

    Returns:
        dict<string, string or number or datetime>: a dictionary where keys
            are the attribute names.
    """
    return {}

read_fillvalue()

def read_fillvalue(self, fieldname):
    """Read the fill value of a field.

    Args:
        fieldname (str): name of the field.

    Returns:
        number or char or str: fill value of the field. The type is the
            same as the type of the data in the field.
    """

read_global_attributes()

def read_global_attributes(self):
    """Returns the names of the global attributes.

    Returns:
        list<str>: the list of the attribute names.
    """
    return []

read_global_attribute()

def read_global_attribute(self, name):
    """Returns the value of a global attribute.

    Args:
        name (str): name of the global attribute.

    Returns:
        str, number or datetime: value of the corresponding attribute.
    """

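The two methods split the work: read_global_attributes() lists the attribute names, and read_global_attribute() returns one value. A minimal self-contained sketch, assuming the attributes have already been loaded into a hypothetical GLOBAL_ATTRIBUTES dictionary with made-up values:

```python
# global attributes as they could be read from a file header (hypothetical)
GLOBAL_ATTRIBUTES = {
    'title': 'toy product',
    'platform': 'toysat',
    'processing_level': 'L2',
}


def read_global_attributes():
    """Return the names of the global attributes."""
    return list(GLOBAL_ATTRIBUTES.keys())


def read_global_attribute(name):
    """Return the value of one global attribute."""
    return GLOBAL_ATTRIBUTES[name]
```
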
read_values()

This method reads the data of a field. It is the trickiest method to override because of cerbere's slice and view system.

def read_values(self, fieldname, slices=None, **kwargs):
    """Read the data of a field.

    Args:
        fieldname (str): name of the field to read the data from

        slices (list of slice, optional): list of slices for the field if
            subsetting is requested. A slice must then be provided for each
            field dimension. The slices are relative to the opened view (see
            :func:`open`) if a view was set when opening the file.

    Return:
        MaskedArray: array of data read. Array type is the same as the
            storage type.
    """

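The difficulty lies in converting slices that are relative to the view into slices on the full array stored in the file. One way to do this, sketched below under stated assumptions, is to let Python 3 range objects do the arithmetic. The absolute_slices helper is hypothetical and not part of cerbere; it takes the view as a list of slices (one per dimension) rather than as a dictionary, and it only handles positive steps.

```python
def absolute_slices(dimsizes, view_slices, slices=None):
    """Combine the slices of the opened view with slices relative to it.

    dimsizes: full size of each dimension of the field in the file
    view_slices: one slice per dimension (slice(None) where no view is set)
    slices: one slice per dimension, relative to the view (None reads the
        whole view)
    """
    if slices is None:
        slices = [slice(None)] * len(dimsizes)
    result = []
    for size, view, rel in zip(dimsizes, view_slices, slices):
        # indices selected in the file: apply the view first, then the
        # relative slice on what remains
        selected = range(size)[view][rel]
        result.append(slice(selected.start, selected.stop, selected.step))
    return result


# a 1000 x 1000 field opened with a view on rows and columns 200 to 299
view = [slice(200, 300), slice(200, 300)]
# first ten rows and every second column of the view
absolute = absolute_slices([1000, 1000], view,
                           [slice(0, 10), slice(None, None, 2)])
whole_view = absolute_slices([1000, 1000], view)
```

The resulting slices can then be applied to the native array before wrapping the result in a masked array.
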
get_start_time()

def get_start_time(self):
    """Returns the minimum date of the file temporal coverage.

    Returns:
        datetime: start time of the data in file.
    """

get_end_time()

def get_end_time(self):
    """Returns the maximum date of the file temporal coverage.

    Returns:
        datetime: end time of the data in file.
    """

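When the temporal coverage is stored in the file metadata, both methods reduce to parsing a date string. The sketch below is self-contained; the attribute names 'time_coverage_start' and 'time_coverage_end', their values and the date format are assumptions, and a real file may encode its coverage differently.

```python
from datetime import datetime

# hypothetical global attributes holding the temporal coverage
ATTRS = {
    'time_coverage_start': '20100101T000000',
    'time_coverage_end': '20100101T235959',
}
TIME_FORMAT = '%Y%m%dT%H%M%S'


def get_start_time():
    """Return the start of the temporal coverage as a datetime."""
    return datetime.strptime(ATTRS['time_coverage_start'], TIME_FORMAT)


def get_end_time():
    """Return the end of the temporal coverage as a datetime."""
    return datetime.strptime(ATTRS['time_coverage_end'], TIME_FORMAT)
```
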
get_bbox()

def get_bbox(self):
    """Returns the bounding box of the feature, as a tuple.

    Returns:
        tuple: bbox expressed as (lonmin, latmin, lonmax, latmax)
    """

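If the file does not store its spatial extent, the bounding box can be derived from the coordinates. The helper below is a hypothetical, self-contained sketch that works on flat sequences of longitudes and latitudes; it is deliberately naive and does not handle features crossing the 180 degree meridian.

```python
def bbox_from_coordinates(lons, lats):
    """Return (lonmin, latmin, lonmax, latmax) from flat coordinate sequences."""
    return (min(lons), min(lats), max(lons), max(lats))


# three points off the coast of Brittany (made-up values)
bbox = bbox_from_coordinates([-5.0, -4.5, -4.0], [47.0, 48.5, 48.0])
```
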
Saving data with a mapper

If the file can also be opened in write mode, the following methods have to be overridden as well:

  • create_dim
  • create_field
  • write_field
  • write_global_attributes

If you don’t plan to write files in this format, just copy the following block into your class source code:

def create_field(self, field, dim_translation=None):
    """Creates a new field in the mapper.

    Creates the field structure but does not write its values array yet.

    Args:
        field (Field): the field to be created.

    See also:
        :func:`write_field` for writing the values array.
    """
    raise NotImplementedError

def create_dim(self, dimname, size=None):
    """Add a new dimension.

    Args:
        dimname (str): name of the dimension.
        size (int): size of the dimension (unlimited if None)
    """
    raise NotImplementedError

def write_field(self, fieldname):
    """Writes the field data on disk.

    Args:
        fieldname (str): name of the field to write.
    """
    raise NotImplementedError

def write_global_attributes(self, attrs):
    """Write the global attributes of the file.

    Args:
        attrs (dict<string, string or number or datetime>): a dictionary
            containing the attributes names and values to be written.
    """
    raise NotImplementedError