The main concept of cerbere is that each data acquisition corresponds to a well-known observation or sampling pattern, referred to as a data model or feature. The package provides a set of classes implementing each data model, independently of the way the data are stored on disk or in a database (that is, of the storage format and structure). Storage (or mapper) classes map the content of a file to the corresponding feature (e.g. the content of a grid file to a grid feature).
We assume there is no practical reason why data corresponding to the same sampling pattern (or structure) should be represented differently. Having a predefined template for each feature type makes it possible to write all the usual generic operations only once: display, extraction of values, remapping or resampling, saving to the same format conventions, and so on.
Currently managed features include :
cerbere clearly separates the content from the format:
- content typing: data structure or memory representation of the data, which should be unique for each feature, described in more detail in Data Model. This corresponds to the
- storage format: the way the above structure is stored/mapped on file, which can be very different and exotic, described in more detail in Storage. This corresponds to the
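The separation between content and format can be illustrated with a minimal, self-contained sketch. None of the classes below belong to cerbere: `ToyMapper`, `CsvTextMapper`, `DictMapper` and `ToyTrajectory` are hypothetical names used only to show how one feature class can be filled from two different storage formats through a common interface.

```python
from abc import ABC, abstractmethod


class ToyMapper(ABC):
    """Hypothetical storage interface: one subclass per storage format."""

    @abstractmethod
    def read_values(self, fieldname):
        """Return the values of a field as a list, whatever the format."""


class CsvTextMapper(ToyMapper):
    """Reads 'name,v1,v2,...' lines from a text payload."""

    def __init__(self, text):
        self._rows = {}
        for line in text.strip().splitlines():
            name, *values = line.split(",")
            self._rows[name] = [float(v) for v in values]

    def read_values(self, fieldname):
        return self._rows[fieldname]


class DictMapper(ToyMapper):
    """Reads values already held in a dictionary."""

    def __init__(self, content):
        self._content = content

    def read_values(self, fieldname):
        return list(self._content[fieldname])


class ToyTrajectory:
    """Hypothetical feature: an ordered sequence of (lat, lon) positions."""

    def __init__(self):
        self.lat = []
        self.lon = []

    def load(self, mapper):
        # The feature only talks to the mapper interface, never to the format.
        self.lat = mapper.read_values("lat")
        self.lon = mapper.read_values("lon")

    def positions(self):
        return list(zip(self.lat, self.lon))


# Same content, two storage formats, one in-memory representation.
from_csv = ToyTrajectory()
from_csv.load(CsvTextMapper("lat,10.0,10.5\nlon,-3.0,-2.5"))

from_dict = ToyTrajectory()
from_dict.load(DictMapper({"lat": [10.0, 10.5], "lon": [-3.0, -2.5]}))
```

Generic operations written against `ToyTrajectory` (here `positions`) then work unchanged regardless of which mapper supplied the data.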
Using the mapper package
The following section describes the basic operations you can perform with
cerbere to handle Earth Observation data files with the
mapper package. This package can be seen as a unified API to
access the content of any data file, whatever its format. There is one
mapper class per data format.
Reading data from a file
If no mapper class exists for a particular format, a new corresponding
mapper class must be written.
The complete list of existing mappers, together with their compatibility with known datasets, is given in Built-in mappers.
To read data from a file, first instantiate a mapper object of the corresponding class, specifying the path to this file in the url argument:
>>> import mapper.ncfile
>>> ncf = mapper.ncfile.NCFile(url="./test/GW_L2P_ALT_ENVI_GDR_20101210_120905_20101210_125912_097_196.nc")
This does not open the file. The file must be explicitly opened with
A mapper provides a set of methods to inspect the content of a file. They let you retrieve information from a file in the same way, whatever its format.
Get the list of fields in a file (all but the geolocation fields) with get_fieldnames:
Get the dimensions (like in netCDF) of a file (Note that the geolocation dimension names returned are standardized):
Get the dimensions (like in netCDF) of a particular field:
Get the size of a dimension (standard names can be used for geolocation dimensions):
f.get_dimsize('row')  # standard dimension name
f.get_dimsize('ni')   # equivalent native name
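To make the inspection calls above concrete without needing a data file, here is a small in-memory stand-in. It is only a sketch: `ToyGridMapper` is a hypothetical class, not part of cerbere, the method name `get_dimensions` and the `nj`/`cell` dimension pair are assumptions, and only `get_fieldnames`, `get_dimsize` and the `row`/`ni` equivalence come from the text above.

```python
class ToyGridMapper:
    """Hypothetical in-memory stand-in for a mapper; not a cerbere class."""

    # Native dimension name -> standardized geolocation name.
    # 'ni' -> 'row' follows the text; 'nj' -> 'cell' is an assumption.
    STANDARD_DIMS = {"ni": "row", "nj": "cell"}
    GEOLOCATION_FIELDS = ("lat", "lon", "time")

    def __init__(self):
        # Dimension sizes and per-field dimensions, using native names.
        self._dims = {"ni": 3, "nj": 4}
        self._fields = {
            "lat": ("ni", "nj"),
            "lon": ("ni", "nj"),
            "time": ("ni",),
            "sea_surface_temperature": ("ni", "nj"),
        }

    def _standard(self, name):
        return self.STANDARD_DIMS.get(name, name)

    def get_fieldnames(self):
        """All field names except the geolocation ones."""
        return [n for n in self._fields if n not in self.GEOLOCATION_FIELDS]

    def get_dimensions(self, fieldname=None):
        """Standardized dimension names of the file, or of one field."""
        native = self._dims if fieldname is None else self._fields[fieldname]
        return tuple(self._standard(n) for n in native)

    def get_dimsize(self, dimname):
        """Size of a dimension, given its standard or native name."""
        for native, size in self._dims.items():
            if dimname in (native, self._standard(native)):
                return size
        raise KeyError(dimname)


f = ToyGridMapper()
fieldnames = f.get_fieldnames()
file_dims = f.get_dimensions()
time_dims = f.get_dimensions("time")
```

With this stand-in, `f.get_dimsize('row')` and `f.get_dimsize('ni')` return the same size, which is the behaviour the two calls above rely on.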
Get a field and display it:
field = f.read_field('sea_surface_temperature')
print(field)
Fields are similar to variables in netCDF. A field consists of:
- an attached variable describing the geophysical quantity provided by the field (together with a few descriptive attributes such as standard name, etc.)
- attributes further documenting the provided observation values (units, ...), similar to the variable attributes in netCDF
- an array of values (observations)
- an optional array of quality flags (one for each observation value)
- an optional array of quality history (one for each observation value) documenting the reason why a value was flagged
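The components listed above can be pictured as a small data structure. The sketch below is not cerbere's own field class: `ToyVariable` and `ToyField`, and all their attribute names, are hypothetical and only mirror the list (variable, attributes, values, optional quality flags, optional quality history).

```python
from dataclasses import dataclass, field as dc_field
from typing import Dict, List, Optional


@dataclass
class ToyVariable:
    """Hypothetical description of the geophysical quantity."""
    shortname: str
    standardname: Optional[str] = None


@dataclass
class ToyField:
    """Hypothetical field: a variable, its values and optional quality data."""
    variable: ToyVariable
    values: List[float]
    attributes: Dict[str, str] = dc_field(default_factory=dict)
    quality_flags: Optional[List[int]] = None
    quality_history: Optional[List[str]] = None

    def __post_init__(self):
        # Flags and history, when present, hold one entry per observation.
        for name in ("quality_flags", "quality_history"):
            extra = getattr(self, name)
            if extra is not None and len(extra) != len(self.values):
                raise ValueError(f"{name} must match the number of values")


sst = ToyField(
    variable=ToyVariable("sst", "sea_surface_temperature"),
    values=[285.1, 285.4, 1.0e20],
    attributes={"units": "kelvin"},
    quality_flags=[0, 0, 1],
    quality_history=["", "", "fill value"],
)
```

Checking the lengths at construction time is a design choice of this sketch; it simply enforces the "one for each observation value" rule stated in the list.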
Get the list of field attributes, as a dictionary:
attr = f.read_field_attributes('sea_surface_temperature')
print(attr)
Get the list of global attributes:
attr = f.read_global_attributes()
print(attr)
Then instantiate an object of a data model class and load the above mapper object into it; the data model object is then initialized with the content of the file:
>>> import datamodel.trajectory
>>> traj = datamodel.trajectory.Trajectory()
>>> traj.load(ncf)