[CF-metadata] some concerns about the "ensemble axis" proposal

Francisco Doblas-Reyes Francisco.Doblas-Reyes at ecmwf.int
Mon Feb 26 07:46:30 MST 2007


Dear Steve and Balaji,

Thanks for bringing back the issue of CF and the ensemble axis.

>    1. The ensemble axis proposal does not solve the general multi-model
>       ensemble problem.
>           * In earlier discussions it was stated that the ensemble axis
>             is intended to address multi-model ensembles.  But the model
>             runs in a multi-model ensemble will in general be on a
>             number of differing grids.  The proposal addresses only the
>             limited sub-case in which all models have been re-gridded to
>             the same grid.  This leaves unanswered how CF can support
>             ensembles on multiple grids.  We should explore the answer
>             to the general question before committing to the specialized
>             solution.
The original proposal concerned multi-forecast system ensembles. This 
includes initial-condition ensembles, perturbed-parameter ensembles and 
multi-models. It is likely that the first two systems have the same grid 
in all the forecasts, because they would be generated by the same model 
version. I wouldn't call these examples a limited sub-case.
Solving the question of how to handle multiple grids in the same file 
before introducing the ensemble dimension would be ideal, but in the 
meantime the dissemination of standard NetCDF files with all sorts of 
ensemble forecasts remains limited.

>    2. A netCDF-style ensemble axis is a marginal model for the
>       underlying problem.
>           * The "ensemble axis" is not an ordered axis.  So when clients
>             are working with models from an ensemble they will often not
>             be accessing contiguous ranges of indices on the axis.
>             NetCDF dimensions are ordered and can only provide direct
>             API support for contiguous ranges on a dimension.   So the
>             ensemble axis proposal will not provide the usual and
>             expected benefits of a netCDF dimension.
I agree that the treatment of the ensemble dimension is far from 
perfect. However, I don't understand why dimensions can only provide 
direct support for contiguous ranges. In a NetCDF file with various 
(deterministic) forecasts, the variables "forecast_reference_time" and 
"forecast_period" can be used in CF to determine the verifying time of 
the forecast. Although these variables are referenced (at least one of 
them with respect to a calendar), their values don't need to be 
contiguous or evenly spaced, as forecasts can also be unequally spaced 
in time.
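
As a minimal sketch of that point (the start dates and lead times below 
are made up for illustration and are not taken from any real file), the 
verifying time is simply the reference time plus the forecast period, 
and nothing in that calculation requires even spacing:

```python
from datetime import datetime, timedelta

# Hypothetical values standing in for the CF variables
# "forecast_reference_time" (start dates) and "forecast_period" (lead times).
forecast_reference_time = [
    datetime(2006, 2, 1),
    datetime(2006, 5, 1),
    datetime(2006, 11, 1),  # unequally spaced: no August start date
]
forecast_period_hours = [24, 72, 240]  # unequally spaced lead times


def verifying_times(reference_times, periods_hours):
    """Return one row per start date and one column per lead time."""
    return [
        [ref + timedelta(hours=h) for h in periods_hours]
        for ref in reference_times
    ]


valid = verifying_times(forecast_reference_time, forecast_period_hours)
```

Here valid[i][j] is the verifying time of start date i at lead time j, 
whatever the spacing of either coordinate.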

>    4. In realistic data management scenarios the ensemble axis will not
>       be a sufficient solution to the problem;  "aggregation servers"
>       will be needed as well.  (And when aggregation servers are
>       introduced into the problem space, there are alternative
>       approaches that should be considered, too.)
We have created an example of an aggregation server that contains 
multi-model ensemble data:
http://ensembles.ecmwf.int/thredds/catalog.html
This is an effort to satisfy the need of the community to access 
forecast data in an efficient way (of course, we can rewrite the files 
once a consensus has been reached, be it a modified version of the 
"ensembles" proposal or the use of netCDF4). Although the files are big, 
the system seems to cope fine with them. Forecasts from additional 
models might be added in the near future, but as they will be identified 
with the variables "source", "institution", "experiment_id" and 
"realization" (some of which are no longer treated as global 
attributes), there shouldn't be any problem in being social. However, this 
dataset has been created with the individual model outputs taking into 
account the other contributors, which, as you point out, won't be the 
general case.
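
To illustrate how members could be picked out with those identifying 
variables, here is a rough sketch. The model and centre names are 
placeholders, and the helper member_indices is hypothetical; it does not 
describe the actual contents of the server above.

```python
# Hypothetical per-member metadata, one entry per index on the ensemble axis.
# The names are placeholders, not real dataset contents.
members = {
    "source":        ["model_a", "model_b", "model_a", "model_b", "model_c"],
    "institution":   ["centre_x", "centre_y", "centre_x", "centre_y", "centre_z"],
    "experiment_id": ["exp1", "exp1", "exp1", "exp1", "exp1"],
    "realization":   [0, 0, 1, 1, 0],
}


def member_indices(table, **criteria):
    """Return the ensemble-axis indices whose metadata match every criterion."""
    n = len(table["realization"])
    return [
        i for i in range(n)
        if all(table[key][i] == value for key, value in criteria.items())
    ]


# All members produced by one model; the indices need not be contiguous.
model_a = member_indices(members, source="model_a")

# The first realization of every model.
first_members = member_indices(members, realization=0)
```

A client would then use the returned indices to subset the data along 
the ensemble dimension.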

>    5. If implemented the proposal will create significant barriers to
>       interoperability.
>           * CF1.0 has created the highest level of model-sharing
>             interoperability that our community has ever seen. 
>             Interoperability is arguably the greatest contribution that
>             CF has made.  (It is for this reason, for example, that ESRI
>             products have begun to support CF.)  Many clients that are
>             currently capable of reading CF 1.0 will not be able to
>             access model outputs that utilize this proposal.  The scope
>             of this problem -- weighing the benefits against the losses
>             -- deserves to be discussed and assessed.
You're right. The use of a fifth dimension prevents GrADS and Ferret 
(among others) from handling the files. However, to my knowledge, the 
inability of those clients to work with a fifth dimension has already 
led users of ensemble forecasts to abandon them and look for 
alternatives such as R, IDL or MATLAB.

>    6. The proposal potentially compromises the future quality of CF
>       because netCDF 4 will offer solutions that model the problem properly.
When is netCDF4 expected to be available to start writing ensemble 
forecast files?

My message is that there is a demand for standardized NetCDF ensemble 
forecast files. Shouldn't the adoption of a CF standard to write those 
files depend on when an alternative, more adequate solution is available?

Best regards,
Paco
-- 
________________________________________

Francisco J. Doblas-Reyes
European Centre for Medium-Range
Weather Forecasting (ECMWF)
Shinfield Park, RG2 9AX
Reading, UK

Tel: +44 (0)118 9499 655
Fax: +44 (0)118 9869 450
f.doblas-reyes at ecmwf.int
_______________________________________

