[CF-metadata] some concerns about the "ensemble axis" proposal

Jennifer Adams jma at cola.iges.org
Wed Mar 7 12:55:32 MST 2007


This discussion is getting juicy!

I am the GrADS and GDS developer working on an interface for 5- 
dimensional data sets. Ensembles are one example of how the 5th  
dimension might be used, but there are others (e.g. EOFs), so we are  
trying to make it as general as possible while still being practical  
and usable. GrADS is written in C and handles data in a variety of  
formats. Data file aggregation over time, and now over the "e"  
dimension, is possible but not required.

Currently, we are not building an interface for multi-model ensembles  
on different grids. The elephant in Steve's living room will not be  
allowed to play in our yard. Fast and easy interpolation between data  
sets on different grids was omitted from GrADS by design and that is  
not likely to change with the addition of a new grid dimension. If  
users want to lump data sets on different grids together, they must  
handle the interpolation explicitly in a way that is best suited to  
their needs and in a way that they know will best preserve the  
information in the data they wish to extract.

Ensembles that are on the same grid will be handled by GrADS. For  
metadata, we are taking a minimalist approach -- the ensemble axis is  
linear, and members have a unique name (<16 characters) and are  
numbered from 1 to n. We don't require that all members have the same  
start time or length, so those pieces of metadata are also required.  
This information is generally provided in a data descriptor file, an  
external metadata source written by the user after poring over the  
output from ncdump or wgrib or a similar routine.

If I am handed a single netcdf file with multi-model-different-grid  
ensembles in it from ECMWF or GFDL, I'm going to write a set of  
descriptor files, each one describing the subset of variables on a  
common 5D grid. I'll have one descriptor file per grid, all pointing  
at the same data file. Now I'm set to do my analysis in GrADS,  
beginning with careful interpolation between the different grids.
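
The one-descriptor-per-grid bookkeeping can be sketched as follows. This is a hypothetical Python helper, not part of GrADS. It assumes that variables sharing the same tuple of dimension names share a grid, and the variable and dimension names are invented for illustration.

```python
from collections import defaultdict


def group_by_grid(var_dims):
    """Group variable names by their dimension signature.

    var_dims maps a variable name to its tuple of dimension names.
    Assumption: an identical dimension tuple means an identical grid.
    """
    groups = defaultdict(list)
    for name, dims in var_dims.items():
        groups[tuple(dims)].append(name)
    return dict(groups)


# Hypothetical contents of a single multi-grid file.
example_vars = {
    "t_a": ("ens_a", "time", "lev", "lat_a", "lon_a"),
    "u_a": ("ens_a", "time", "lev", "lat_a", "lon_a"),
    "t_b": ("ens_b", "time", "lev", "lat_b", "lon_b"),
}

grids = group_by_grid(example_vars)
# Each key of `grids` would get its own descriptor file,
# all of them pointing at the same data file.
```

Each group would then be described by its own descriptor file, and interpolation between the groups remains an explicit, separate step.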

When I put my 5D data sets behind a GDS and serve them to the world  
of OPeNDAP clients (including GrADS), it becomes a special case:  a  
5D netcdf file that doesn't require a descriptor file, a file that  
has all the metadata GrADS needs packaged in just the right way. For  
the time being, my approach works because I'm writing the code for  
the client and the server, I'm not worrying about any other client  
trying to read my 5D GDS data set, and I'm not trying to be CF- 
compliant. Here's what it's going to look like:

dimensions:
         lon = 9 ;
         lat = 9 ;
         lev = 9 ;
         time = 9 ;
         ens = 9 ;
         string16 = 16 ;
variables:
         float lon(lon) ;
                 lon:units = "degrees_east" ;
         float lat(lat) ;
                 lat:units = "degrees_north" ;
         float lev(lev) ;
                 lev:units = "level" ;
         float time(time) ;
                 time:units = "days since 0001-01-01 00:00:00" ;
         float ens(ens) ;
                 ens:grads_dim = "e" ;
         char ens_name(ens, string16) ;
                 ens_name:long_name = "ensemble name" ;
         int ens_length(ens) ;
                 ens_length:long_name = "ensemble length" ;
         int ens_tinit(ens) ;
                 ens_tinit:long_name = "ensemble initial time index" ;
         float var(ens, time, lev, lat, lon) ;
                 var:long_name = "test variable" ;
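
As a rough illustration of how a client might use the three ensemble variables above, here is a minimal Python sketch. It is not GrADS code. It assumes that ens_tinit is a 1-based index into the time axis, that ens_length counts time steps, and that names are padded to 16 bytes with NULs or spaces. The sample values and the helper names (Member, decode_name, build_members, valid_time_indices) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    number: int  # position on the ensemble axis, numbered from 1 to n
    name: str    # decoded from ens_name
    tinit: int   # from ens_tinit (assumed 1-based time index)
    length: int  # from ens_length (assumed number of time steps)


def decode_name(raw: bytes) -> str:
    """Strip trailing NUL/space padding from a fixed-width char entry."""
    return raw.rstrip(b"\x00 ").decode("ascii")


def build_members(ens_name, ens_length, ens_tinit):
    """Combine the per-member arrays into one record per ensemble member."""
    return [
        Member(i + 1, decode_name(n), t, l)
        for i, (n, l, t) in enumerate(zip(ens_name, ens_length, ens_tinit))
    ]


def valid_time_indices(member, ntime):
    """Return the 0-based time indices at which this member has data."""
    start = member.tinit - 1
    stop = min(start + member.length, ntime)
    return range(start, stop)


# Hypothetical values for a two-member ensemble on a 9-step time axis.
ens_name = [b"control".ljust(16, b"\x00"), b"pert01".ljust(16, b"\x00")]
ens_length = [9, 4]
ens_tinit = [1, 3]

members = build_members(ens_name, ens_length, ens_tinit)
```

With these sample values the second member starts at the third time step and covers four steps, so a client would treat the remaining time steps for that member as missing.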

When more metadata is required to bring my GDS data set into CF  
compliance, or to make it readable by other open source clients, I'll  
add it. As long as GrADS users have the means to keep up with the  
data sets being generated by CFS, IPCC, TIGGE, or whatever, then I'm  
not concerned.

It took me a long time to write out this email -- I lost most of an  
afternoon trying to phrase everything properly. I have been reading  
this thread with interest, but I just can't keep up this kind of  
lengthy correspondence on a regular basis. Please keep me in mind as  
one of the silent listeners who still cares about the outcome.

Jennifer

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org


