[CF-metadata] CF and multi-forecast system ensemble data

Jennifer M. Adams jma at cola.iges.org
Thu Nov 2 11:53:37 MST 2006


On Nov 2, 2006, at 12:05 PM, Francisco Doblas-Reyes wrote:

> Hi Jennifer,
>
> I'm afraid there are a few things I don't understand from your  
> message. Here are my questions/answers:
>
>> 1. The absolute time axis ("time") has to span all the ensemble  
>> members -- thus it should have 18 time steps, beginning 1feb2000,  
>> incremented by 1 month. I know this leads to a lot of missing  
>> data, but in the GrADS/GDS environment, time is incompressible and  
>> the 18 members must all fit into the 5D grid. I noted that you  
>> didn't actually provide any axis values for your "time" dimension,  
>> but if that is the dimension for your data variable, it ought to
>> be explicitly defined.
>
> I used the way recommended in CF to encode forecasts. It uses a  
> single dimension for time ("time" in the file) with the variable  
> with standard_name "forecast_reference_time" indicating the  
> forecast start date and the variable with standard_name  
> "forecast_period" indicating the time when the forecast verifies.  
> Both variables are referenced to an exact date.
Perhaps the confusion arises because GDS doesn't seem to follow CF  
recommendations for encoding forecasts. (Uh oh...) GrADS and GDS  
treat forecasts like ordinary time series data. An example of a GFS
forecast (no ensemble dimension) behind GDS is here:
    http://monsoondata.org:9090/dods/gfs/gfs.2006110212
Try a 'dncdump' on it and see what it looks like. There's nothing in  
there about forecast_reference_time or forecast_period -- it's just a  
time series with 31 6-hourly time steps beginning on 12z2nov2006 and  
ending on 00z10nov2006.
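
To make the "ordinary time series" view concrete, here is a minimal Python sketch of that time axis as a start time plus a constant increment. It only illustrates the arithmetic; it is not code from GrADS or GDS, and the variable names are made up for the example.

```python
from datetime import datetime, timedelta

# A linear time axis is fully described by a start, an increment and a size.
start = datetime(2006, 11, 2, 12)   # 12z2nov2006
step = timedelta(hours=6)           # 6-hourly
nsteps = 31

times = [start + i * step for i in range(nsteps)]

print(times[0])    # first time step
print(times[-1])   # last time step: 2006-11-10 00:00:00
```

Nothing in this description records a forecast reference time or a forecast period; the axis is just a list of absolute times.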

> What do you mean by "you didn't actually provide any axis values  
> for your "time" dimension"? Does the time dimension need to be a  
> variable too?
Yes, as far as GrADS is concerned. GrADS looks for metadata  
associated with the coordinate variables listed for each data  
variable, i.e.:
         float geopotential(ensemble, time, level, latitude, longitude) ;
GrADS would never look for the variables with standard_name  
"forecast_reference_time" or "forecast_period".

GrADS might have understood what you intended if you had put
         float geopotential(ensemble, leadtime, level, latitude, longitude) ;
but then it would have got the time values wrong because of the  
discontinuity between time 6 and time 7 (see next inline comment.)
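
As a rough illustration of the lookup described above, the Python sketch below checks whether every dimension of a data variable has a one-dimensional variable of the same name. The dictionary is a hypothetical stand-in for the file's header (the dimensions given for "leadtime" and "reftime" are my assumption), and the function is not GrADS code.

```python
# Hypothetical header: variable name -> tuple of its dimension names.
variables = {
    "geopotential": ("ensemble", "time", "level", "latitude", "longitude"),
    "ensemble": ("ensemble",),
    "level": ("level",),
    "latitude": ("latitude",),
    "longitude": ("longitude",),
    "leadtime": ("time",),   # assumed layout
    "reftime": ("time",),    # assumed layout
}

def missing_coordinate_variables(variables, data_var):
    """Return dimensions of data_var that lack a same-named 1-D variable."""
    return [dim for dim in variables[data_var] if variables.get(dim) != (dim,)]

missing = missing_coordinate_variables(variables, "geopotential")
print(missing)   # ['time']
```

A client that works this way never consults "standard_name", so "leadtime" and "reftime" are invisible to it as axes.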

> I thought dimension names can be independent from the names of the  
> variables that use them.
>
> When you propose that the ensemble members should have 18 time  
> steps, I don't really understand what you mean.
Individual members can have any length. But in order to lump members
of different length into a single 5D data set, GrADS defines a single
linear time axis that spans all the members. Your first 9 members,
with experiment_id "scwf", are 6-month forecasts starting 1Feb2000.
Your second 9 members ("ukmo") are 6-month forecasts starting a year
later, 1Feb2001. For a 5D grid that contains all the members, GrADS
won't let you simply concatenate those two 6-month periods into a
12-step time axis; the 6 months in between (Aug2000-Jan2001) must also
be included because GrADS requires a linear time axis. Thus, the
single linear time axis starts 1Feb2000 and ends 1Jul2001: 18 months.
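
The arithmetic behind the 18 steps can be sketched in a few lines of Python. This is only an illustration of the reasoning, assuming monthly steps and the two start dates above; the helper add_months and the groups list are names invented for the example.

```python
def add_months(year, month, n):
    """Return (year, month) that lies n months after the given (year, month)."""
    idx = year * 12 + (month - 1) + n
    return idx // 12, idx % 12 + 1

# (start year, start month, number of monthly steps) for each group of members
groups = [(2000, 2, 6), (2001, 2, 6)]

first = min((y, m) for y, m, _ in groups)
last = max(add_months(y, m, n - 1) for y, m, n in groups)
nsteps = (last[0] - first[0]) * 12 + (last[1] - first[1]) + 1

# The single linear axis that spans every member
axis = [add_months(first[0], first[1], i) for i in range(nsteps)]

# Axis times that no member covers (these hold only missing data)
covered = {add_months(y, m, i) for y, m, n in groups for i in range(n)}
gap = [t for t in axis if t not in covered]

print(nsteps)            # 18
print(gap[0], gap[-1])   # (2000, 8) (2001, 1)
```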

>
>> 2. The variable you call "reftime" is what we think of as the  
>> initial time of each ensemble -- exact naming of this to be  
>> determined by CF consensus. This variable should have dimension  
>> "ensemble" not "time" with the first 9 values referring to  
>> 1feb2000 and the 2nd 9 values referring to 1feb2001.
>
> Please, could you confirm that the way a variable is named in the  
> NetCDF file is not relevant and what is actually meaningful is the  
> standard name?
That may be true for some clients, but GrADS uses the variable name as
the basis for all of its I/O. The "standard_name" attribute is never
looked at (unless a user asks to list all the variable attributes, in
which case it is printed out for the user to read).

> Why does reftime need to have dimension "ensemble" when it refers  
> to the start of the forecasts?
Because each ensemble member has a unique reference time.

> In an ensemble forecast, all members are expected to have the same  
> start date and span the same lead time.
GrADS does not impose that constraint on ensemble members. In our
data model, ensemble members may have different start dates and  
different lengths. A real-world example of this type of ensemble  
group is the NCEP Climate Forecast System ( http://cfs.ncep.noaa.gov/ ).

>
>> 3. A new variable, called something like "ensemble_length" (once  
>> again, I defer to CF lexicon) has dimension "ensemble" and gives  
>> the number of time steps in each ensemble, in this case all  
>> elements will have a value of 6.
>
> As I understand it, this concept of "ensemble_length" is the same  
> as "forecast_period" in the file.
Ensemble length is the number of time steps in an individual member.
It doesn't have units "days since ..."; it is just an integer that
will be less than or equal to the number of time steps in the time
axis for the 5D grid (the one mentioned above that spans all the
members).
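
Here is a small Python sketch of how a per-member start and length pick out the valid part of the spanning time axis. It assumes the 18 members and 18 monthly steps discussed earlier; start_index (the offset of each member's initial time on the axis) and valid_mask are hypothetical names for the illustration, not anything defined by GrADS or CF.

```python
NT = 18  # number of steps on the time axis that spans all members

# One entry per ensemble member (dimension "ensemble")
start_index = [0] * 9 + [12] * 9   # Feb2000 is step 0, Feb2001 is step 12
ensemble_length = [6] * 18         # a plain integer count, no time units

def valid_mask(start, length, nt):
    """True where a member has data on the spanning axis, False elsewhere."""
    return [start <= t < start + length for t in range(nt)]

masks = [valid_mask(s, n, NT) for s, n in zip(start_index, ensemble_length)]

filled = sum(sum(mask) for mask in masks)
total = len(masks) * NT
print(filled, total)   # 108 324
```

Only a third of the (ensemble, time) slots hold data in this layout, which is the "lot of missing data" cost of keeping the time axis linear.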

>
>> 4. I'm not sure what the variable time_bnd is used for.
>
> This is based on the CF cells concept and it's used to indicate the  
> operation performed in the leadtime variable (which is the one that  
> uses the cells "time_bnd") as referred to in the physical variable  
> attribute "cell_methods". In the example, they give the limits of  
> the month.
This is all brand-new jargon to me. GrADS and GDS don't use any of  
these concepts. I am clearly a CF neophyte.

> If you want to discuss this more in detail over the phone, let me  
> know where and when I can give you a call.

Well, the discussion of ensemble metadata seems to be forking a  
little, with theory and practice taking different paths. I think it's  
useful to keep both sides engaged in the discussion. I hope the above  
clarifies things a little for you.

Jennifer

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Beltsville, MD 20705
jma at cola.iges.org


