[CF-metadata] CF and a representation of probabilistic forecasts

Pamment, JA (Alison) J.A.Pamment at rl.ac.uk
Thu Oct 12 05:52:27 MDT 2006

Hi All,

In the update of the standard name table that took place on the 26th
September 2006 the following name was added:
realization; 1; realization is used to label a dimension that can be
thought of as a statistical sample, e.g., labelling members of a model

This resulted from a proposal by Jamie Kettleborough (originally for the
name "sample" although "realization" was later substituted).

Jamie also requested a new standard name modifier which we would now
like to call "realization_weight" with units of "1".  Please can this be
added to the list of modifiers in Appendix C of the CF 1.0 doc?


On 03 May 2006 10:30 Jamie Kettleborough wrote:
> Hello,
> there are a couple of projects (Hadley Centre QUMP project and
> climateprediction.net) that will be distributing
> probabilistic forecasts of climate change based on ensembles of model
> runs.  One way of representing  these
> results will be a set of model runs and a set of weights that should be
> applied to each model run.
> I think this can be accommodated straightforwardly, and reasonably
> generally, in the CF standard
> using 'ancillary_variables' and the addition of
> 1) New standard name 'sample' used to label a dimension that can be
> thought
>                                         of as a statistical sample
> ensemble)
> 2) New standard name modifier 'sample_weight' used to label variables
> that are acting
>                                         as weights for other
> e.g. coarse map of predictions of 21st century temperature change
> In this example temperature is dimensioned by sample as well as the
> space and time dimensions.  Each
> sample is the result of one model run.  Some models are less realistic
> than others and so should be down-weighted
> in any subsequent analysis.  The weights variable gives the weight for
> each ensemble member.
> dimensions:
>   lat = 18 ;
>   lon = 36 ;
>   time = 10 ;
>   sample = 10000 ; // sample points
> variables:
>   float temp(sample,time,lat,lon) ; // each sample is the result from one ensemble member
>     temp:long_name = "Temperature at 1.5m" ;
>     temp:standard_name = "air_temperature" ;
>     temp:ancillary_variables = "weights" ;
>     temp:source = "perturbed physics ensemble of HadSM3" ;
>   float weights(sample) ;  // the weight applied to each ensemble member
>     weights:long_name = "likelihood weights for 1.5m air temperature" ;
>     weights:standard_name = "air_temperature sample_weight" ;
> Notes:
> 1. The sample points can be generated from a perturbed physics
> ensemble or
> a detection attribution
>  exercise (or possibly some other statistical method) so I don't think
> we want to explicitly use the term
>  'ensemble'. 'sample' is better. (though potentially confusing with
> samples or bucket samples?
>   - maybe 'distribution_sample' is a better name?)
> 2. If the sample dimension is not identified by its standard name then
> there is an implied rule that
>  the software has to infer which dimension to apply the weights to
> based on the common dimension.
> 3. sample_weight variables have an implied valid_min=0, and
>   (although the valid_max may be relaxed if you are prepared to
> renormalise later)
> 4. The 'ancillary_variables' attribute may point to more than one
> sample_weight. This might represent
>  different sensitivity studies, different observations used for skill
> scores, or different methodologies.
>  In this case each sample_weight should be thought of as applied stand
> alone. They are not applied in sequence.
> 5. The same sample_weight variable can be referenced by more than one
> variable. This is useful for forming
>  joint (multidimensional) pdfs between variables. In this case, although
> the ordering of the samples is
>  arbitrary, it must be used consistently: the same order should be used for
> all variables in the file.
> 6. The creation method of the sample points and associated weights can
> be left to description in
>  the 'source' attribute (which may refer to a URL for more information).  In
> the case of perturbed physics
>  ensembles the derivation of weights can be complex so reference to
> external documents to describe the method
>  will avoid unnecessarily overloading the usage metadata.
> 7. In other examples the weights might be a function of space and time as
> well as sample member.
> I hope this all makes enough sense for people to make a judgement on
> whether this should be accepted or not.
> Obviously if I've been unclear let me know and I'll try and be more
> eloquent.  If this all makes sense there
> will be a few follow-up e-mails with specific requests for standard names.
> There are a couple of other representations of probabilistic forecasts that
> might be used.  These can be posted
> as separate suggestions as and when needed.
> Thanks,
> Jamie
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata
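
To make the intended use of the weights in Jamie's example concrete, here is
a minimal Python sketch of how a reader of such a file might combine one
value per sample with the per-sample weights. It is only an illustration
under stated assumptions: the function names (normalise, weighted_mean) and
the toy numbers are hypothetical and not part of CF or any netCDF library.
It assumes the weights are non-negative (the implied valid_min=0 in note 3)
and renormalises them before use.

```python
# Illustrative sketch only: helper names and data are hypothetical.

def normalise(weights):
    """Return the weights rescaled to sum to 1; reject negative values."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be >= 0 (implied valid_min=0)")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def weighted_mean(values, weights):
    """Weighted mean of one value per sample (ensemble member)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must share the sample dimension")
    return sum(v * w for v, w in zip(values, normalise(weights)))


# Toy data: temperature change (K) at one grid point and time for 4 samples.
temp = [2.0, 3.0, 4.0, 5.0]
# Un-normalised weights; the last member is given zero weight.
weights = [1.0, 1.0, 2.0, 0.0]

print(weighted_mean(temp, weights))
```

Because the same weights list is applied to every variable that references
it, the samples must appear in the same order in each variable (note 5).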

Alison Pamment                            Tel: +44 1235 778065
NCAS/British Atmospheric Data Centre      Fax: +44 1235 445858
Rutherford Appleton Laboratory            Email: J.A.Pamment at rl.ac.uk
Chilton, Didcot, OX11 0QX, U.K.
