[CF-metadata] subgrid variation

Bryan Lawrence b.n.lawrence at rl.ac.uk
Thu Feb 24 03:55:10 MST 2005


Hi Folks

Apologies in advance. This will read like a criticism of the outstanding work 
that has been and is being done. It is a criticism, but hopefully both minor and constructive, 
and the fact that I can do it at all is because of the outstanding start that 
we have with CF ... and again apologies, this is long.

Ok, the subgrid variation.

I think this is an example of something whose status no one is quite sure of 
now ...

So, the issues this post raises are:

1) CF-1.0 was presumably frozen in place.  I can't easily find a place where 
all the proposals for CF-2.0 are collected separately. Whatever we decide to 
do, we should cleanly separate proposals from CF-1.0. (This is not a 
criticism of Brian or anyone else, this is hard to do, and takes time, so we 
need to a) want to do it, and b) resource it).

2) The standard names.  Some of you will know that I think we need to address 
splitting CF into (at least) the following three components:
 - vocabularies
 - netcdf specific things
 - other stuff.

Having done that, we should separate the development tracks and possibly version 
each separately.  Anyway, limiting ourselves to vocabularies as they apply to 
variables and axes, as Jonathan and Steve both state, we have an issue with 
systematising modifiers (factorisation) of the standard names. We also have an 
issue with name proliferation (not in and of itself a problem, but a problem 
when we start to incorporate other namespaces and fix them in place, e.g. by 
selecting part of an external gazetteer, which then evolves externally at 
a different pace).

This is an area where a lot of work is going on in other communities, and we 
should look at what they are doing, and use it. In particular, GML has a 
dictionary concept. Can we use that as a starting point? Can we then ask how 
we use that in a CF compliant way? I think so, it could be as simple as 
referencing the namespace associated with a particular standard name. That 
would allow people to use externally maintained namespaces. For example:  the 
CF community are not the right people to maintain an atmospheric chemistry 
namespace, but we do want atmospheric chemistry in our CF compliant files ...
(Is this going to be a big change from CF-1.0? No, not if we do it right, it 
could even be backward compatible).
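
As a rough illustration of the namespace idea above, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the namespace
identifiers ("cf", "atmos_chem"), the vocabulary contents and the function
is_known_name are hypothetical and not part of CF-1.0. It only shows how a
missing namespace could default to CF, so that existing files stay valid.

```python
# Hypothetical vocabularies; real ones would be maintained externally
# by the community that owns each namespace.
CF_NAMESPACE = "cf"  # assumed identifier for the default CF vocabulary

VOCABULARIES = {
    "cf": {"air_temperature", "sea_water_x_velocity"},
    "atmos_chem": {"ozone_mixing_ratio"},  # made-up external namespace
}


def is_known_name(standard_name, namespace=None):
    """Return True if standard_name is defined in the given namespace.

    A missing namespace falls back to CF, so a file that only carries
    a standard name is checked exactly as it would be today.
    """
    vocabulary = VOCABULARIES.get(namespace or CF_NAMESPACE, set())
    return standard_name in vocabulary


print(is_known_name("air_temperature"))                   # no namespace given
print(is_known_name("ozone_mixing_ratio", "atmos_chem"))  # external namespace
```

The design choice being sketched is that the namespace is optional metadata
alongside the name, not something baked into the name itself.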

3)  What about the modifiers issue? We need to come up with a method of 
dealing with modifiers of any sort in an abstract way, rather than creating 
epicycles every time we think of something new. OK, well, we're not going to 
do that today, so this should be something we put on our CF development 
track ...
 
On this specific proposal, if I quote 7.3 of CF-1.0:

	"Some methods (e.g., variance) imply a change of units of the variable, and 
this also is specified by Appendix D. "

Was anyone else aware that where we don't have a cell method we have to know a 
priori whether a quantity is extensive (depends on the size of the cell) or 
intensive (doesn't) to know what to do with it? This makes perfect sense, except 
that it implies that all standard names need to have this as an attribute 
(or that it is a compulsory attribute of a variable, which is better, though it 
may be better still to treat things as intensive unless they have an attribute 
which says they're not). (Or do I have this all wrong?)
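
To make the intensive/extensive distinction concrete, here is a minimal Python
sketch of the "intensive unless an attribute says otherwise" option. The
attribute name "extensive" and both functions are hypothetical, invented for
this illustration; CF-1.0 defines no such attribute. The sketch assumes that
combining cells means summing extensive quantities and area-weighting
intensive ones.

```python
def is_extensive(attrs):
    """Treat a variable as intensive unless it says otherwise.

    "extensive" is a hypothetical attribute name, not part of CF-1.0.
    """
    return str(attrs.get("extensive", "false")).lower() == "true"


def merge_cells(values, areas, attrs):
    """Combine per-cell values into a single value for the merged cell."""
    if not values or len(values) != len(areas):
        raise ValueError("values and areas must be non-empty and equal length")
    if is_extensive(attrs):
        # Extensive quantities scale with cell size, so they add up.
        return sum(values)
    # Intensive quantities do not, so take the area-weighted mean.
    weighted = sum(v * a for v, a in zip(values, areas))
    return weighted / sum(areas)


print(merge_cells([10.0, 20.0], [1.0, 3.0], {}))                     # intensive
print(merge_cells([10.0, 20.0], [1.0, 3.0], {"extensive": "true"}))  # extensive
```

The same two input cells give different answers depending on one flag, which
is why software cannot guess it and has to be told.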

Meanwhile, this will be a nightmare to *use*. 

So, what do I think we should do? Clearly we need to keep the separation 
between vocabulary and usage (i.e. standard names and modifiers like cell 
methods). As Steve points out, this is broken already with cell methods, 
which can change the (implied/explicit) units of a standard name. I think we 
need to take a step back and ask ourselves how we can do this in a cleaner 
way. So this is another item of work we need for our CF development track ... 

At that point, we probably should think about the structures of our standard 
names. My personal point of view would be that we should try and identify the 
semantic things we care about, and join them together into standard names.  
For example, we have x_sea_water_velocity as an alias of 
sea_water_x_velocity. As things stand I'd have to code up every one of these 
aliases by hand, and if we really do expand (take for example the BODC 
parameter dictionary, which has 10,000 entries) there is no way I'm going to 
do that manually! Surely we can divide each name into two or three semantic 
parts, with the order irrelevant. Doing this could be as simple as stating 
that semantic content is separated by _ and order is irrelevant (but it won't 
be that simple :-). Roy Lowry has 
the wonderful example of how this can get out of hand: the advent of green 
dogs (if we allow colour modifiers to be completely independent of animals). 
I suppose our example could be orography velocity ... 
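
The naive version of "separated by _ and order is irrelevant" can be sketched
in a few lines of Python. The function names name_key and same_quantity are
hypothetical, made up for this illustration, and this is not how CF handles
aliases. It compares names by their sorted underscore-separated tokens.

```python
def name_key(standard_name):
    """Order-independent key: the sorted tuple of underscore-separated tokens.

    A sorted tuple (not a set) is used so repeated tokens still count.
    """
    return tuple(sorted(standard_name.split("_")))


def same_quantity(name_a, name_b):
    """True if the two names contain the same tokens in any order."""
    return name_key(name_a) == name_key(name_b)


# The alias pair from the text matches without being listed by hand.
print(same_quantity("x_sea_water_velocity", "sea_water_x_velocity"))
# A different component does not match.
print(same_quantity("sea_water_x_velocity", "sea_water_y_velocity"))
```

It also shows why it won't be that simple: the sketch splits "sea_water" into
two unrelated tokens, so it would accept water_sea_velocity_x as well, and
nothing stops tokens being combined into names that mean nothing.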

I've got more thoughts on the standard names, but will save them for now. By 
way of summary, I think this email has raised three things that need more 
work, and for which we need someone to make a proposal about how to proceed:

1) Versioning
2) Dealing with variable modifiers, units and standard names
3) Dealing with dictionaries of names, and their relationship with external 
namespaces.

We may need to decouple units and standard names to resolve these issues! 
Clearly we can't do that in CF 1.x, but we might in a future CF ...

I'm deliberately not raising specific proposed solutions now, because what I 
would like us to do is identify that these are structural problems that need 
resolution in a future version of CF, not to be hacked into CF-1.0. Of course 
folk (including me) can come up with ways of dealing with it for now, but 
they won't be CF (standard). 

Bryan

On Tuesday 08 February 2005 08:54, Jonathan Gregory wrote:
> Dear Steve
>
> > They
> > are now creating files which have "standard_name" attributes.  In most
> > cases they lack cell_methods attributes.  They believe that they are
> > creating CF 1.0 compliant files.  Does your proposal  "(2)" imply that
> > valid CF 1.0 files which use "standard_name" will become invalid CF 2.0
> > files?  If so isn't this a serious backwards compatibility concern?
>
> Yes, it would. Karl has also made that point. Strangely, Karl's posting
> doesn't appear on the news archive.
>
> I agree, we can only make cell_methods strongly recommended, rather than
> mandatory, in order that previous CF files remain valid. I think that such
> a recommendation is needed, because really people ought to record what they
> intend a quantity to be, and when there is subgrid variation of surface
> types, for instance, there is a real ambiguity. The standard name alone is
> not sufficient metadata.
>
> > Before we introduce further complexity to the sub-grid cell_methods
> > machinery (which is not one of the more "transparent" aspects of CF
> > already), has there been a serious exploration of alternatives?
>
> ...
>
> > I'm tempted to think that a thoughtful discussion on how to systematize
> > these modifiers is in order.
>
> So far, I have not managed to think of better ways myself, and believe me,
> I have spent hours thinking about it! But anyone's welcome to make
> proposals, of course. Yes, the standard_name and its modifiers and the
> cell_methods together specify the quantity. This is a step towards some
> "factorisation" of the description, in order to limit the expansion of the
> standard name table, and I think it's quite systematic. We are following a
> path between the extremes of putting all the definition in one attribute,
> and breaking it down into many attributes.
>
> > Listening
> > to the general community chatter there's no doubt that plenty of groups
> > have picked up the "standard_name" attribute and are thrilled to have
> > found it.
>
> That is encouraging. Of course, it means we need to arrange ways of making
> it easier to add and modify standard names. That's something I hope we can
> discuss at the meeting at the GO-ESSP meeting at the BADC in June.
>
> Best wishes
>
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata

-- 
Bryan Lawrence,        Head NCAS/British Atmospheric Data Centre
Web: badc.nerc.ac.uk                      Phone: +44 1235 445012
CCLRC: Rutherford Appleton Laboratory, Chilton, Didcot, OX11 0QX

