[CF-metadata] axis attribute

Jonathan Gregory j.m.gregory at reading.ac.uk
Tue Apr 11 08:36:08 MDT 2017


Dear David and Sebastien

In a given model the level numbers are meaningful indeed, but there's no
universal convention for them. Maybe some atmosphere models number them from
the TOA downwards, for instance. Nothing would prohibit that, and the
standard_name of model_level_number doesn't prescribe any convention for it.
There is likewise no universal convention for how to number the gridboxes in
x and y, but in any given model there is a convention for it. Thus I think
it's the same. The standard name identifies a conventional concept, but not a
convention for the numbers themselves.

As for the axis attribute itself, maybe we could distinguish the two meanings
of "axis" and avoid backwards-incompatibility by removing the restriction in
chapter 5 that it is not permissible for a data variable to have both a
coordinate variable and an auxiliary coordinate variable having an axis
attribute with any given value. That is, at present you can't have both a 1D
x-coordinate variable with axis="X" and a 2D longitude auxiliary coordinate
variable with axis="X". But we could allow them both, with different meanings.
We could say that the axis attribute of 1D coordinate variables labels the
index dimensions of the data variable, while the axis attribute of multi-
dimensional auxiliary coordinate variables labels them as spatiotemporal
dimensions, as a hint for plotting.
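
For illustration only, here is a minimal Python sketch of how software might
apply the two readings of "axis" suggested above. It is not part of CF or of
any library: the `variables` dictionary is a hypothetical in-memory stand-in
for a file's metadata, and `axis_roles` is a made-up helper name. A 1D
coordinate variable that shares its dimension's name is taken to label an
index dimension, while an axis attribute on a multidimensional auxiliary
coordinate variable is taken only as a spatiotemporal hint.

```python
# Hypothetical stand-in for the metadata of a netCDF file (not a real API).
variables = {
    "x": {"dims": ("x",), "attrs": {"axis": "X"}},
    "y": {"dims": ("y",), "attrs": {"axis": "Y"}},
    "longitude": {"dims": ("y", "x"),
                  "attrs": {"axis": "X", "units": "degrees_east"}},
    "latitude": {"dims": ("y", "x"),
                 "attrs": {"axis": "Y", "units": "degrees_north"}},
    "sit": {"dims": ("y", "x"),
            "attrs": {"coordinates": "latitude longitude"}},
}


def is_coordinate_variable(name, var):
    """True for a 1D variable named after its own dimension (NUG sense)."""
    return var["dims"] == (name,)


def axis_roles(data_name, variables):
    """Split axis attributes into the two meanings discussed in the text.

    Returns (index_axes, spatial_hints):
      index_axes    maps axis letter -> dimension name, from 1D coordinate
                    variables of the data variable's dimensions
      spatial_hints maps axis letter -> auxiliary coordinate name, from
                    multidimensional variables listed in "coordinates"
    """
    data = variables[data_name]
    index_axes = {}
    spatial_hints = {}

    # Meaning 1: axis on a 1D coordinate variable labels an index dimension.
    for dim in data["dims"]:
        var = variables.get(dim)
        if var is not None and is_coordinate_variable(dim, var):
            axis = var["attrs"].get("axis")
            if axis:
                index_axes[axis] = dim

    # Meaning 2: axis on a multidimensional auxiliary coordinate variable
    # is only a hint about which spatiotemporal direction it describes.
    for name in data["attrs"].get("coordinates", "").split():
        var = variables[name]
        axis = var["attrs"].get("axis")
        if axis and len(var["dims"]) > 1:
            spatial_hints[axis] = name

    return index_axes, spatial_hints


index_axes, spatial_hints = axis_roles("sit", variables)
```

In this sketch both x and longitude carry axis="X" without conflict, because
they end up in different tables.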

Best wishes

Jonathan

----- Forwarded message from David Hassell <david.hassell at ncas.ac.uk> -----

> Date: Tue, 11 Apr 2017 08:41:34 +0100
> From: David Hassell <david.hassell at ncas.ac.uk>
> To: Jonathan Gregory <j.m.gregory at reading.ac.uk>
> CC: CF Metadata <cf-metadata at cgd.ucar.edu>
> Subject: Re: [CF-metadata] axis attribute
> 
> Hello all,
> 
> I am still uncomfortable with creating a coordinate variable with arbitrary
> values. The analogy with model_level_number is not quite there, I think, as
> the model_level_number values are not arbitrary. For example, a value of 6
> means (in one model I know) that this is the first level above the boundary
> layer. That value and meaning is relevant however the data may have been
> subspaced/sliced.
> 
> Perhaps an auxiliary coordinate variable could be used with missing data in
> all of its values, an axis attribute (of "X" or "Y") and *no* standard
> name? In the absence of a coordinate variable for that dimension, the axis
> attribute of the auxiliary coordinate variable would give meaning to the
> dimension.
> 
> This is still not ideal, though ...
> 
> All the best,
> 
> David
> 
> 
> 
> On 10 April 2017 at 18:25, Jonathan Gregory <j.m.gregory at reading.ac.uk>
> wrote:
> 
> > Dear Sébastien
> >
> > Yes, you're right, we can't have an axis without coordinates. This is
> > because
> > CF is a netCDF convention, and there's no way to attach attributes to a
> > dimension other than by creating a coordinate variable. Similarly we don't
> > have data variables with just dimensions but no data. We could have
> > conventions
> > for these concepts in CF-netCDF but it hasn't been proposed.
> >
> > However it seems pretty good to me to do as you suggest and create a
> > coordinate variable with the gridbox indices in it, for which (as
> > discussed)
> > we could define a standard name. This is analogous to model_level_number,
> > which is already a standard name and could be a vertical coordinate
> > variable.
> > It can do what you want, and I agree it makes sense to label these
> > coordinate
> > variables as X and Y. That indicates that they are the horizontal
> > dimensions.
> > It's probably less important which is X and which Y. That's a plotting
> > issue.
> >
> > I think the confusion probably arises because we didn't define what we mean
> > by "axis". We used the word as an "obvious" concept, but there is an
> > ambiguity
> > about whether it corresponds to a dimension of a data variable, or a
> > physical
> > variable (not necessarily spatiotemporal) which could be an independent
> > variable on which the data depends. Latitude might be an axis in the second
> > sense even if it's not an axis in the first sense. I prefer the first
> > sense,
> > which is the one the axis attribute originally had.
> >
> > Best wishes
> >
> > Jonathan
> >
> >
> > ----- Forwarded message from Sebastien Villaume <
> > sebastien.villaume at ecmwf.int> -----
> >
> > > Date: Fri, 7 Apr 2017 09:10:09 +0000
> > > From: Sebastien Villaume <sebastien.villaume at ecmwf.int>
> > > To: David Hassell <david.hassell at ncas.ac.uk>
> > > CC: CF Metadata <cf-metadata at cgd.ucar.edu>, Jonathan Gregory
> > >       <j.m.gregory at reading.ac.uk>
> > > Subject: Re: [CF-metadata] axis attribute
> > > X-Mailer: Zimbra 8.6.0_GA_1200 (ZimbraWebClient - FF50
> > (Linux)/8.6.0_GA_1200)
> > >
> > > Dear David,
> > >
> > > I see your point and you are probably right that a plotting routine will
> > probably sort this out.
> > >
> > > However this is not what I am after, I am more interested in metadata
> > discovery and indexing.
> > > I need to discover what I have in a file without plotting it, without
> > having a human looking at it to confirm what it is and that it has been
> > plotted correctly.
> > > I also would like to use this metadata information to perform actions
> > like merging netCDF files, slicing, cropping, aggregating, interpolating,
> > comparing data in different grids and representations, etc.
> > >
> > > I understand that implicit is fine and that explicit is not required for
> > some applications. I have no issue with this.
> > > My personal point of view is that explicit is better than implicit: I
> > tend to prefer "mandatory" over "optional".
> > >
> > > Being implicit means that the assumptions made need to be valid 100% of
> > the time to avoid accidents or corner cases.
> > > I would like to be explicit so I need all the proper mechanisms
> > (variables, semantics, etc.) in place so I can use them.
> > > Right now it feels that I am missing some functionality.
> > >
> > > Let me copy below a few bits of the terminology section in the CF 1.7
> > draft document (very similar to 1.6). Please read it keeping in mind what
> > is really an axis, a coordinate, a spatio-temporal dimension and an
> > array dimension. Each time you read "coordinate", "dimension" or
> > "dimensional", ask yourself what is implied and if it is not ambiguous:
> > >
> > > ------------------------
> > > variables
> > > ------------------------
> > > auxiliary coordinate variable
> > >     Any netCDF variable that contains coordinate data, but is not a
> > coordinate variable (in the sense of that term defined by the NUG and used
> > by this standard - see below). Unlike coordinate variables, there is no
> > relationship between the name of an auxiliary coordinate variable and the
> > name(s) of its dimension(s).
> > >
> > > coordinate variable
> > >     We use this term precisely as it is defined in section 2.3.1 of the
> > NUG . It is a one-dimensional variable with the same name as its dimension
> > [e.g., time(time) ], and it is defined as a numeric data type with values
> > that are ordered monotonically. Missing values are not allowed in
> > coordinate variables.
> > >
> > > grid mapping variable
> > >     A variable used as a container for attributes that define a specific
> > grid mapping. The type of the variable is arbitrary since it contains no
> > data.
> > >
> > > multidimensional coordinate variable
> > >     An auxiliary coordinate variable that is multidimensional.
> > >
> > > scalar coordinate variable
> > >     A scalar variable (i.e. one with no dimensions) that contains
> > coordinate data. Depending on context, it may be functionally equivalent
> > either to a size-one coordinate variable (Section 5.7, "Scalar Coordinate
> > Variables") or to a size-one auxiliary coordinate variable (Section 6.1,
> > "Labels" and Section 9.2, "Collections, instances, and elements").
> > >
> > > ------------------------
> > > dimensions
> > > ------------------------
> > > latitude dimension
> > >     A dimension of a netCDF variable that has an associated latitude
> > coordinate variable.
> > >
> > > longitude dimension
> > >     A dimension of a netCDF variable that has an associated longitude
> > coordinate variable.
> > >
> > > spatiotemporal dimension
> > >     A dimension of a netCDF variable that is used to identify a location
> > in time and/or space.
> > >
> > > time dimension
> > >     A dimension of a netCDF variable that has an associated time
> > coordinate variable.
> > >
> > > vertical dimension
> > >     A dimension of a netCDF variable that has an associated vertical
> > coordinate variable.
> > > ------------------------
> > >
> > > So according to this terminology, I have in my file 2 auxiliary
> > coordinate variables, but no "real" coordinate variables (according to
> > the NUG), so my auxiliary coordinates are auxiliary to what?
> > > What is a "multidimensional coordinate"? If dimension means
> > spatio-temporal dimension it is nonsense, because a coordinate can only
> > reference 1 spatio-temporal dimension; if it is meant to be
> > array dimensions it is not clear...
> > > What are my 2D array latitude and longitude then? Are they the latitude and
> > longitude dimensions defined in the terminology? Not really... because
> > there are no such things as latitude and longitude dimensions: you can
> > define latitude and longitude coordinates, associated with 2 axes that
> > themselves define 2 spatial dimensions... but the coordinates can be
> > defined in whatever n-D array.
> > > I like the definition of "grid mapping variable", I could use a similar
> > variable to be a container for attributes for my "axis variable" with no
> > data!
> > >
> > > I know that in the day-to-day life and discussions we don't make the
> > effort to be precise (I don't) and that it is easy to overload the meaning
> > of things but I think that the CF document needs to be very precise, non
> > ambiguous and can not mix axes, coordinates, spatio-temporal and array
> > dimensions.
> > >
> > > /Sébastien
> > >
> > > ----- Original Message -----
> > > From: "David Hassell" <david.hassell at ncas.ac.uk>
> > > To: "Sebastien Villaume" <sebastien.villaume at ecmwf.int>
> > > Cc: "CF Metadata" <cf-metadata at cgd.ucar.edu>, "Jonathan Gregory" <
> > j.m.gregory at reading.ac.uk>
> > > Sent: Friday, 7 April, 2017 08:37:20
> > > Subject: Re: [CF-metadata] axis attribute
> > >
> > > Dear Sébastien,
> > >
> > > Please bear with me when I ask to go right back to the beginning! I am not
> > > sure what the benefit is in labelling the dimensions as X or Y. In the
> > > original tripolar case we have:
> > >
> > > dimensions:
> > >     i = 96 ;
> > >     j = 73 ;
> > > variables:
> > >     float latitude(j, i) ;
> > >         latitude:units = "degrees_north" ;
> > >     float longitude(j, i) ;
> > >         longitude:units = "degrees_east" ;
> > >     float sit(j, i) ;
> > >         sit:units = "m" ;
> > >         sit:standard_name = "sea_ice_thickness" ;
> > >         sit:coordinates = "latitude longitude" ;
> > >
> > > There is nothing stopping anything from seeing that this is a 2-d array of
> > > size i*j, and there is nothing stopping software subspacing the data by i
> > > and j indices.
> > >
> > > I don't think a plotting routine would benefit from knowing that the i
> > > dimension was "X", because there are no 1-d coordinates it can use along
> > > that dimension.
> > >
> > > Many thanks and all the best,
> > >
> > > David
> > >
> > > On 6 April 2017 at 22:45, Sebastien Villaume <
> > sebastien.villaume at ecmwf.int>
> > > wrote:
> > >
> > > > Dear Mark and Jonathan,
> > > >
> > > > thank you for your comments.
> > > >
> > > > @Mark:
> > > > the short answer: you can put in principle whatever you want in that
> > > > variable because in this case it is a dummy variable only there to
> > hold the
> > > > axis attribute. But please read the long explanation!
> > > >
> > > > the long, boring explanation:
> > > > As I understand it, the CF convention does not recognize axis as a
> > valid
> > > > object on its own, as it does for "dimensions" and the various types of
> > "variables"
> > > > and the convention seems to make it mandatory to attach to it a
> > variable
> > > > that becomes a "coordinate" variable. Note that I say that it is the
> > > > coordinate variable that is attached to the axis and not the opposite.
> > > >
> > > > From a mathematical point of view, it is perfectly possible to define
> > an
> > > > axis without a coordinate on it (arguably it is not that useful). The
> > > > common case is that a 1-D array defines positions on that axis (the
> > > > coordinate). Then your 1-D data points are positioned with the help of
> > the
> > > > coordinate, itself attached to the axis.
> > > >
> > > > If you have one more axis, you can define a new coordinate on it. This
> > > > creates a 2-D space. Now you have the choice on how you represent your
> > 2-D
> > > > data points:
> > > > if the dataset is totally irregular you will have a 1-D array of "n"
> > data
> > > > points associated with a 1-D array of "n" positions for the first
> > dimension
> > > > and a 1-D array of "n" positions for the second dimension. It works,
> > it is
> > > > still a 2-D dataset stored in a long one dimensional vector.
> > > >
> > > > Imagine that you realize that your dataset is not as irregular as you
> > > > thought, it is in fact a regular grid! you identify that you only have
> > i
> > > > possible values of the first coordinate and j possible values for the
> > > > second coordinate, you also notice that i*j=n. Great you can now
> > represent
> > > > your dataset with 2 coordinates of length i and j respectively, each of
> > > > them associated with 2 axes x and y and your data is now a 2-D array of
> > > > size (i,j). you can position your data using the coordinates, it is
> > mapped
> > > > using the indices within each coordinate array. Now you have a 2-D
> > spatial
> > > > dataset stored in a 2-D array with 2 supporting 1-D spatial coordinates
> > > > stored in one-dimensional vectors.
> > > >
> > > > Let's say now that you take this regular grid and you distort it... your
> > > > regular grid is gone you can no longer use i and j for partitioning!
> > > > really? well no, nobody says that you can not slice your "n" long
> > vectors
> > > > into i*j arrays! you could choose whatever you want for i and j as
> > long as
> > > > i*j=n. Of course if you choose (2)*(n/2) or (n/2)*(2), it is a bit
> > useless,
> > > > but you can also choose meaningful i and j because even if your grid
> > became
> > > > irregular, it is not random points, it is still a grid of size i*j .
> > This
> > > > is exactly my use case! And in that situation your coordinates can be
> > > > arranged in arrays of size i*j. What I need is 2 axes and 2
> > coordinates of
> > > > dimension 2 with lengths i and j. The catch here is that I have 2-D
> > arrays
> > > > to store one "spatial" dimension! It is another case of overlapped
> > > > concepts, dimension is used transparently for the dimension of arrays,
> > > > dimension of the geometrical space, and sometimes for the size of one
> > of
> > > > the dimensions of an array!!
> > > >
> > > > Anyway, I should be able to define my axes like this:
> > > >
> > > > int x;
> > > >     x:axis = "X";
> > > >     x:standard_name = "x_axis" ; // no standard name exists...
> > > >     x:units = "1" ; // no units, it will come with the coordinate
> > > > int y;
> > > >     y:axis = "Y";
> > > >     y:standard_name = "y_axis" ; // no standard name exists...
> > > >     y:units = "1" ; // no units, it will come with the coordinate
> > > > float longitude(j,i);
> > > >     longitude:standard_name = "longitude" ;
> > > >     longitude:units = "degrees" ;
> > > >     longitude:positive = "east" ;
> > > >     longitude:long_name = "longitude" ;
> > > >     longitude:axis_mapping = "X" ;
> > > > float latitude(j,i);
> > > >     latitude:standard_name = "latitude" ;
> > > >     latitude:units = "degrees" ;
> > > >     latitude:positive = "north" ;
> > > >     latitude:long_name = "latitude" ;
> > > >     latitude:axis_mapping = "Y" ;
> > > > float sit(j, i) ;
> > > >     sit:units = "m" ;
> > > >     sit:standard_name = "sea_ice_thickness" ;
> > > >     sit:long_name = "Ice thickness" ;
> > > >     sit:coordinates = "latitude longitude" ;
> > > >
> > > > several comments:
> > > > notice how one could tell on which axis the coordinate should go using
> > for
> > > > instance an "axis_mapping" attribute. Not a "coordinates" attribute,
> > this one
> > > > should be used to tell the coordinates of my data variable!
> > > > I find this approach clearer and more flexible as it can probably cater
> > > > for any situation of axes, coordinates, etc.
> > > >
> > > > But because in CF one cannot create a bare axis, I follow the rules and
> > > > create:
> > > >
> > > > double x(i);
> > > >     x:axis = "X";
> > > >     x:standard_name = "..." ; // not an axis anymore, give me a standard name
> > > >     x:units = "1" ;
> > > >     x:long_name = "i-index of mesh grid" ;
> > > > double y(j);
> > > >     y:axis = "Y";
> > > >     y:standard_name = "..." ; // not an axis anymore, give me a standard name
> > > >     y:units = "1" ;
> > > >     y:long_name = "j-index of mesh grid" ;
> > > >
> > > > and I have the choice of what I put in those arrays since it is somehow
> > > > artificial.
> > > >
> > > > I could populate the "primary" coordinates with 1 to i and 1 to j which
> > > > would represent the indices and if I subset the grid, I then retain the
> > > > information that the domain has been cropped because the indices left
> > will
> > > > not be 1 to i/j but n to m.
> > > > I don't really like this but what can I do?
> > > >
> > > > If we follow this idea, it means introducing a clear concept of
> > "axis"
> > > > besides the other types of variables, defining a new attribute to
> > "attach"
> > > > coordinates to axes, etc.
> > > >
> > > > Another solution, much less disturbing, would be to heavily modify the
> > > > proper chapters in the CF document to:
> > > > - completely decouple the concepts of "axis" and "coordinate": a
> > > > coordinate is not an axis and vice versa.
> > > > - completely decouple the concepts of spatio-temporal dimension, array
> > > > dimension, and the size of an array dimension
> > > > - continue to use the "axis" attribute but on n-D array coordinates:
> > the
> > > > array has n dimensions but the coordinate maps to 1 axis/spatial
> > dimension
> > > > only!
> > > > - Whatever the dimensions of the array for the coordinate, all the
> > values
> > > > contained in the array must be mapped on one given axis, the one
> > defined in
> > > > axis attribute. For instance, a 2-D latitude only contains values that
> > are
> > > > latitudes and will only map on one axis.
> > > > - In principle one could have in the same file several coordinates of
> > > > possibly different "array" dimensions, different sizes and different
> > units
> > > > defined for one axis. This means that the attribute "axis=z" for
> > instance
> > > > can appear more than once in the file. The only restriction I see is
> > that
> > > > 2 data variables can only be plotted simultaneously if all their
> > > > coordinates share the same units (the coordinate mapped on one axis of
> > the
> > > > first data variable must have the same units as the coordinate
> > mapped on
> > > > the same axis for the other data variable). This allows 2 data variables
> > > > defined on two different grids sharing the same units to be in the same
> > file
> > > > and plotted together.
> > > > - X and Y should be clearly decoupled from longitude and latitude. X
> > and Y
> > > > are the axes, longitude and latitude are the coordinates!
> > > >
> > > >
> > > > @Jonathan:
> > > > I think the whole confusion here comes from the overlapping of
> > concepts:
> > > > axes and coordinates on one hand and dimension of arrays and spatial
> > > > dimensions on the other hand. If the relevant chapters are rewritten
> > > > carefully to separate axes from coordinates and array dimensions from
> > > > spatio-temporal dimensions we are good, I think.
> > > >
> > > > @all: Reading more through the Trac tickets system, I noticed the nice
> > > > Trac ticket 117 about "multiple" time axis. This is a nice example of
> > > > mixing axes, coordinates, dimensions of arrays, the time dimension,
> > etc!
> > > >
> > > >
> > > > /Sébastien
> > > >
> > > > ----- Original Message -----
> > > > From: "Jonathan Gregory" <j.m.gregory at reading.ac.uk>
> > > > To: cf-metadata at cgd.ucar.edu
> > > > Sent: Thursday, 6 April, 2017 16:49:56
> > > > Subject: Re: [CF-metadata] axis attribute
> > > >
> > > > Dear Jim and Sebastien
> > > >
> > > > The original intention of axis was to label the independent variables
> > as 1D
> > > > xyzt axes of the data variables.  This can be deduced from other
> > > > attributes,
> > > > but it's more effort. It's partly a plotting hint, but also it's
> > because
> > > > you
> > > > might reasonably want to tell software, "give me the z-axis
> > coordinates",
> > > > or
> > > > "calculate a mean over the x-direction". The latter is often a zonal
> > mean,
> > > > but
> > > > it isn't with a rotated-pole or tripolar grid, yet the operation is
> > still
> > > > performed sometimes.
> > > >
> > > > It's useful that you've pointed out the confusion of purpose. If it
> > were
> > > > regarded as an acceptable backwards-incompatibility, which I'm nervous
> > > > about,
> > > > I'd be happy if we returned "axis" to its original purpose of
> > identifying
> > > > 1D
> > > > axes, and also for scalar coordinate variables (which are equivalent to
> > > > axes
> > > > of size one), and provided another attribute to label aux coords as
> > > > horizontal.
> > > >
> > > > I agree that if we have 1D x and y, with 2D lat and lon, the 1D
> > variables
> > > > are
> > > > the axes. That's consistent with the original purpose of the axis
> > > > attribute.
> > > >
> > > > > I also find the units of latitude and longitude confusing: it looks
> > like
> > > > it was a way to squeeze the direction of the coordinate inside the
> > units. I
> > > > have the same observation for the time coordinate that has its origin
> > in
> > > > the units!
> > > >
> > > > This convention was kept in CF for backwards-compatibility with
> > COARDS. CF
> > > > does
> > > > not use units in any other case to identify the quantity or sense.
> > > >
> > > > > It was done correctly for z coordinate using "units" and "positive",
> > > > probably because there are many types of z coordinates with various
> > origin
> > > > and directions, and no real consensus. I note however that often the
> > origin
> > > > is not always clearly defined.
> > > >
> > > > The positive attribute was also kept for backwards-compatibility with
> > > > COARDS.
> > > > It has the advantage of being useful to identify the vertical axis, but
> > > > this
> > > > can also be done with axis="Z". CF standard names provide information
> > which
> > > > indicates the sign convention.
> > > >
> > > > If coordinate_index is confusing, I think standard_names containing
> > x_index
> > > > or y_index would be OK, provided we change the existing standard names
> > > >   magnitude_of_derivative_of_position_wrt_x_coordinate_index
> > > >   magnitude_of_derivative_of_position_wrt_y_coordinate_index
> > > > to remove "_coordinate".
> > > >
> > > > Best wishes
> > > >
> > > > Jonathan
> > > > _______________________________________________
> > > > CF-metadata mailing list
> > > > CF-metadata at cgd.ucar.edu
> > > > http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> > > >
> > >
> > >
> > >
> > > --
> > > David Hassell
> > > National Centre for Atmospheric Science
> > > Department of Meteorology, University of Reading,
> > > Earley Gate, PO Box 243, Reading RG6 6BB
> > > Tel: +44 118 378 5613
> > > http://www.met.reading.ac.uk/
> >
> > ----- End forwarded message -----
> >
> 
> 
> 
> -- 
> David Hassell
> National Centre for Atmospheric Science
> Department of Meteorology, University of Reading,
> Earley Gate, PO Box 243, Reading RG6 6BB
> Tel: +44 118 378 5613
> http://www.met.reading.ac.uk/

----- End forwarded message -----