[CF-metadata] Encoding Errors on variables in CF

Bryan Lawrence b.n.lawrence at rl.ac.uk
Fri Apr 4 09:00:02 MST 2003


Hi Brian et al.

Brian said 

> ... lots of good examples ...

> Summary of proposed changes to CF:
> . (1) Add the "error_variables" attribute to link data and error variables.
> . (2) Add new standard_name values to provide descriptions of the error
>   types.
> . (3) Add the "flag_values" and "flag_meanings" attributes to describe flag
>   variables.
(I added the enumeration)

The bottom line in the rest of this is that I agree, but am a bit worried that 
there could be too many types of errors to use standard names ... and am 
tempted by a backwards link from error variables to primary variables as 
well.

ok, in more detail:

I'm sorry I didn't reply to Jonathan's Sunday message on this issue; I still 
had it in my inbox to do so ... but had I done so, I think I would have 
proposed exactly what Brian suggested in (1): that is, add error variables 
as a forward pointer from the data to the error. I think this makes the most 
sense in terms of the way one would use data from the file ... 

This is in part because I don't agree we should be making the assumption that 
the error variable has no meaning without the company of the variable to 
which it applies: one does error budget studies where one might want to 
process the errors themselves to produce statistics about the errors ...

Clearly plotting is easier too: parse a file for a variable, read its 
metadata, and get a pointer to the appropriate error, rather than reading 
all the variable metadata and finding the right one ... (not that it's a 
big overhead).
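
To illustrate the forward-pointer lookup, here is a minimal Python sketch. It 
assumes the file's variables have already been read into a plain dictionary 
mapping variable names to attribute dictionaries; that layout and the helper 
name find_error_variables are hypothetical, and error_variables is only the 
attribute proposed in (1), not an existing part of CF.

```python
def find_error_variables(variables, name):
    """Follow the proposed error_variables attribute from a data variable.

    `variables` is an assumed in-memory layout: {variable name: {attribute: value}}.
    Returns the listed error variable names that are present in the file.
    """
    attrs = variables[name]
    # The proposed attribute holds a blank-separated list of variable names.
    listed = attrs.get("error_variables", "").split()
    # Ignore names whose variables are not in this file.
    return [v for v in listed if v in variables]


# Attributes taken from Brian's no2 example.
variables = {
    "no2": {
        "standard_name": "no2_mixing_ratio",
        "error_variables": "no2_error_limit no2_detection_limit",
    },
    "no2_error_limit": {"long_name": "Nitrogen Dioxide Error Limit"},
    "no2_detection_limit": {"long_name": "Nitrogen Dioxide Detection Limit"},
}

print(find_error_variables(variables, "no2"))
```

The plotting code only has to read the attributes of the one variable it was 
asked for; it never has to scan the metadata of every variable in the file.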

However, if we now have no info in the error variable to make it clear that it 
is an error variable then there are potential problems with interpretation if 
you didn't happen to use the original variable and find out that it had an 
error associated with it. We can get around that either with Brian's 
suggestion (2) or with Jonathan's intent option, or with comments. I suspect 
that the variety of possible error types might preclude a satisfactory list 
of standard names, but I may be wrong. 

I'm still tempted by the intent option  (which is essentially a backward 
pointer, so we end up with some redundancy, but that may be no bad thing from 
the point of view of a piece of dumb software which could put things in any 
old order). We could then have the best of both worlds.

If we did that we'd have, taking Brian's example:

float no2(time) ;
  no2:standard_name = "no2_mixing_ratio" ;
  no2:long_name = "Nitrogen Dioxide Mass Mixing Ratio" ;
  no2:units = "1e-9" ;
  no2:error_variables = "no2_error_limit no2_detection_limit" ;
float no2_error_limit(time) ;
  no2_error_limit:long_name = "Nitrogen Dioxide Error Limit" ;
  no2_error_limit:units = "1e-9" ;
  no2_error_limit:comment = "Units are given in parts per billion by volume. The error limit is quoted for 2 sigma random errors plus systematic uncertainties derived from cross-sectional fits." ;
float no2_detection_limit(time) ;
  no2_detection_limit:intent = "error_budget" ;
  no2_detection_limit:long_name = "Nitrogen Dioxide Detection Limit" ;
  no2_detection_limit:units = "1e-9" ;
  no2_detection_limit:comment = "Units are given in parts per billion by volume. The detection limit is quoted for 2 standard deviations." ;

And on no2_detection_limit and no2_error_limit we could add, e.g.
  no2_error_limit:error_target = "no2" ;

OR

  no2_error_limit:intent = "error_budget" ;
(which isn't a link backward, but at least makes it clear that one could go 
looking for a variable with a forward link if one wanted to).

In both cases user software would then be aware that these variables were once 
attached to another variable, even if they no longer are (because some other 
software has extracted them into a file on their own).
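
As a rough Python sketch of how user software might apply either attribute, 
again assuming the variables are held as a dictionary of attribute 
dictionaries: the helper name error_status and its return labels are 
hypothetical, and error_target and intent are only the attributes floated in 
this message, not anything CF defines.

```python
def error_status(variables, name):
    """Classify a variable using the suggested backward-pointing attributes.

    `variables` is an assumed in-memory layout: {variable name: {attribute: value}}.
    """
    attrs = variables[name]
    target = attrs.get("error_target")
    if target is not None:
        # A backward link exists: check whether the partner is still here.
        return "attached" if target in variables else "orphaned"
    if attrs.get("intent") == "error_budget":
        # Marked as an error variable, but with no link to follow.
        return "error_unlinked"
    return "not_error"


# A file from which the no2 data variable itself has been stripped.
extracted = {
    "no2_error_limit": {"error_target": "no2"},
    "no2_detection_limit": {"intent": "error_budget"},
    "time": {"units": "days since 2003-01-01"},
}

for var in extracted:
    print(var, error_status(extracted, var))
```

With error_target the software can name the missing partner; with intent 
alone it only knows that a partner existed somewhere.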

These would allow us to use standard names for errors for the "normal" errors 
as well ... so the comments would only need to be parsed for more details 
when the standard name table wasn't extensive enough to cover the type of 
error. 

I'm totally in support of (3): using flag values and meanings as suggested by 
Jonathan.

One final point: the variables and their error variables will inevitably get 
separated in processing of these netCDF files. This shouldn't be seen as a 
problem, but the attributes at least would indicate that they once had 
partners, and that in and of itself may be of interest even if the partner 
isn't in the file one is using ...

Bryan

-- 
Bryan Lawrence, Head NCAS/British Atmospheric Data Centre
web: www.badc.nerc.ac.uk  phone: +44 1235 445012
CLRC: Rutherford Appleton Laboratory, Chilton, Didcot, OX11 0QX, UK

