[CF-metadata] bounds

John Caron caron at unidata.ucar.edu
Fri May 23 10:21:48 MDT 2003


Bryan Lawrence wrote:

>Hi Folks
>
>Speaking as a person who has done budget studies in the past, I think exact 
>comparisons are the way to go ... not only in the variables, but in their 
>coordinates ...
>
>  
>
>>What would be a "contrived" example where 1E-5 would not work ?  We are
>>not trying to preserve numerical results in the face of complex
>>algorithms, just trying to answer the question if the bounds are
>>continuous.
>>    
>>
>
>Ok, if my cells are in the mesosphere of an atmospheric model, particularly 
>one that includes the troposphere, then the cells are going to be pretty 
>deep, possibly a couple of orders of magnitude in difference in pressure 
>coordinates .... if I then couple on a thermospheric model, and I'm stupid 
>enough to carry on using pressure as the vertical coordinate, then my cells 
>will become many orders of magnitude apart ... I can imagine failing the 
>above equality test in that situation ... even when the cells are contiguous 
>... other coordinate systems at that altitude may have even more problems.
>
Well, Russ has convinced me that the tolerance-testing algorithm can be 
non-trivial, and I actually can't come up with a non-contrived use case 
in which the two numbers would be generated by different processes. So 
I will withdraw my objection to exact testing.

Having said that, my instincts as a numerical programmer are never to do 
exact testing on floating point, and in my heart of hearts I think the 
right thing to do is to use a good-as-possible tolerance test. Even 
when that might fail on some extreme case, the way to fix that is to 
use a CF convention which just explicitly says "non-contiguous 
boundary", so that the tolerance test is bypassed.
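
To make the trade-off concrete, here is a rough Python sketch of the 
three kinds of test under discussion. The function name, the "mode" 
argument, and the 1E-5 default are my own illustrative assumptions, not 
anything defined by CF or by an existing library. It only shows why a 
fixed absolute tolerance behaves badly when the coordinate spans many 
orders of magnitude, as in the pressure example quoted above.

```python
import math

def bounds_contiguous(bounds, mode="relative", tol=1e-5):
    """Check that each cell's upper bound meets the next cell's lower bound.

    bounds: list of (lower, upper) pairs, ordered along the axis.
    mode:   "exact", "absolute", or "relative" (hypothetical names).
    tol:    tolerance for the absolute/relative modes (illustrative default).
    """
    for (_, upper), (lower, _) in zip(bounds, bounds[1:]):
        if mode == "exact":
            ok = upper == lower
        elif mode == "absolute":
            ok = abs(upper - lower) <= tol
        elif mode == "relative":
            # abs_tol=0.0: purely relative, so a bound of exactly 0.0
            # only matches another exact 0.0.
            ok = math.isclose(upper, lower, rel_tol=tol, abs_tol=0.0)
        else:
            raise ValueError("unknown mode: " + mode)
        if not ok:
            return False
    return True

# Pressure-like bounds (made-up values). The shared edge differs only
# by rounding-sized noise relative to its magnitude.
rounded = [(100000.0, 1000.0), (1000.0001, 0.1)]
print(bounds_contiguous(rounded, "relative"))  # True
print(bounds_contiguous(rounded, "absolute"))  # False
print(bounds_contiguous(rounded, "exact"))     # False

# Very small values with a real gap (the edge differs by a factor of 2).
gap = [(1e-3, 1e-6), (2e-6, 1e-9)]
print(bounds_contiguous(gap, "relative"))      # False
print(bounds_contiguous(gap, "absolute"))      # True
```

The absolute test gets both cases wrong: it rejects the large-valued 
edge that differs only by noise, and it accepts the small-valued edge 
that has a genuine gap. The relative test handles both, but still needs 
special care near zero, which is part of why a good test is non-trivial.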

>
>As a personal aside:
>
>  
>
>>Someone has tried to create a CF netcdf file and bungled it, by not 
>>getting the coordinate bounds correct. So I am going to use NcML to fix 
>>the problem, by adding some of the values in "by hand".
>>    
>>
>
>I know you are using ncml to generate catalogues, and so I can see the 
>incentive for *fixing* the ncml associated with files; however, if folk are 
>using the ncml to instantiate code for doing things with the data based on 
>the ncml, I think it would be dangerous to rely on "by-hand-fixes" in
>the markup that are not representative of the actual data files ...
>
>.... having said that, I know about real users and what they do ...
>  
>
What we are seeing a lot of in THREDDS are files that are meant to be 
COARDS compliant but contain some mistake. Or they have no coordinate 
variables or Conventions at all, but those can be found and added easily 
enough. The "NcML Dataset" is a way to fix those mistakes or add the 
missing info. It's true that someone trying to fix the problem might also 
make some mistake, but if the whole thing is transparent, someone should 
be able to find the problem and collaboratively fix it. At least that's 
the hope.

I would expect NcML Dataset to be used mostly by data providers 
themselves in their data servers and catalogs, who are making their 
archives available and want to fix their past sins. We are experimenting 
with retrofitting datasets into COARDS and CF Conventions, for example.

I hope to have some decent web pages up soon that will explain the 
basics, in case anyone wants more info. Note that this part of NcML is 
still alpha, although we hope to call it beta soon.


