[CF-metadata] Pre-proposal for "charset"

Bob Simons - NOAA Federal bob.simons at noaa.gov
Thu Mar 9 16:35:01 MST 2017


I'm very sorry if my comments sounded sarcastic or disrespectful. I did not
mean them that way at all. I used the word "supreme" as a sincere measure
of David Hassell's and Jonathan Gregory's estimable level of CF expertise.
I used the word "supreme" in the sense of definition #2 at
http://www.dictionary.com/browse/supreme : "of the highest quality, degree,
character, importance, etc.: supreme courage."  The point I was trying to
make when I wrote that email was based on me saying "supreme" sincerely,
not sarcastically.

I am deeply saddened by the way the discussion about my proposal turned out.
I naively thought it was a relatively simple proposal that would be a
useful addition to CF.
I apologize for not expressing myself better and for pressing too forcefully.

I have withdrawn both of my Trac tickets. Please, let's end this discussion.

I remain respectful of all that Jonathan Gregory has done for CF and the
broader community.
I have been and remain a huge supporter of CF, which has played a
tremendously positive role in this community.








On Thu, Mar 9, 2017 at 1:44 PM, Karl Taylor <taylor13 at llnl.gov> wrote:

> Dear Bob and all,
>
> I have not had time to follow this thread in detail, but a remark (in the
> most recent email) that seemed unnecessarily sarcastic caught my eye, and
> impelled me to look into what might have led to this dip in our normally
> courteous, respectful discourse.  As Chair of the CF Governance Panel, I
> feel obliged to remind everyone that the success (and fun!) of our endeavor
> depends on enthusiastic engagement, which is only discouraged if what ought
> to be earnest debate and substantive argument is disrupted (however briefly
> and unintentionally) by remarks that could be interpreted as being even
> slightly derogatory.  There are no "supreme authorities" here (although
> some are much more knowledgeable than others).  We progress by consensus,
> and only respectful contributions to the discussion can be tolerated.
>
> Enough said about that.
>
> Addressing only the part of the "pre-proposal" that suggests there is a
> need to explicitly distinguish strings from characters (not the part of the
> proposal that deals with the flavor of the 7 or 8 bit representation of
> characters):
>
> 1)  Note that the H.4 example being discussed was slightly modified (I
> think on 29th February 2016), and now includes "station_name" in the list
> of coordinates, thus explicitly linking it to the humidity and temp
> variables.  This along with the fact that station_name is *not* included as
> a dimension for these variables allows you to infer that this is a *single*
> station described by a character string of length 23, and not 23 stations
> with single character i.d.'s.
>
> 2)  If the coordinate dimension is *required* in this case (and currently
> it may not be), then software should be able to unambiguously interpret
> things.  [This requirement was suggestion 2 (of 3), made by Jonathan in one
> of his earlier comments.]
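
The inference described in points 1 and 2 can be sketched in Python. This is only an illustration under stated assumptions: the function name, its arguments, and the returned labels are hypothetical and not part of CF or of any netCDF library, and the variable layout is loosely modelled on the H.4 example as described above.

```python
def interpret_char_aux_coord(name, char_dims, data_dims, coordinates):
    """Guess how a char auxiliary coordinate variable should be read.

    All names here are hypothetical; this is not a CF or netCDF library API.
    """
    if name not in coordinates.split():
        return "unknown"  # not linked to the data variable at all
    if not char_dims:
        return "single char"  # scalar char variable
    if char_dims[-1] in data_dims:
        # The trailing dimension is shared with the data variable, so each
        # element of the data gets exactly one character.
        return "one char per element"
    leading = char_dims[:-1]  # drop the (assumed) string-length dimension
    if not leading:
        return "single string"
    if all(dim in data_dims for dim in leading):
        return "array of strings"
    return "unknown"


# Assumed layout: char station_name(name_strlen), float humidity(time),
# humidity:coordinates = "lat lon alt station_name".
h4 = interpret_char_aux_coord(
    "station_name", ("name_strlen",), ("time",), "lat lon alt station_name"
)
print(h4)
```

The sketch only works when the variable is listed in the coordinates attribute, which is why making that listing a requirement matters for software.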
>
> best regards,
> Karl
>
>
>
> On 3/7/17 11:08 AM, Bob Simons - NOAA Federal wrote:
>
> Jonathan, I believe that you place an unreasonable burden on
> general-purpose software readers of netcdf-3 files, which you expect to
> include AI-like code which completely "understands" all possible CF files,
> just so it can tell the difference between char variables meant to be
> interpreted as chars and char variables meant to be interpreted as strings
> (by collapsing the rightmost dimension). The supreme authorities, you and
> David Hassell (your own employee?!), couldn't even agree on whether H4 was
> a valid CF file. How can you then demand that software do better?
>
> It is easy/trivial for software reading a netcdf-4 file (as defined in
> NUG) to distinguish char variables and String variables. Why is it so wrong
> to ask for the same ease/clarity with netcdf-3 files?
> Part of my effort here was to start dealing with the massive rift between
> CF (which only covers netcdf-3 files) and NUG (which covers netcdf-3 and
> netcdf-4 files). Isn't that a reasonable goal?
>
> And even if you ignore the issue of distinguishing chars from strings,
> there is still no attribute in CF to specify the character set for char
> scalars and char arrays that are to be interpreted as chars.
> You can't say "_Encoding" because the default for _Encoding is "UTF-8",
> which is not a valid option for char scalars and char arrays because it may
> span multiple chars. The list of valid character sets for char scalars and
> char arrays (in netcdf-3 and netcdf-4 files) must be different from the
> list of valid _Encodings for strings. A different attribute, e.g., charset,
> is needed for chars (as opposed to strings) in netcdf-3 and netcdf-4 files.
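
The point that UTF-8 "may span multiple chars" can be checked with a short Python sketch. It uses only the standard library; the choice of the letter e-acute and of ISO-8859-1 as the single-byte comparison is an assumption made for illustration, not something the proposal specifies.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")         # needs two bytes
latin1_bytes = text.encode("iso-8859-1")  # fits in one byte

print(len(utf8_bytes), len(latin1_bytes))

# One element of a char array holds one byte. A lone byte taken from the
# UTF-8 sequence is not a decodable character on its own.
try:
    utf8_bytes[:1].decode("utf-8")
    lone_byte_ok = True
except UnicodeDecodeError:
    lone_byte_ok = False

print(lone_byte_ok)
```

A single-byte character set avoids this problem because every array element decodes independently.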
>
>
>
> On Tue, Mar 7, 2017 at 9:03 AM, Jonathan Gregory <
> j.m.gregory at reading.ac.uk> wrote:
>
>> Dear Chris
>>
>> > We need to be "clear" about what we mean by "the intent is clear". I think
>> > that much of the point of CF is to be as explicit as possible, -- i.e. the
>> > reader of a CF file should not have to know anything about how given data
>> > tends to be used in order to determine what data type an array should be
>> > (or what shape it should be).
>>
>> Yes, I agree with that. However, if you're reading a CF file, you aren't
>> just reading plain variables. If you're using/writing software which knows
>> how to interpret the file following the CF convention, it should know what
>> the "intent" is, in a CF context, of each of the variables of interest.
>> For example, you know that an auxiliary coordinate variable of char data must
>> be a vector of strings, and the trailing or only dimension is the max string
>> length. If you came across this variable when scanning all the variables in
>> a netCDF file, with no interest in CF, you wouldn't know that it was an array
>> of strings, but if you are using it as a CF aux coord var, you do know that,
>> so I don't think any further signal is needed - it would be redundant.
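
Reading a char variable as a vector of strings, with the trailing dimension as the maximum string length, can be sketched as follows. This is a minimal illustration, not the behaviour of any particular netCDF reader: the function name is hypothetical, the data is represented as plain nested lists of single bytes, and stripping NUL or space padding is an assumed convention.

```python
def chars_to_strings(rows, encoding="ascii"):
    """Collapse the trailing dimension of a 2-D char array into strings.

    `rows` is a list of rows, each a list of single-byte `bytes` objects.
    Trailing NUL or space padding is stripped (an assumed convention).
    """
    return [b"".join(row).decode(encoding).rstrip("\x00 ") for row in rows]


# Two strings stored with a maximum string length of 4.
rows = [
    [b"A", b"B", b"C", b"\x00"],
    [b"X", b"Y", b"\x00", b"\x00"],
]
names = chars_to_strings(rows)
print(names)
```

A reader with no CF context would instead keep `rows` as a 2 x 4 array of individual chars, which is the ambiguity under discussion.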
>>
>> Best wishes
>>
>> Jonathan
>>
>> ----- Forwarded message from Chris Barker <chris.barker at noaa.gov> -----
>>
>> > Date: Mon, 6 Mar 2017 11:16:35 -0800
>> > From: Chris Barker <chris.barker at noaa.gov>
>> > To: Jonathan Gregory <j.m.gregory at reading.ac.uk>
>> > CC: "cf-metadata at cgd.ucar.edu" <cf-metadata at cgd.ucar.edu>
>> > Subject: Re: [CF-metadata] Pre-proposal for "charset"
>> >
>> > On Mon, Mar 6, 2017 at 9:47 AM, Jonathan Gregory <j.m.gregory at reading.ac.uk>
>> > wrote:
>> >
>> > > Yes, we can reopen the ticket. I think the _Encoding for char is a good
>> > > idea, especially if it's an NUG convention.
>> >
>> >
>> > so let's do that part at least.
>> >
>> > > > Are there any files out in the wild that DO use ND arrays of NC_CHAR that
>> > > > are not intended to be interpreted as a (N-1)D array of Strings?
>> > >
>> > > That is the question. In particular, since this is the CF convention we're
>> > > talking about, are there any char arrays which are part of CF,
>> >
>> >
>> > indeed.
>> >
>> >
>> > > where the
>> > > intent is not clear?
>> > >
>> > We need to be "clear" about what we mean by "the intent is clear". I think
>> > that much of the point of CF is to be as explicit as possible, -- i.e. the
>> > reader of a CF file should not have to know anything about how given data
>> > tends to be used in order to determine what data type an array should be
>> > (or what shape it should be).
>> >
>> > I say this as an author of sometimes generic tools -- the tool should be
>> > able to read the file, and produce the appropriate native array for the
>> > task at hand, without knowing something like: "ahh, this is the ID of an
>> > Acme-ocean-widget -- those use char IDs -- so this must be a char" --
>> > Humans can do that -- software can't (not easily anyway!)
>> >
>> > And clearly specifying whether a char array is a char array or a string
>> > array will better unify netcdf3 and netcdf4.
>> >
>> > netcdf4 can be explicit about it -- netcdf3 can't -- so it'd be nice if CF
>> > could fill that gap.
>> >
>> > Now that I think about it, this really should be a netcdf convention --
>> > like _FillValue, but this is a CF list....
>> >
>> > -CHB
>> >
>> > --
>> >
>> > Christopher Barker, Ph.D.
>> > Oceanographer
>> >
>> > Emergency Response Division
>> > NOAA/NOS/OR&R            (206) 526-6959   voice
>> > 7600 Sand Point Way NE   (206) 526-6329   fax
>> > Seattle, WA  98115       (206) 526-6317   main reception
>> >
>> > Chris.Barker at noaa.gov
>>
>> ----- End forwarded message -----
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>
>
>
>
> --
> Sincerely,
>
> Bob Simons
> IT Specialist
> Environmental Research Division
> NOAA Southwest Fisheries Science Center
> 99 Pacific St., Suite 255A      (New!)
> Monterey, CA 93940               (New!)
> Phone: (831)333-9878            (New!)
> Fax:   (831)648-8440
> Email: bob.simons at noaa.gov
>
> The contents of this message are mine personally and
> do not necessarily reflect any position of the
> Government or the National Oceanic and Atmospheric Administration.
> <>< <>< <>< <>< <>< <>< <>< <>< <><
>
>
>
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
>


-- 
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St., Suite 255A      (New!)
Monterey, CA 93940               (New!)
Phone: (831)333-9878            (New!)
Fax:   (831)648-8440
Email: bob.simons at noaa.gov

The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><