[CF-metadata] Feedback requested on proposed CF Simple Geometries
chris.barker at noaa.gov
Tue Jan 31 10:52:52 MST 2017
A couple quick comments:
I think we're close here, so that's good. I'm not that clear on where there
are decisions left to be made, but I'll highlight two:
> Your aim is to
> describe the network alone.
> a collection of timeseries is stored as a
> data variable with a single dimension of time and a single dimension of
I don't see a conflict here -- if you can describe the network (geometry)
then you can associate data with it (UGRID uses indexes into cells, nodes,
etc.; this should be equally applicable).
> You would like to have SOMETHING alone in the file, just to
> describe the network itself. CF doesn't do this at present (domain without
Doesn't a set of coordinate variables essentially do that? i.e. you can
define a rectangular grid -- even if there is no data on it. And you can
certainly do that with UGRID, which is another standard, but I don't think
it conflicts with CF.
> Taking your previous comments into account (I'll come back to them below),
> a modified version of what I suggested before, here's a possible way to
> this case, for a small number (3) of linestrings:
That looks good to me, I think...
> SOMETHING=2, 4, 3;
> lon=0, 1, 0, -1, -2, -3, 2, 3, 4;
> lat=51, 52, 51, 50, 50, 49, 55, 55, 56;
I'm confused about what this is.
> These simple geometries can be regarded as a more complex alternative to
> bounds - each timeseries has a complicated geometry of nodes and lines, but
> logically it's still a single "cell".
> For the sake of applications which can
> read CF but don't understand simple geometries, it might be a good idea in
> addition to provide a "representative" location for each timeseries, as
> representive_lat(station) and representative_lon(station), which could for
> instance be the mean of the node coordinates for each geometry.
We do that in UGRID, too -- I think it's even required (and called
coordinates, actually). It may make little sense with complex geometries,
but it can be handy.
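
As a rough illustration of the "representative location" idea quoted above, here is a minimal Python sketch that takes the mean of the node coordinates for each geometry. It reuses the counts/lon/lat numbers from the quoted example; the function name representative_points is made up for this sketch and is not part of any convention. A plain mean also ignores problems such as geometries that straddle the dateline.

```python
# Values copied from the quoted example: 3 linestrings with 2, 4 and 3 nodes.
counts = [2, 4, 3]
lon = [0, 1, 0, -1, -2, -3, 2, 3, 4]
lat = [51, 52, 51, 50, 50, 49, 55, 55, 56]


def representative_points(counts, lon, lat):
    """Return one (mean_lon, mean_lat) pair per geometry.

    Hypothetical helper: a plain arithmetic mean of the node coordinates,
    walking the flat arrays one geometry at a time.
    """
    points = []
    start = 0
    for n in counts:
        end = start + n
        points.append((sum(lon[start:end]) / n, sum(lat[start:end]) / n))
        start = end
    return points


rep = representative_points(counts, lon, lat)
```

The result could be written out as representative_lon(station) and representative_lat(station), as suggested in the quoted text.
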
> You propose the index variable in order for the convention to be like
> > ugrid. However this still seems to me to be an unnecessary complexity and
> > use of space if you aren’t going to have many shared nodes.
> To be frank, I'm not convinced by either argument. Regarding the first, in
> example you don't reuse any points at all. Can you give an example where
> is a lot of reuse?
The stream network example would be a good one. Also, things like political
boundaries -- they tend to be complex polygons with shared vertices.
> Regarding the second, I agree that it is a nuisance and
> unreliable to have to make comparisons with tolerance between
> numbers to determine equality. However, when you write a file, I suppose
> can and would write exactly the same numbers for the coordinates of a node
> it appears several times, wouldn't you? Thus the coincidence of nodes can
> tested by *exact* equality of coordinates - no tolerance needed.
You still don't know for sure if the vertices are the SAME or if they happen
to be the same.
This is a tough one -- the "normal" GIS data model does not have shared
nodes (that I know of) so perhaps we should follow that. But this lack of
shared nodes is actually a substantial pain for GIS systems and uses --
there is a lot of complex "snapping" that needs to be done. So I'm on the
fence about this -- I'm pretty convinced shared nodes are a better model,
but if we want to interact seamlessly with other GIS formats, we may be
better off matching that data model.
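
To make the difference between the two models concrete, here is a small Python sketch under made-up data: two unit squares that share one edge. The variable names (node_x, node_y, polygons) and both helper functions are hypothetical, not taken from UGRID or from the proposal.

```python
# Hypothetical data: two unit squares sharing the edge x = 1.
node_x = [0.0, 1.0, 1.0, 0.0, 2.0, 2.0]
node_y = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]

# Shared-node model: each polygon is a list of indexes into the node arrays.
polygons = [[0, 1, 2, 3], [1, 4, 5, 2]]


def shared_nodes(a, b):
    """Nodes that two polygons share by identity (same index)."""
    return sorted(set(a) & set(b))


def expand(polygon):
    """Coordinates of a polygon, as a format without shared nodes stores them."""
    return [(node_x[i], node_y[i]) for i in polygon]


def coincident_points(a, b):
    """Points that merely have equal coordinates in two expanded polygons."""
    return sorted(set(expand(a)) & set(expand(b)))


by_index = shared_nodes(polygons[0], polygons[1])
by_value = coincident_points(polygons[0], polygons[1])
```

With indexes, the file itself says that the two squares share nodes 1 and 2. With expanded coordinates, a reader can only observe that two points have equal values, and cannot tell whether that was intended.
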
> In my example above, I assumed the polygons have no holes in them, so I've
> omitted the inside/outside information. If needed, this information could
> be an attribute e.g. SOMETHING:inout="OIIIOOOOIOO", with as many elements
> there are polygons in total. Thinking again about it, I wonder whether this
> information is really needed. If you draw all the polygons, isn't it
> which ones are inside anyway? When would you use this information?
It's not always clear. If there is a hole in a polygon, you can figure it
out, but if there is a lake in a land polygon, and an island in the lake,
then it gets pretty tricky.
I think shapefiles use clockwise vs anti-clockwise to indicate
inside-outside, but IIUC, they are pretty limited with nested polygons, too.
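
For reference, the winding direction of a ring can be computed with the shoelace (signed area) formula. The Python sketch below assumes planar coordinates with x to the right and y up, and it deliberately does not say which direction means "outside": that is up to whichever convention is chosen. The function names are made up for this example.

```python
def signed_area(ring):
    """Shoelace formula for a ring given as a list of (x, y) pairs.

    Positive for anti-clockwise rings, negative for clockwise ones,
    assuming x increases to the right and y increases upward.
    The ring is closed implicitly (last point connects to the first).
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_clockwise(ring):
    """True if the ring winds clockwise under the assumptions above."""
    return signed_area(ring) < 0


square_ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_cw = list(reversed(square_ccw))
```

Note that winding alone only distinguishes two kinds of ring; it does not say which outer ring a given hole or island belongs to, which is the nesting problem described above.
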
> My scheme avoids the use of break values, which you're not very keen on
> yourselves, it sounds like.
I don't like break values either.
> You wrote > - It is more difficult to extract a single geometry using this
> approach. It's not hard, though, and the same comment would apply to the
> contiguous ragged array representation.
Yes -- you can represent a ragged array by either specifying the
start-index of each "row", or by specifying the size of each row. CF
specifies the size of each row. I think that's a worse way to do it --
it's similar if you are looping through from the start, but much harder to
get an arbitrary row in the middle -- but I've gone with the CF way for
other stuff because it's better not to have two ways to do the same
thing. So we might as well stick with it here, too.
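
Here is a minimal Python sketch of the two ragged-array layouts, using the counts and coordinates from the example quoted earlier in this message. The helper names are invented for the sketch. With row sizes, finding row i means summing all the sizes before it; with start indexes, it is a direct lookup.

```python
from itertools import accumulate

# Values copied from the quoted example: row sizes plus flat coordinate arrays.
counts = [2, 4, 3]
lon = [0, 1, 0, -1, -2, -3, 2, 3, 4]
lat = [51, 52, 51, 50, 50, 49, 55, 55, 56]


def starts_from_counts(counts):
    """Convert row sizes to row start indexes (exclusive prefix sum)."""
    return [0] + list(accumulate(counts))[:-1]


def row_by_counts(values, counts, i):
    """Extract row i from row sizes: must sum every earlier size first."""
    start = sum(counts[:i])
    return values[start:start + counts[i]]


def row_by_starts(values, starts, i):
    """Extract row i from start indexes: no pass over earlier rows."""
    end = starts[i + 1] if i + 1 < len(starts) else len(values)
    return values[starts[i]:end]


starts = starts_from_counts(counts)
middle_lon = row_by_counts(lon, counts, 1)
last_lat = row_by_starts(lat, starts, 2)
```

A reader that needs random access can build the start indexes once from the sizes when the file is opened, so sticking with the CF count-based layout costs one pass over the counts.
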
 a netcdf format for particle tracking model output:
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov