[CF-metadata] Extension of Discrete Sampling Geometries for Simple Features

Bob Simons - NOAA Federal bob.simons at noaa.gov
Fri Feb 3 09:47:07 MST 2017


1) There is a vague comment in the proposal about possibly changing the
point featureType. Please don't, unless the changes don't affect current
uses of Point. There are already thousands of files that use it. If this new
system offers an alternative, then fine, it's an alternative. One of the
most important and useful features of a good standard is backwards
compatibility.

2) You advocate "Implement the WKT approach using a NetCDF binary array."
Is this system then an exact encoding of WKT, neither a subset nor a
superset?  "Simple Features" are often not simple.
If it is WKT (or something else), what is the standard you are following to
describe the Simple Features (e.g.,  ISO/IEC 13249-3:2016 and ISO
19162:2015)?
Does your proposal deviate in any way from the standard's capabilities?
Do you advocate following the entire WKT standard, e.g., supporting all the
feature types that WKT supports?

3) Since you are not using the WKT encoding, but creating your own, where
is the definition of the encoding system you are using?

4) This is a little out of CF scope, but:
Do you envision tools, notably, netcdf-c/java, having a writer function
that takes in WKT and encodes the information in a file, and having a
reader function that reads the file and returns WKT? Or is it your plan
that the encoding/decoding is left to the user?

5) This proposal is for "Simple Features plus Time Series" (my phrase not
yours). But aren't there lots of other uses of Simple Features? Will there
be other proposals in the future for "Simple Features plus X" and "Simple
Features plus Y"? If so, will CF eventually become a massive document where
Simple Features are defined over and over again, but in different contexts?
If so, wouldn't a better solution be to deal with Simple Features
separately (as Postgres does by making a geometric data type?), and then
add "Simple Features plus Time Series" as the first use of it?

Thanks for answering these questions.
Please forgive me if I missed parts of your proposal that answer these
questions.


On Thu, Feb 2, 2017 at 5:57 AM, <cf-metadata-request at cgd.ucar.edu> wrote:

> Date: Thu, 2 Feb 2017 07:57:36 -0600
> From: David Blodgett <dblodgett at usgs.gov>
> To: <cf-metadata at cgd.ucar.edu>
> Subject: [CF-metadata] Extension of Discrete Sampling Geometries for
>         Simple  Features
> Message-ID: <224C2828-7212-449F-8C2C-97D903F6BE1E at usgs.gov>
> Content-Type: text/plain; charset="utf-8"
>
> Dear CF Community,
>
> We are pleased to submit this proposal for your consideration and review.
> The cover letter we've prepared below provides some background and
> explanation for the proposed approach. The google doc here <
> http://goo.gl/Kq9ASq> is an excerpt of the CF specification with track
> changes turned on. Permissions for the document allow any google user to
> comment, so feel free to comment and ask questions in line.
>
> Note that I'm sharing this with you with one issue unresolved. What to do
> with the point featureType? Our draft suggests that it is part of a new
> geometry featureType, but it could be that we leave it alone and introduce
> a geometry featureType. This may be a minor point of discussion, but we
> need to be clear that this is an issue that still needs to be resolved in
> the proposal.
>
> Thank you for your time and consideration.
>
> Best Regards,
>
> David Blodgett, Tim Whiteaker, and Ben Koziol
>
> Proposed Extension to NetCDF-CF for Simple Geometries
>
> Preface
>
> The proposed addition to NetCDF-CF introduced below is inspired by a
> pre-existing data model governed by OGC and ISO as ISO 19125-1. More
> information on Simple Features may be found here. <
> https://en.wikipedia.org/wiki/Simple_Features> To the knowledge of the
> authors, it is consistent with ISO 19125-1 but has not been specified using
> the formalisms of OGC or ISO. Language used attempts to hold true to
> NetCDF-CF semantics while not conflicting with the existing standards
> baseline. While this proposal does not support the entire scope of the
> simple features ecosystem, it does support the core data types in most
> common use around the community.
>
> The other existing standard to mention is the UGRID convention <
> http://ugrid-conventions.github.io/ugrid-conventions/>. The authors have
> experience reading and writing UGRID and have designed the proposed
> structure in a way that is inspired by and consistent with it.
>
> Terms and Definitions
>
> (Taken from OGC 06-103r4 OpenGIS Implementation Specification for
> Geographic information - Simple feature access - Part 1: Common
> architecture <http://www.opengeospatial.org/standards/sfa>.)
>
> Feature: Abstraction of real world phenomena - typically a geospatial
> abstraction with associated descriptive attributes.
> Simple Feature: A feature with all geometric attributes described
> piecewise by straight line or planar interpolation between point sets.
> Geometry (geometric complex): A set of disjoint geometric primitives - one
> or more points, lines, or polygons that form the spatial representation of
> a feature.
>
> Introduction
>
> Discrete Sampling Geometries (DSGs) handle data from one (or a collection
> of) timeSeries (point), Trajectory, Profile, TrajectoryProfile or
> timeSeriesProfile geometries. Measurements are from a point (timeSeries and
> Profile) or points along a trajectory. In this proposal, we reuse the core
> DSG timeSeries type which provides support for basic time series use cases
> e.g., a timeSeries which is measured (or modeled) at a given point.
>
> Changes to Existing CF Specification
>
> In NetCDF-CF 1.7, Discrete Sampling Geometries separate dimensions and
> variables into two types - instance and element <
> http://cfconventions.org/cf-conventions/cf-conventions.html#_collections_instances_and_elements>.
> Instance refers to individual
> points, trajectories, profiles, etc. These would sometimes be referred to
> as features given that they are identified entities that can have
> associated attributes and be related to other entities. Element dimensions
> describe temporal or other dimensions to describe data on a per-instance
> basis. This proposal extends the DSG timeSeries featureType <
> http://cfconventions.org/cf-conventions/cf-conventions.html#_features_and_feature_types>
> such that the geospatial coordinates of
> the instances can be point, multi-point, line, multi-line, polygon, or
> multi-polygon geometries. Rather than overload the DSG contiguous ragged
> array encoding, designed with timeseries in mind, a geometry ragged array
> encoding is introduced in a new section 9.3.5. See this
> google doc for specific proposed changes. <http://goo.gl/Kq9ASq>
>
> Motivation
>
> DSGs have no system to define a geometry (polyline, polygon, etc., other
> than point) and an association with a time series that applies over that
> entire geometry, e.g., the expected rainfall in this watershed polygon for
> some period of time is 10 mm. As suggested in the last paragraph of section
> 9.1, current practice is to assign a representative point or just use an ID
> and forgo spatial information within a NetCDF-CF file. In order to satisfy
> a number of environmental modeling use cases, we need a way to encode a
> geometry (point, line, polygon, multi-point, multi-line, or multi-polygon)
> that is the static spatial feature representation to which one or more
> timeSeries can be associated. In this proposal, we provide an encoding to
> define collections of simple feature geometries. It interfaces cleanly with
> the existing DSG specification, enabling DSGs and Simple Geometries to be
> used concurrently.
>
> Looking Forward
>
> This proposal is a compromise solution that attempts to stay consistent with
> CF ideals and fit within the structure of the existing specification with
> minimal disruption. Line and polygon data types often require variable
> length arrays. Development of this proposal has brought to light the need
> for a general abstraction for variable length arrays in NetCDF-CF. Such a
> general abstraction would necessarily be reusable for character arrays,
> ragged arrays of time series, and ragged arrays of geometry nodes, as well
> as any other ragged data structures that may come up in the future. This
> proposal does not introduce such a general ragged array abstraction but
> does not preclude such a development in the future.
>
> Three Alternative Approaches
>
> Respecting the human readability ideal of NetCDF-CF, the development of
> this proposal started from a human readable format for geometries known as
> Well Known Text <https://en.wikipedia.org/wiki/Well-known_text>. We
> considered three high level design approaches while developing this
> proposal.
>
> 1) Direct use of Well-Known Text (WKT). In this approach, well known text
>    strings would be encoded using character arrays following a contiguous
>    ragged array approach to index the character array by geometry (or
>    instance in DSG parlance).
> 2) Implement the WKT approach using a NetCDF binary array. In this approach,
>    well known text separators (brackets, commas and spaces) for multipoint,
>    multiline, multipolygon, and polygon holes, would be encoded as break
>    type separator values like -1 for multiparts and -2 for holes.
> 3) Implement the fundamental dimensions of geometry data in NetCDF. In this
>    approach, additional dimensions and variables along those dimensions
>    would be introduced to represent geometries, geometry parts, geometry
>    nodes, and unique (potentially shared) coordinate locations for nodes to
>    reference.
>
> Selected Approach
>
> The first approach was seen as too opaque to stay true to the CF ideal of
> complete self-description. The third approach seemed needlessly verbose and
> difficult to implement. The second approach was selected for the following
> reasons:
>
> - The second approach is at least as human-readable as the third.
> - Use of break values keeps geometries relatively atomic.
> - It will be familiar to developers who know the WKT geometry format.
> - Character arrays, which are needed for options one and three, are
>   cumbersome to use in some programming languages in common use with NetCDF.
> - Break values replace the need for extraneous variables related to
>   multi-part and polygon holes (interiors). Multi-part geometries are
>   generally an exception and excessive instrumentation to support them
>   should be discounted.
>
> Example: Representation of WKT-Style Polygons in a NetCDF-3
> timeSeries featureType
>
> Below is sample CDL demonstrating how polygons are encoded in NetCDF-3
> using a contiguous ragged array-like encoding. There are three details to
> note in the example below.
>
> - The attribute contiguous_ragged_dimension, whose value is the name of a
>   dimension in the file.
> - The geom_coordinates attribute, whose value is a space-separated string
>   of variable names.
> - The cf_role values geometry_x_node and geometry_y_node.
>
> These three attributes form a system to fully describe collections of
> multi-polygon feature geometries. Any variable that has the
> contiguous_ragged_dimension attribute contains one integer per instance:
> the 0-indexed starting position of that instance's geometry along the
> dimension named by the attribute. Any variable that uses the dimension
> referenced in the contiguous_ragged_dimension attribute can be interpreted
> using the values in the variable containing the
> contiguous_ragged_dimension attribute. The variables referenced in the
> geom_coordinates attribute describe spatial coordinates of geometries.
> These variables can also be identified by the cf_roles geometry_x_node and
> geometry_y_node. Note that the example below also includes a mechanism to
> handle multi-polygon features that also contain holes.
>
> netcdf multipolygon_example {
> dimensions:
>   node = 47 ;
>   indices = 55 ;
>   instance = 3 ;
>   time = 5 ;
>   strlen = 5 ;
> variables:
>   char instance_name(instance, strlen) ;
>     instance_name:cf_role = "timeseries_id" ;
>   int coordinate_index(indices) ;
>     coordinate_index:geom_type = "multipolygon" ;
>     coordinate_index:geom_coordinates = "x y" ;
>     coordinate_index:multipart_break_value = -1 ;
>     coordinate_index:hole_break_value = -2 ;
>     coordinate_index:outer_ring_order = "anticlockwise" ;
>     coordinate_index:closure_convention = "last_node_equals_first" ;
>   int coordinate_index_start(instance) ;
>     coordinate_index_start:long_name = "index of first coordinate in each instance geometry" ;
>     coordinate_index_start:contiguous_ragged_dimension = "indices" ;
>   double x(node) ;
>     x:units = "degrees_east" ;
>     x:standard_name = "longitude" ; // or projection_x_coordinate
>     x:cf_role = "geometry_x_node" ;
>   double y(node) ;
>     y:units = "degrees_north" ;
>     y:standard_name = "latitude" ; // or projection_y_coordinate
>     y:cf_role = "geometry_y_node" ;
>   double someVariable(instance) ;
>     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
>   int time(time) ;
>     time:units = "days since 2000-01-01" ;
>   double someData(instance, time) ;
>     someData:coordinates = "time x y" ;
>     someData:featureType = "timeSeries" ;
> // global attributes:
>     :Conventions = "CF-1.8" ;
>
> data:
>
>  instance_name =
>   "flash",
>   "bang",
>   "pow" ;
>
>  coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2,
> 13, 14, 15, 16,
>     -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30,
> 31, 32, 33,
>     34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46 ;
>
>  coordinate_index_start = 0, 30, 46 ;
>
>  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
>     5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
> -30, -20, -20, -30, 30,
>     45, 10, 30, 25, 50, 30, 25 ;
>
>  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25,
> 29,
>     25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
> -20, -15, -25, -20, 20,
>     40, 40, 20, 5, 10, 15, 5 ;
>
>  someVariable = 1, 2, 3 ;
>
>  time = 1, 2, 3, 4, 5 ;
>
>  someData =
>   1, 2, 3, 4, 5,
>   1, 2, 3, 4, 5,
>   1, 2, 3, 4, 5 ;
> }
>
> How To Interpret
>
> Starting from the timeSeries variables:
>
> 1) See CF-1.8 conventions.
> 2) See the timeSeries featureType.
> 3) Find the timeseries_id cf_role.
> 4) Find the coordinates attribute of data variables.
> 5) See that the variables indicated by the coordinates attribute have a
>    cf_role geometry_x_node and geometry_y_node to determine that these are
>    geometries according to this new specification.
> 6) Find the coordinate index variable with geom_coordinates that point to
>    the nodes.
> 7) Find the variable with contiguous_ragged_dimension pointing to the
>    dimension of the coordinate index variable to determine how to index
>    into the coordinate index.
> 8) Iterate over polygons, parsing out geometries using the contiguous
>    ragged start variable and coordinate index variable to interpret the
>    coordinate data variables.
>
> Or, without reference to timeSeries:
>
> 1) See CF-1.8 conventions.
> 2) See the geom_type of multipolygon.
> 3) Find the variable with a contiguous_ragged_dimension matching the
>    coordinate index variable's dimension.
> 4) See the geom_coordinates of x y.
> 5) Using the contiguous ragged start variable found in 3 and the coordinate
>    index variable found in 2, geometries can be parsed out of the
>    coordinate index variable and parsed using the hole and break values in
>    it.
>

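For anyone following the "How To Interpret" steps in the quoted proposal, here is a minimal Python sketch of the decoding, under stated assumptions. The helper names (split_on, decode_multipolygons, to_wkt) are hypothetical and are not part of the proposal or of any netCDF library. The arrays are copied by hand from the CDL example rather than read from a file, and the break values -1 and -2 are taken from the example's multipart_break_value and hole_break_value attributes.

```python
# Illustrative sketch only: helper names are hypothetical, and the arrays are
# copied by hand from the quoted CDL example, not read from a NetCDF file.

def split_on(values, sep):
    """Split a list into sublists at every occurrence of sep."""
    groups, current = [], []
    for v in values:
        if v == sep:
            groups.append(current)
            current = []
        else:
            current.append(v)
    groups.append(current)
    return groups


def decode_multipolygons(index, starts, x, y, part_break=-1, hole_break=-2):
    """Return one multipolygon per instance as [polygon][ring][(x, y)]."""
    bounds = list(starts) + [len(index)]
    geometries = []
    for begin, end in zip(bounds[:-1], bounds[1:]):
        polygons = []
        for part in split_on(index[begin:end], part_break):
            # First ring of a part is the exterior; later rings are holes.
            rings = [[(x[i], y[i]) for i in ring]
                     for ring in split_on(part, hole_break)]
            polygons.append(rings)
        geometries.append(polygons)
    return geometries


def to_wkt(multipolygon):
    """Format one decoded multipolygon as a WKT MULTIPOLYGON string."""
    polys = []
    for rings in multipolygon:
        ring_text = ", ".join(
            "(" + ", ".join(f"{px} {py}" for px, py in ring) + ")"
            for ring in rings)
        polys.append("(" + ring_text + ")")
    return "MULTIPOLYGON (" + ", ".join(polys) + ")"


coordinate_index = [
    0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
    -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30,
    31, 32, 33, 34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46,
]
coordinate_index_start = [0, 30, 46]
x = [0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
     5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20,
     -30, -20, -20, -30, 30,
     45, 10, 30, 25, 50, 30, 25]
y = [0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25,
     29,
     25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35,
     -20, -15, -25, -20, 20,
     40, 40, 20, 5, 10, 15, 5]

geometries = decode_multipolygons(
    coordinate_index, coordinate_index_start, x, y)
for name, geom in zip(["flash", "bang", "pow"], geometries):
    print(name, to_wkt(geom))
```

Note that in the example data the boundary between instances comes only from coordinate_index_start; no break value is stored between the last ring of one instance and the first ring of the next.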


-- 
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St., Suite 255A      (New!)
Monterey, CA 93940               (New!)
Phone: (831)333-9878            (New!)
Fax:   (831)648-8440
Email: bob.simons at noaa.gov

The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><