[CF-metadata] Storing multiple NWP model runs in a NetCDF - CF file [SEC=UNCLASSIFIED]

Timothy Hume T.Hume at bom.gov.au
Thu Jan 8 16:09:34 MST 2009


Hi Doug,

How do you handle requests for "all the data from a particular model run", or "all the 24 hour forecasts from all runs"? I notice that your reftime and leadtime variables have repeated values, which can make it more difficult to do such commonly requested extractions. This is why I chose to use two dimensions. For example, if someone requests all the 24->72 hour forecasts from all the model runs between 00Z 1 January 2009 and 00Z 5 January 2009 I use the NetCDF operator "ncks" like this:

ncks -d basetime,"2009-01-01 00:00:00".,"2009-01-05 00:00:00". -d forecast,24.,72. bigfile.nc subset.nc
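For readers without NCO to hand, the selection that command performs can be sketched in plain Python. This is only an illustration under assumptions: the coordinate values are made up (two runs per day, lead times every 6 hours), `select_range` is a hypothetical helper, and both ends of each range are treated as inclusive, as with a value-based `ncks -d` hyperslab.

```python
from datetime import datetime, timedelta


def select_range(values, lower, upper):
    """Return the indices i where lower <= values[i] <= upper."""
    return [i for i, v in enumerate(values) if lower <= v <= upper]


# Hypothetical coordinate values, not taken from any real file.
basetime = [datetime(2008, 12, 30) + timedelta(hours=12 * k) for k in range(20)]
forecast = [float(h) for h in range(0, 121, 6)]  # lead times in hours

run_idx = select_range(basetime, datetime(2009, 1, 1), datetime(2009, 1, 5))
lead_idx = select_range(forecast, 24.0, 72.0)

# basetime and forecast are separate dimensions, so the subset is the
# rectangular block of the data array spanned by run_idx and lead_idx.
print(len(run_idx), "runs x", len(lead_idx), "lead times")
```

Because each request maps to one contiguous index range per dimension, the subset can be read as a single hyperslab.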

Is there an easy way to do such extractions using your format?

Cheers,

Tim.

________________________________

From: cf-metadata-bounces at cgd.ucar.edu [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Douglas Schuster
Sent: Friday, 9 January 2009 00:41
To: Timothy Hume
Cc: John Caron; cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Storing multiple NWP model runs in a NetCDF - CF file [SEC=UNCLASSIFIED]


Hi Tim, John, 

NCAR has been using a structure similar to option 2 for TIGGE-based NetCDF files. It is not
technically CF compliant, but was reviewed by colleagues at ECMWF, the UK Met Office, and BADC
(Dec 10, 2007, ECMWF workshop) and deemed acceptable.  The structure is designed to handle multiple
ensemble forecast systems, and multiple init times with varying forecast periods.


i.e.  

        int reftime(reftime) ;
                reftime:data_type = "int" ;
                reftime:units = "hours since 1950-01-01 00:00:00" ;
                reftime:standard_name = "forecast_reference_time" ;
                reftime:long_name = "Time of model initialization" ;
        int leadtime(reftime) ;
                leadtime:data_type = "int" ;
                leadtime:units = "hours" ;
                leadtime:standard_name = "forecast_period" ;
                leadtime:long_name = "Hours since forecast_reference_time" ;
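
With this layout every record carries its own (reftime, leadtime) pair, so a request such as "all the 24 hour forecasts from all runs" can be answered by testing each record. The sketch below is an assumption about how a reader might do that, not part of any TIGGE tooling; the record values and the `matching_records` helper are hypothetical.

```python
from datetime import datetime

EPOCH = datetime(1950, 1, 1)  # matches "hours since 1950-01-01 00:00:00"


def hours_since_epoch(when):
    """Encode a datetime as whole hours since EPOCH."""
    return int((when - EPOCH).total_seconds() // 3600)


def matching_records(reftime, leadtime, ref_range, lead_range):
    """Indices of records whose reftime and leadtime both lie in range (inclusive)."""
    return [
        i
        for i, (ref, lead) in enumerate(zip(reftime, leadtime))
        if ref_range[0] <= ref <= ref_range[1]
        and lead_range[0] <= lead <= lead_range[1]
    ]


# Hypothetical records: two runs, each with lead times 0, 24 and 48 hours.
run_a = hours_since_epoch(datetime(2009, 1, 1))
run_b = hours_since_epoch(datetime(2009, 1, 2))
reftime = [run_a, run_a, run_a, run_b, run_b, run_b]
leadtime = [0, 24, 48, 0, 24, 48]

# All 24 hour forecasts from every run.
picked = matching_records(reftime, leadtime, (run_a, run_b), (24, 24))
print(picked)  # [1, 4]
```

The selected records are in general not contiguous, so they cannot always be described by a single index range along the record dimension.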

The structure was based on the proposal by Francisco J. Doblas-Reyes for the "Ensembles" project.
A thread related to this proposal can be found at:


http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2007/001470.html

A sample file is available for download at:

http://dss.ucar.edu/download/doug/data.nc

The metadata "CDL" dump of the file can be accessed at:

http://dss.ucar.edu/download/doug/data.dump

Doug


On Jan 7, 2009, at 7:19 PM, Timothy Hume wrote:


	Hi John,
	
	This seems a reasonable proposal to me. Often when storing NWP forecasts option (2) will suffice. Most centres will always include the same forecast lead times each time new NWP model data are disseminated (for example some centres will send out +6, +12, +18, +24, +36, +42, +48, +60 and +72 hour forecasts for two model runs per day, every day). Data users will then often want to extract particular forecast lead times (e.g. +48 hours). Option (2) makes this a trivial task.
	
	On the other hand option (1) is also useful, even when the time offsets are fixed. For example, one can construct an auxiliary coordinate variable called valid_time which is two dimensional. This helps when the user wants to extract all forecasts which are valid at a particular time. In fact, using both options (1) and (2) often yields the most useful files.
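
A two dimensional valid_time of this kind can be sketched as follows. This is a minimal illustration, assuming fixed lead times in hours and made-up run times; `valid_times` and `valid_at` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def valid_times(basetime, forecast_hours):
    """2D table with valid_time[i][j] = basetime[i] + forecast_hours[j]."""
    return [[b + timedelta(hours=h) for h in forecast_hours] for b in basetime]


def valid_at(valid_time, when):
    """(run index, lead index) pairs whose valid time equals `when`."""
    return [
        (i, j)
        for i, row in enumerate(valid_time)
        for j, v in enumerate(row)
        if v == when
    ]


# Hypothetical coordinates: two runs per day, lead times every 12 hours.
basetime = [datetime(2009, 1, 1) + timedelta(hours=12 * k) for k in range(4)]
forecast = [0, 12, 24, 36, 48]

valid_time = valid_times(basetime, forecast)
hits = valid_at(valid_time, datetime(2009, 1, 2, 12))
print(hits)  # [(0, 3), (1, 2), (2, 1), (3, 0)]
```

Each hit is one forecast valid at 12Z on 2 January 2009, taken from a different model run.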
	
	Tim.
	
	
	-----Original Message-----
	From: John Caron [mailto:caron at unidata.ucar.edu] 
	Sent: Thursday, 8 January 2009 12:56
	To: Karl Taylor
	Cc: Timothy Hume; cf-metadata at cgd.ucar.edu
	Subject: Re: [CF-metadata] Storing multiple NWP model runs in a NetCDF - CF file [SEC=UNCLASSIFIED]
	
	Hi Karl, Tim:
	
	I think the original wording doesn't fully anticipate multiple datetime coordinates:
	
	Section 4.4:
	" A time coordinate is identifiable from its units string alone. The Udunits routines utScan() and utIsTime() can be used to make this determination. Optionally, the time coordinate may be indicated additionally by providing the standard_name attribute with an appropriate value, and/or the axis attribute with the value T. "
	
	In Tim's example, his "forecast/valid time" is actually an offset from the base time, in units of time, not a udunit datetime. This allows it to be a 1D coordinate variable. But in the general case, if the forecast time spacing depends on the basetime, it must be 2D. We see this a lot in NCEP model output.
	
	In general, when storing multiple forecast model runs in the same file there are two options:
	
	1. A 1D "basetime/runtime" coordinate variable identified by standard name "forecast_reference_time" which holds udunit datetimes, and a 2D "forecast/valid time" auxiliary coordinate variable with standard name "time" which also holds udunit datetimes. It must be dimensioned by the runtime and the time dimensions. Data variables will need to use the "coordinates" attribute to reference the time auxiliary coordinate, as usual.
	
	2. A 1D "basetime/runtime" coordinate variable identified by standard name "forecast_reference_time" which holds udunit datetimes, and a 1D or 2D "forecast offset time" auxiliary coordinate variable identified by standard name "forecast_period" which holds a udunit time unit (e.g. hours), which is added to the basetime to get the forecast datetime. If 1D, it will be dimensioned by time; if 2D, it will be dimensioned by runtime and time, and must be referenced by the "coordinates" attribute.
	
	As an aside, the utIsTime() function is ambiguous as to whether we have a time unit or a datetime unit (which udunits calls "having an offset"). I think we need to carefully distinguish time and datetime, and specify where each is allowed.
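
To make the time versus datetime distinction concrete, here is a rough sketch that classifies a units string by whether it carries a "since <reference>" offset. It is not the udunits implementation: `classify_units` is a hypothetical helper, and the set of recognised unit spellings is a deliberately small assumption (udunits accepts many more).

```python
import re

# A units string "has an offset" (is a datetime unit) when it takes the
# form "<unit> since <reference>"; a bare "<unit>" is a plain time unit.
_SINCE = re.compile(r"^\s*(\S+)\s+since\s+(.+?)\s*$", re.IGNORECASE)

# Small, assumed subset of time unit spellings.
_TIME_UNITS = {
    "s", "sec", "secs", "second", "seconds",
    "min", "mins", "minute", "minutes",
    "h", "hr", "hrs", "hour", "hours",
    "d", "day", "days",
}


def classify_units(units):
    """Return "datetime", "time" or "other" for a units string."""
    match = _SINCE.match(units)
    if match and match.group(1).lower() in _TIME_UNITS:
        return "datetime"
    if units.strip().lower() in _TIME_UNITS:
        return "time"
    return "other"


for text in ("hours", "seconds since 1970-01-01 00:00:00.0 +0:00", "degrees_north"):
    print(text, "->", classify_units(text))
```

Under this reading, Tim's forecast coordinate ("hours") is a time and his basetime coordinate is a datetime.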
	
	If this seems reasonable, I can write up a proposal.
	
	
	Karl Taylor wrote:
	

		Hi Tim,

		Section 4 (just before section 4.1) states "The methods of identifying
		coordinate types described in this section apply both to coordinate
		variables and to auxiliary coordinate variables named by the coordinates
		attribute."  Since the axis attribute is one of the methods used to
		identify a vertical coordinate, it would seem to be allowed for the
		level in your file.

		On the other hand, in the 4th paragraph of section 5.7, it says that
		"The axis attribute is not allowed for auxiliary coordinate variables".
		This seems to contradict the earlier statement.

		We need to revise the document to be internally consistent.  Does anyone
		recall why we would want to prohibit the use of the axis attribute for
		auxiliary coordinate variables?

		cheers,
		Karl


		Timothy Hume wrote:

			Hi Karl,

			I have just checked one of my files, and get two errors:

			The first error is:

			------------------
			Checking variable: forecast
			------------------
			ERROR (4.4): Invalid units and/or reference time

			This is the error I am aware of. My file should become compliant if I
			switch the "T" axis to basetime.

			The second error is:

			------------------
			Checking variable: level
			------------------
			ERROR (4): Axis attribute is not allowed for auxillary coordinate variables.

			I was unaware that the axis attribute should not be used for scalar
			coordinate variables (as described in Section 5.7). Is this intended?
			In any case, I don't think this small error should prevent the NetCDF
			viewers I tried (IDV and Joe Sirott's web application) from working.

			Cheers,

			Tim.


			-----Original Message-----
			From: Karl Taylor [mailto:taylor13 at llnl.gov]
			Sent: Thursday, 8 January 2009 11:03
			To: Timothy Hume
			Cc: cf-metadata at cgd.ucar.edu
			Subject: Re: [CF-metadata] Storing multiple NWP model runs in a NetCDF - CF file [SEC=UNCLASSIFIED]

			Hi Tim,

			Other than the axis attribute for time, I didn't see any issues.  You
			might run the CF compliance checker on the file
			(http://cf-pcmdi.llnl.gov/conformance), but it might not run if the other
			utilities stumbled.  Maybe someone else has some ideas.

			cheers,
			Karl



			Timothy Hume wrote:

				Hi,

				I am writing NetCDF files which hold surface fields (2m temperature
				etc) from multiple NWP model runs (all the runs from the same model
				for the last month or so). The files are for operational use, so I
				want them to strictly follow the CF conventions. I am running into a
				couple of problems, where the conventions don't seem to be ideally
				suited for storing more than a single model run.

				Here is what I do:

				I have four dimensions and associated coordinate variables:

				basetime:    The base time for the model run (units: seconds since
				             1970-01-01 00:00:00.0 +0:00)
				forecast:    The forecast lead time, relative to the basetime (units:
				             hours)
				latitude
				longitude

				There is no need for a vertical dimension, because I am using surface
				fields. Nevertheless I make use of a scalar vertical coordinate
				variable as described in Section 5.7 of the CF-1.3 metadata
				conventions document.

				The use of two dimensions to store the time information (basetime and
				forecast) seems to be a natural way to store multiple NWP model runs,
				and is the standard way used in the very old NUWG conventions. As far
				as I can tell, what I am doing is supported by the CF conventions,
				provided the time axis is taken to be the basetime dimension. This is
				because Section 4.4 of the conventions specifies the units of the
				time coordinate must include a reference time. Obviously the units of
				the forecast coordinate cannot include a reference time, because the
				reference time varies, and is determined by the value of the basetime
				coordinate.

				The difficulty I am encountering is that some applications which read
				NetCDF/CF files (such as the Unidata IDV and Joe Sirott's very nice
				web based NetCDF data viewer) seem to choke on my data. I suspect
				(but am not 100% certain in the case of the IDV) that the reason is
				the way I handle the time information in my files.

				To illustrate in more detail what my files look like, I am attaching
				the CDL from an example file. The CDL is non-CF compliant, because I
				have specified the "T" axis (via the axis variable attribute) to be
				the forecast coordinate, and the forecast coordinate has invalid
				units (no reference time). I am planning on switching the "T" axis to
				the base time coordinate, which as far as I can determine should make
				the file CF compliant.

				My question is: is there a better (or more standard) way of storing
				multiple NWP model runs in a single file than what I am doing?

				Cheers,

				Tim Hume
				Centre for Australian Weather and Climate Research
				Australian Bureau of Meteorology
				Melbourne
				Australia


				--- Example CDL follows ---

				netcdf gasp_1p0deg_ocf_t2m_rtdb_opn {
				dimensions:
				   forecast = 69 ;
				   basetime = UNLIMITED ; // (100 currently)
				   latitude = 96 ;
				   longitude = 121 ;
				   bounds = 2 ;
				variables:
				   double forecast(forecast) ;
				       forecast:long_name = "Time of model forecast, relative to the basetime" ;
				       forecast:units = "hours" ;
				       forecast:standard_name = "forecast_period" ;
				       forecast:axis = "T" ;
				       forecast:bounds = "forecast_bounds" ;
				   double forecast_bounds(forecast, bounds) ;
				       forecast_bounds:long_name = "forecast interval" ;
				       forecast_bounds:units = "hours" ;
				   int basetime(basetime) ;
				       basetime:long_name = "Model basetime" ;
				       basetime:units = "seconds since 1970-01-01 00:00:00.0 +0:00" ;
				       basetime:calendar = "gregorian" ;
				       basetime:standard_name = "forecast_reference_time" ;
				   double latitude(latitude) ;
				       latitude:long_name = "latitude" ;
				       latitude:units = "degrees_north" ;
				       latitude:bounds = "latitude_bounds" ;
				       latitude:valid_min = -90. ;
				       latitude:valid_max = 90. ;
				       latitude:standard_name = "latitude" ;
				       latitude:axis = "Y" ;
				   double latitude_bounds(latitude, bounds) ;
				       latitude_bounds:long_name = "grid cell latitude boundaries" ;
				       latitude_bounds:units = "degrees_north" ;
				       latitude_bounds:valid_min = -90. ;
				       latitude_bounds:valid_max = 90. ;
				   double longitude(longitude) ;
				       longitude:long_name = "longitude" ;
				       longitude:units = "degrees_east" ;
				       longitude:bounds = "longitude_bounds" ;
				       longitude:valid_min = -360. ;
				       longitude:valid_max = 360. ;
				       longitude:standard_name = "longitude" ;
				       longitude:axis = "X" ;
				   double longitude_bounds(longitude, bounds) ;
				       longitude_bounds:long_name = "grid cell longitude boundaries" ;
				       longitude_bounds:units = "degrees_east" ;
				       longitude_bounds:valid_min = -360. ;
				       longitude_bounds:valid_max = 360. ;
				   float temperature_2m(basetime, forecast, latitude, longitude) ;
				       temperature_2m:long_name = "Air temperature 2m above the surface" ;
				       temperature_2m:units = "K" ;
				       temperature_2m:_FillValue = 9.96921e+36f ;
				       temperature_2m:missing_value = 9.96921e+36f ;
				       temperature_2m:valid_min = 180.f ;
				       temperature_2m:valid_max = 330.f ;
				       temperature_2m:standard_name = "air_temperature" ;
				       temperature_2m:cell_methods = "lat: lon: mean (area weighted)" ;
				       temperature_2m:coordinates = "level" ;
				   double level ;
				       level:long_name = "Height above the surface" ;
				       level:units = "m" ;
				       level:positive = "up" ;
				       level:standard_name = "height" ;
				       level:axis = "Z" ;

				// global attributes:
				       :Conventions = "CF-1.3" ;
				       :history = "File created by the Gridded OCF data ingest system" ;
				       :institution = "Australian Bureau of Meteorology" ;
				       :source = "model" ;
				       :title = "GASP forecasts of temperature_2m; resolution: 1.0 degree; source: rtdb" ;
				       :topography = "MSAS" ;
				}

				_______________________________________________
				CF-metadata mailing list
				CF-metadata at cgd.ucar.edu
				http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata










