Issue relating to warning message from CMOR

Open ehogan opened this issue 6 years ago • 7 comments

Good morning :)

The following warning message appears in some of our CMOR logs:

Warning: Invalid value(s) detected for variable 'vertices_latitude' (table: grids): 
1943040 values were greater than maximum valid value (90).Maximum encountered bad 
value (1e+30) was at (axis: index/value): j: 0/0 i: 0/0 vertices: 0/0

The coordinate variables in the input netCDF files all contain a _FillValue attribute, which is equal to 1.e+30f. This matches the _FillValue attribute of the data variables in the input netCDF files and the value that is provided to cmor_variable.

In CMOR, where this warning is generated, the check of whether the values are greater/lower than the maximum/minimum value is being performed assuming the missing_value is 1.e20, rather than the value provided by cmor_variable.
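
The behaviour described above can be illustrated with a minimal sketch. This is not CMOR's actual code: the function `count_out_of_range` and its arguments are hypothetical, and the tolerance used to compare against the missing value is an assumption. It only shows why a range check that assumes a missing value of 1.e20 flags entries equal to 1.e30, while a check that uses the caller-supplied missing value does not.

```python
import math


def count_out_of_range(values, valid_min, valid_max, missing_value=1.0e20):
    """Count values below valid_min / above valid_max, skipping missing values.

    Hypothetical helper for illustration only; it does not reproduce CMOR.
    """
    too_low = 0
    too_high = 0
    for v in values:
        # Skip entries that match the missing value (relative tolerance is an assumption).
        if math.isclose(v, missing_value, rel_tol=1e-6):
            continue
        if v < valid_min:
            too_low += 1
        elif v > valid_max:
            too_high += 1
    return too_low, too_high


# Two real latitudes followed by two missing entries stored as 1.e30.
vertices_latitude = [10.0, 45.0, 1.0e30, 1.0e30]

# Check that assumes missing_value is 1.e20: the 1.e30 entries are reported as invalid.
assumed_default = count_out_of_range(vertices_latitude, -90.0, 90.0)

# Check that uses the missing value supplied by the caller: nothing is reported.
caller_supplied = count_out_of_range(
    vertices_latitude, -90.0, 90.0, missing_value=1.0e30
)

print(assumed_default)  # (0, 2)
print(caller_supplied)  # (0, 0)
```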

Would it be possible to update CMOR to use the missing_value provided by cmor_variable so that these warnings aren't printed?

ehogan avatar Mar 22 '19 10:03 ehogan

Hi, why do you use _FillValue in coordinate variables, and what do you need it for? Our DKRZ-QA always flags that there should be no _FillValue in coordinates. CF defines it that way. Best regards, Fabi

wachsylon avatar Mar 26 '19 12:03 wachsylon

The CF conventions state that missing data is allowed in data variables and auxiliary coordinate variables (see http://cfconventions.org/cf-conventions/cf-conventions.html#missing-data). This warning message occurs when using data containing auxiliary coordinate variables, so it is a valid use case :)

ehogan avatar Mar 28 '19 09:03 ehogan

I have just noticed that we probably also have _FillValue when unstructured grids are used. Is this the same use case as yours?

wachsylon avatar Mar 28 '19 13:03 wachsylon

OK, I checked that. If the number of vertices of an unstructured grid varies from cell to cell, the model that I looked at just duplicates values for the unused indices of the vertices dimension.

But what is the better approach? Using _FillValue or this?

wachsylon avatar Mar 28 '19 13:03 wachsylon

@wachsylon @ehogan: I don't think CMOR3 can handle missing values in cell vertices, although that is what should be done according to the output requirements document:

If a cell has fewer than the maximum number of vertices, the remaining values should be 
set to 1.0d20, and missing_value and _FillValue attributes should be attached to both the 
latitude and longitude vertice variables and assigned the value 1.0d20. 

I suspect that the strategy of "just duplicating" values for the unused indices is a sensible approach. Any comments?

taylor13 avatar May 14 '19 23:05 taylor13

@wachsylon @ehogan We are developing specs for a remapping "weights" file where we also need to store cell vertices. Our guidance there is:

If a cell has fewer than nv_(a or b) vertices, then fill unneeded values by repeating one
 of the corners.

i.e. "duplicating values".

There was some discussion in the CF community during 2017 and 2018 on how to handle this, and folks reluctantly seemed to think defining _FillValue and missing_value was o.k., but no one was thrilled. For background, see the "grid cells with a varying number of cell bounds" thread starting here, but continuing in 2018. No one at that time came up with the "duplicating values" option. I have now suggested that alternative by raising an issue on the CF GitHub page.

Do you think it would be o.k. to modify the CMIP6 guidance quoted above with the following?

If a cell has fewer than the maximum number of vertices, the remaining values should be filled 
with the last needed value, thereby duplicating the last vertex multiple times.  Alternatively, 
a now-deprecated approach can be followed (but this is not recommended): the 
unneeded vertice locations can be set to 1.0d20, and missing_value and _FillValue attributes 
should be attached to both the latitude and longitude vertice variables and assigned the value 
1.0d20. 
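
The two padding strategies in the proposed text can be sketched as follows. This is only an illustration under stated assumptions: the helper names `pad_by_duplication` and `pad_with_fill` are hypothetical, and plain Python lists stand in for the vertices dimension of a single cell.

```python
FILL_VALUE = 1.0e20  # value named in the guidance text (1.0d20)


def pad_by_duplication(vertices, max_vertices):
    """Pad a cell's vertex list by repeating its last needed value."""
    if not vertices:
        raise ValueError("a cell needs at least one vertex to duplicate")
    return vertices + [vertices[-1]] * (max_vertices - len(vertices))


def pad_with_fill(vertices, max_vertices, fill_value=FILL_VALUE):
    """Pad a cell's vertex list with the fill value instead."""
    return vertices + [fill_value] * (max_vertices - len(vertices))


# Latitudes of a triangular cell on a grid whose cells have at most 4 vertices.
triangle_lat = [0.0, 0.0, 1.0]

print(pad_by_duplication(triangle_lat, 4))  # [0.0, 0.0, 1.0, 1.0]
print(pad_with_fill(triangle_lat, 4))       # [0.0, 0.0, 1.0, 1e+20]
```

With duplication, every stored value stays inside the valid latitude range, so a range check has nothing to flag. With the fill value, the file also needs matching missing_value and _FillValue attributes so that readers can recognise the padded entries.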

taylor13 avatar May 15 '19 21:05 taylor13

I prefer the _FillValue option. It seems to me that duplicating makes it harder for programs to check how many vertices there are for each cell. That could be relevant for any evaluation.
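
A rough sketch of the difference, with hypothetical helper names and plain lists standing in for one cell's vertices: with fill values, counting vertices only needs the latitude array, whereas with duplication a program has to compare latitude and longitude together and strip trailing repeats.

```python
def count_vertices_fill(lats, fill_value=1.0e20):
    """Count real vertices when unused slots hold the fill value."""
    return sum(1 for v in lats if v != fill_value)


def count_vertices_duplicated(lats, lons):
    """Count real vertices when unused slots repeat the last vertex.

    Trailing entries are dropped while they repeat the preceding (lat, lon) pair.
    Latitude alone is not enough, since two distinct vertices can share a latitude.
    """
    pairs = list(zip(lats, lons))
    n = len(pairs)
    while n > 1 and pairs[n - 1] == pairs[n - 2]:
        n -= 1
    return n


# The same triangular cell stored with a vertices dimension of size 4.
lats_fill = [0.0, 0.0, 1.0, 1.0e20]
lats_dup = [0.0, 0.0, 1.0, 1.0]
lons_dup = [0.0, 1.0, 0.5, 0.5]

print(count_vertices_fill(lats_fill))                 # 3
print(count_vertices_duplicated(lats_dup, lons_dup))  # 3
```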

wachsylon avatar Aug 09 '19 06:08 wachsylon

This issue is stale, closing

durack1 avatar Apr 07 '24 16:04 durack1

The issue I opened on CF recently (here) was resolved by agreeing that CF would include this:

"For grids constructed from cells that do not all have the same number of sides (e.g., some rectangular cells and some triangular cells), the cell_bounds must be dimensioned to accommodate the maximum number of cell vertices. For cells with fewer than the maximum number of vertices, the unneeded elements in cell_bounds should be assigned the _FillValue."

The output requirements document is already consistent with this. It states:

"If a cell has fewer than the maximum number of vertices, the remaining values should be set to 1.0d20, and missing_value and _FillValue attributes should be attached to both the latitude and longitude vertice variables and assigned the value 1.0d20."

So I agree that this can be closed.

taylor13 avatar Apr 08 '24 18:04 taylor13