iris icon indicating copy to clipboard operation
iris copied to clipboard

Update to `cell_method` parsing

Open ukmo-ccbunney opened this issue 1 year ago • 2 comments

🚀 Pull Request

Description

Iris fails to load cubes from a netCDF file if the cell_method attribute is malformed and throws this slightly unhelpful error:

File /opt/scitools/conda/deployments/default-2024_02_20/lib/python3.11/site-packages/iris/fileformats/_nc_load_rules/helpers.py:267, in _split_cell_methods(nc_cell_methods)
    265 for ii in range(len(name_start_inds) - 1):
    266     method_indices.append((name_start_inds[ii], name_start_inds[ii + 1]))
--> 267 method_indices.append((name_start_inds[-1], len(nc_cell_methods)))
    269 # Index the string and match against each substring
    270 nc_cell_methods_matches = []

IndexError: list index out of range

For the netCDF files in my particular case, inspection of the stack trace reveals it was failing to correctly parse the cell_method attribute which was missing a space between the colon separator and time method. I.e.

cell_methods = "time:point"

rather than:

cell_methods = "time: point"

This PR makes two changes to the cell method parsing in fileformats/_nc_load_rules/helpers.py:

  1. In _split_cell_methods: If the value of cell_methods does not match the expected name: pattern, a Warning is issued and the function returns an empty list so cube loading can continue. This is consistent with the behaviour further down the function if the name: value pattern cannot be matched.

  2. Updates _CM_PARSE_NAME and _CM_PARSE regexes to make the pattern matching a little more lenient by making the space after the colon separator between the name and method pair optional. The CF Convention states that the pairs are "blank-separated list of name: method pairs". It does show a space after the colon in the examples it gives, but the CF Compliance checker is happy with no space after the colon.

Unit tests have been updated to test that:

  1. Malformed cell_method raises a warning; e.g. name but no method, such as cell_method='time'
  2. Space after colon between name and method is optional; e.g. cell_method='time:mean'

Tested on Iris version 3.10.0.dev96.

Add any of the below labels to trigger actions on this PR:

  • fixes: #6082

ukmo-ccbunney avatar Jul 24 '24 14:07 ukmo-ccbunney

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Jul 24 '24 14:07 CLAassistant

Modified slightly to address some failing unit tests. All tests passing locally for me now.

ukmo-ccbunney avatar Jul 29 '24 16:07 ukmo-ccbunney

From @SciTools/peloton:

Related to #5165.

At the very least there needs to be an informative warning, otherwise what's the point in Iris being strict about CF.

Also, if the CF checker is OK with this, maybe we should be OK with it also? @pp-mo would love your opinion on the CF checker's place in these decisions.

It could be in this case that CF actually doesn't care about the space (why should it), but just that all the examples have one because it's a bit easier to read.

trexfeathers avatar Sep 11 '24 09:09 trexfeathers

From @SciTools/peloton we are accepting this as CF-compliant.

HGWright avatar Oct 09 '24 09:10 HGWright

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 89.82%. Comparing base (d3071ff) to head (49bf592). Report is 73 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6083   +/-   ##
=======================================
  Coverage   89.81%   89.82%           
=======================================
  Files          88       88           
  Lines       23181    23185    +4     
  Branches     4313     4314    +1     
=======================================
+ Hits        20821    20825    +4     
  Misses       1628     1628           
  Partials      732      732           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Oct 09 '24 09:10 codecov[bot]

Thanks @ukmo-ccbunney, changes all look good to me! I think it's worth just adding a small whatsnew entry for this, then it'll be good to go!

Done.

ukmo-ccbunney avatar Oct 18 '24 10:10 ukmo-ccbunney