Update to `cell_method` parsing
🚀 Pull Request
Description
Iris fails to load cubes from a netCDF file if the cell_method attribute is malformed and throws this slightly unhelpful error:
File /opt/scitools/conda/deployments/default-2024_02_20/lib/python3.11/site-packages/iris/fileformats/_nc_load_rules/helpers.py:267, in _split_cell_methods(nc_cell_methods)
265 for ii in range(len(name_start_inds) - 1):
266 method_indices.append((name_start_inds[ii], name_start_inds[ii + 1]))
--> 267 method_indices.append((name_start_inds[-1], len(nc_cell_methods)))
269 # Index the string and match against each substring
270 nc_cell_methods_matches = []
IndexError: list index out of range
For the netCDF files in my particular case, inspection of the stack trace reveals it was failing to correctly parse the cell_method attribute which was missing a space between the colon separator and time method. I.e.
cell_methods = "time:point"
rather than:
cell_methods = "time: point"
This PR makes two changes to the cell method parsing in fileformats/_nc_load_rules/helpers.py:
-
In
_split_cell_methods: If the value ofcell_methodsdoes not match the expectedname:pattern, aWarningis issued and the function returns an empty list so cube loading can continue. This is consistent with the behaviour further down the function if thename: valuepattern cannot be matched. -
Updates
_CM_PARSE_NAMEand_CM_PARSEregexes to make the pattern matching a little more lenient by making the space after the colon separator between thenameandmethodpair optional. The CF Convention states that the pairs are "blank-separated list ofname: methodpairs". It does show a space after the colon in the examples it gives, but the CF Compliance checker is happy with no space after the colon.
Unit tests have been updated to test that:
- Malformed
cell_methodraises a warning; e.g. name but no method, such ascell_method='time' - Space after colon between name and method is optional; e.g.
cell_method='time:mean'
Tested on Iris version 3.10.0.dev96.
Add any of the below labels to trigger actions on this PR:
- fixes: #6082
Modified slightly to address some failing unit tests. All tests passing locally for me now.
From @SciTools/peloton:
Related to #5165.
At the very least there needs to be an informative warning, otherwise what's the point in Iris being strict about CF.
Also, if the CF checker is OK with this, maybe we should be OK with it also? @pp-mo would love your opinion on the CF checker's place in these decisions.
It could be in this case that CF actually doesn't care about the space (why should it), but just that all the examples have one because it's a bit easier to read.
From @SciTools/peloton we are accepting this as CF-compliant.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 89.82%. Comparing base (
d3071ff) to head (49bf592). Report is 73 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #6083 +/- ##
=======================================
Coverage 89.81% 89.82%
=======================================
Files 88 88
Lines 23181 23185 +4
Branches 4313 4314 +1
=======================================
+ Hits 20821 20825 +4
Misses 1628 1628
Partials 732 732
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks @ukmo-ccbunney, changes all look good to me! I think it's worth just adding a small whatsnew entry for this, then it'll be good to go!
Done.