POC: Handle std names and aliases (#5257)
🚀 Pull Request
Description
This is a "proof-of-concept" PR to address how to better handle standard name aliases. It consists of the following elements:
- Reorganising `iris.std_names.py` a bit to have a separate dict for aliases (via an updated `tools/generate_std_names.py`). It now includes some table version information (mentioned in #5255) and a separate dict for the standard name descriptions (optional when generated).
- Adding a new `std_name_table.py` containing the following functions:
  - `get_convention`: return a tentative Conventions string
  - `set_alias_processing`: define how to handle aliases:
    - "keep": current behaviour, treat aliases in the same way as currently valid standard names
    - "warn": issue a warning (default), otherwise as "keep"
    - "replace": silently update aliases to current standard names
  - `get_description`: return the standard name description if available
  - `check_valid_standard_name`: check if a name is a standard name or an alias, and do the translation if requested, as defined by `set_alias_processing`
- `std_name_table` is [naively] imported in `iris.__init__`
- `common/mixin._get_valid_standard_name` is modified to use `check_valid_std_name`
No unit tests have been added (it would be good to first get some feedback on whether this POC is a reasonable approach).
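To make the three alias modes concrete, here is a minimal, self-contained sketch of how they could behave. It is not the code in this PR: the two tables are tiny hypothetical stand-ins for the generated dicts (the names in them are placeholders, not real CF entries), and the function bodies are an assumption based only on the description above.

```python
import warnings

# Hypothetical stand-ins for the generated tables; these are placeholder
# names, not real CF standard names.
STD_NAMES = {"new_example_name": {"canonical_units": "K"}}
ALIASES = {"old_example_name": "new_example_name"}

_VALID_MODES = ("keep", "warn", "replace")
_alias_mode = "warn"  # the description names "warn" as the default


def set_alias_processing(mode):
    """Select how aliased standard names are handled."""
    global _alias_mode
    if mode not in _VALID_MODES:
        raise ValueError(f"mode must be one of {_VALID_MODES}, got {mode!r}")
    _alias_mode = mode


def check_valid_standard_name(name):
    """Return the name to store, translating aliases if so configured."""
    if name is None or name in STD_NAMES:
        return name
    if name in ALIASES:
        if _alias_mode == "replace":
            # Silently update the alias to the current standard name.
            return ALIASES[name]
        if _alias_mode == "warn":
            warnings.warn(
                f"{name!r} is an alias of {ALIASES[name]!r}", stacklevel=2
            )
        # "keep" and "warn" both leave the alias untouched.
        return name
    raise ValueError(f"{name!r} is not a valid standard_name")
```

With this sketch, `set_alias_processing("replace")` followed by `check_valid_standard_name("old_example_name")` returns the replacement name, while "keep" and "warn" return the alias unchanged.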
Codecov Report
Attention: Patch coverage is 45.00000%, with 33 lines in your changes missing coverage. Please review.
Project coverage is 89.19%. Comparing base (a3931f6) to head (d2216e7). Report is 300 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| lib/iris/std_name_table.py | 35.29% | 30 Missing and 3 partials :warning: |
Additional details and impacted files
Coverage Diff

| | main | #5313 | +/- |
|---|---|---|---|
| Coverage | 89.31% | 89.19% | -0.13% |
| Files | 89 | 90 | +1 |
| Lines | 22375 | 22430 | +55 |
| Branches | 5368 | 5383 | +15 |
| Hits | 19985 | 20007 | +22 |
| Misses | 1640 | 1670 | +30 |
| Partials | 750 | 753 | +3 |
@SciTools/peloton is this still a live issue? If so, could you sign the CLA, @larsbarring?
I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then as I am not sure whether it is a reasonable approach in the context of Iris.
> I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then ...
Ok thanks!
> I am not sure whether it is a reasonable approach in the context of Iris.
Well, I guess we'll take a look and see about it. @larsbarring, are you at least clear that something like this would still be useful?
Yes, I think that this would still be useful. In the context of CF, an aliased standard name is typically regarded as deprecated. Software should be able to read data that has an aliased standard name, but new data should use the replacement name. Obviously, there are judgements to be made here regarding how to deal with this in practice. But an aliased standard name should not be considered just an alternative on the same level as a current standard name.
Also, note that there will likely be some minor changes to the standard name XML file format as of the next version, possibly also backported to all previous versions.
A proper way to handle standard name aliases would also be useful for ESMValTool, see https://github.com/ESMValGroup/ESMValCore/issues/1985.
One issue we currently face is merging cubes with different standard names, which is (in the current iris version) not even allowed if the standard names are aliases of each other.
@LisaBock this may be of interest to you.
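As a rough illustration of the merging problem, the sketch below maps every name to its current standard name before comparing, so that an alias and its replacement land in the same group. It is plain Python and does not use any iris API; the alias table and the helper names (`canonical_name`, `group_by_canonical_name`) are hypothetical, and the names in the table are placeholders rather than real CF entries.

```python
from collections import defaultdict

# Hypothetical alias table: maps a deprecated alias to its current name.
ALIASES = {"old_example_name": "new_example_name"}


def canonical_name(name):
    """Return the current standard name for an alias, else the name itself."""
    return ALIASES.get(name, name)


def group_by_canonical_name(standard_names):
    """Group item indices by canonical standard name.

    Items whose names are aliases of each other end up in the same group,
    which is the precondition a merge step would need.
    """
    groups = defaultdict(list)
    for index, name in enumerate(standard_names):
        groups[canonical_name(name)].append(index)
    return dict(groups)
```

For example, grouping `["old_example_name", "new_example_name", "other_name"]` puts the first two items together under `new_example_name` and leaves the third on its own.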
@SciTools/peloton We are planning to talk about this on the 3rd July.