
POC: Handle std names and aliases (#5257)

Open larsbarring opened this issue 2 years ago • 9 comments

🚀 Pull Request

Description

This is a "proof-of-concept" PR to address how to better handle standard name aliases. It consists of the following elements:

  • Reorganising iris.std_names.py a bit so that aliases live in a separate dict (via an updated tools/generate_std_names.py). The generated file now also includes some table version information (mentioned in #5255) and a separate dict for the standard name descriptions (optional when generated).
  • Adding a new std_name_table.py containing the following functions:
    • get_convention -- return a tentative Conventions string
    • set_alias_processing -- define how to handle aliases:
      • "keep" - current behaviour: treat aliases in the same way as currently valid standard names
      • "warn" - issue a warning, otherwise as "keep" (default)
      • "replace" - silently update aliases to current standard names
    • get_description -- return the standard name description if available
    • check_valid_standard_name -- check if a name is a standard name or an alias, and do the translation if requested as defined by set_alias_processing
  • std_name_table is [naively] imported in iris.__init__
  • common/mixin._get_valid_standard_name is modified to use check_valid_std_name
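To make the three alias modes concrete, here is a minimal standalone sketch of the behaviour described in the list above. It is not the code in this PR: the two tables are tiny illustrative stand-ins for the generated dicts, the alias pair is used only as an example, and the function bodies are assumptions about how "keep", "warn" and "replace" could behave.

```python
import warnings

# Illustrative stand-ins for the generated tables (assumption: the real
# dicts would be produced by tools/generate_std_names.py).
STD_NAMES = {"sea_surface_height_above_geoid": {"canonical_units": "m"}}
ALIASES = {"sea_surface_height_above_sea_level": "sea_surface_height_above_geoid"}

_MODES = ("keep", "warn", "replace")
_alias_processing = "warn"  # "warn" is the default named in the description


def set_alias_processing(mode):
    """Select how aliased standard names are handled."""
    global _alias_processing
    if mode not in _MODES:
        raise ValueError(f"mode must be one of {_MODES}, got {mode!r}")
    _alias_processing = mode


def check_valid_standard_name(name):
    """Return the name to use, translating aliases if so configured."""
    if name in STD_NAMES:
        return name
    if name in ALIASES:
        current = ALIASES[name]
        if _alias_processing == "replace":
            # Silently update the alias to the current standard name.
            return current
        if _alias_processing == "warn":
            warnings.warn(
                f"{name!r} is an alias of {current!r}", UserWarning, stacklevel=2
            )
        # Both "keep" and "warn" leave the alias unchanged.
        return name
    raise ValueError(f"{name!r} is not a valid standard name")
```

With this sketch, calling set_alias_processing("replace") makes check_valid_standard_name return the current name for an alias, while "keep" and "warn" return the alias itself.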

No unit tests have been added (it would be good to first get some feedback on whether this POC is a reasonable approach ...)



larsbarring avatar May 11 '23 10:05 larsbarring

Codecov Report

Attention: Patch coverage is 45.00000% with 33 lines in your changes missing coverage. Please review.

Project coverage is 89.19%. Comparing base (a3931f6) to head (d2216e7). Report is 300 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| lib/iris/std_name_table.py | 35.29% | 30 Missing and 3 partials :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5313      +/-   ##
==========================================
- Coverage   89.31%   89.19%   -0.13%     
==========================================
  Files          89       90       +1     
  Lines       22375    22430      +55     
  Branches     5368     5383      +15     
==========================================
+ Hits        19985    20007      +22     
- Misses       1640     1670      +30     
- Partials      750      753       +3     


codecov[bot] avatar May 11 '23 10:05 codecov[bot]

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Apr 08 '24 16:04 CLAassistant

@SciTools/peloton is this still a live issue, and if so could you sign the CLA @larsbarring?

pp-mo avatar May 01 '24 10:05 pp-mo

I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then as I am not sure whether it is a reasonable approach in the context of Iris.

larsbarring avatar May 01 '24 10:05 larsbarring

> I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then ...

Ok thanks!

> I am not sure whether it is a reasonable approach in the context of Iris.

Well, I guess we'll take a look and see about it. @larsbarring, are you at least clear that something like this would still be useful?

pp-mo avatar May 01 '24 14:05 pp-mo

Yes, I think that this would still be useful. In the context of CF an aliased standard name is typically regarded as deprecated. Software should be able to read data having an aliased standard name, but new data should use the replacement name. Obviously, there are judgements to be made here regarding how to deal with this in practice. But an aliased standard name should not be considered just an alternative at the same level as a standard name.

Also, note that there will likely be some minor changes to the standard name xml file format as of next version, possibly also backported to all previous versions.
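For readers unfamiliar with the table, here is a rough sketch of splitting it into separate dicts for entries, aliases and descriptions. The XML below is a hand-written sample that follows my understanding of the current layout (entry and alias elements keyed by id); treat the element names as an assumption, especially given the pending format changes. parse_table is a hypothetical helper, not part of Iris or of tools/generate_std_names.py.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the element names are assumed from the current
# table layout and the content is illustrative only.
SAMPLE_XML = """<standard_name_table>
  <version_number>0</version_number>
  <entry id="sea_surface_height_above_geoid">
    <canonical_units>m</canonical_units>
    <description>Illustrative description text.</description>
  </entry>
  <alias id="sea_surface_height_above_sea_level">
    <entry_id>sea_surface_height_above_geoid</entry_id>
  </alias>
</standard_name_table>
"""


def parse_table(xml_text):
    """Split a standard name table into version, names, aliases, descriptions."""
    root = ET.fromstring(xml_text)
    version = root.findtext("version_number")

    std_names = {}
    descriptions = {}
    for entry in root.iter("entry"):
        name = entry.get("id")
        std_names[name] = {"canonical_units": entry.findtext("canonical_units")}
        descriptions[name] = entry.findtext("description", default="")

    # Each alias maps a deprecated name to its current replacement.
    aliases = {
        alias.get("id"): alias.findtext("entry_id") for alias in root.iter("alias")
    }
    return version, std_names, aliases, descriptions


version, std_names, aliases, descriptions = parse_table(SAMPLE_XML)
```

Keeping aliases in their own dict is what lets the deprecated names be told apart from current ones later on.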

larsbarring avatar May 01 '24 16:05 larsbarring

A proper way to handle standard name aliases would also be useful for ESMValTool, see https://github.com/ESMValGroup/ESMValCore/issues/1985.

One issue we currently face is merging cubes with different standard names, which is (in the current iris version) not even allowed if the standard names are aliases of each other.
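As a rough illustration of the kind of workaround this implies, the sketch below maps names to their current standard name before grouping, so that items whose names are aliases of each other end up together. Plain dicts stand in for cubes here; this is not the iris merge API, and ALIASES, canonical_name and group_by_canonical_name are hypothetical names with an illustrative alias pair.

```python
from collections import defaultdict

# Illustrative alias table (assumption: in practice this would come from
# the CF standard name table).
ALIASES = {"sea_surface_height_above_sea_level": "sea_surface_height_above_geoid"}


def canonical_name(name):
    """Return the current standard name for an alias, else the name itself."""
    return ALIASES.get(name, name)


def group_by_canonical_name(records):
    """Group dict records (stand-ins for cubes) by canonical standard name."""
    groups = defaultdict(list)
    for record in records:
        groups[canonical_name(record["standard_name"])].append(record)
    return dict(groups)


records = [
    {"standard_name": "sea_surface_height_above_sea_level", "source": "model_a"},
    {"standard_name": "sea_surface_height_above_geoid", "source": "model_b"},
    {"standard_name": "air_temperature", "source": "model_a"},
]
groups = group_by_canonical_name(records)
```

Here the first two records land in the same group even though their names differ, which is the behaviour we would like to get when merging.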

schlunma avatar May 02 '24 15:05 schlunma

@LisaBock this may be of interest to you.

bouweandela avatar Jun 06 '24 09:06 bouweandela

@SciTools/peloton We are planning to talk about this on the 3rd July.

HGWright avatar Jun 12 '24 09:06 HGWright