Make illegal path-like variable names when constructing a DataTree from a Dataset

Open • etienneschalk opened this issue 1 year ago • 1 comment

  • [x] Closes #311
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • ~~[ ] New functions/methods are listed in api.rst~~
  • [ ] Changes are summarized in docs/source/whats-new.rst

Technical Note

Regarding Hashable vs str Dataset keys

Note: DataTree keys are Hashable. I only check for slashes in variable names that are instances of str. I have not (yet) encountered a case where Dataset keys are a Hashable other than str. We can imagine corner cases where keys are other kinds of Hashable, e.g. Path from pathlib:

In [2]: from pathlib import Path

In [3]: hash(Path("/"))
Out[3]: -3809984204556177651

The choice I made is (1): only check a key for slashes if it is an instance of str. Another choice, (2), would be to project the Hashable space onto str space with str(variable_name). (1) seems more conservative than (2), as I do not pretend to be able to get a string representation for any Hashable.
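To make the difference between (1) and (2) concrete, here is a minimal sketch in plain Python. The helper names `find_path_like_names` and `find_path_like_names_via_str` are hypothetical, chosen for illustration only; they are not the functions added by this PR and not part of the datatree or xarray API.

```python
from collections.abc import Hashable, Iterable
from pathlib import PurePosixPath


def find_path_like_names(names: Iterable[Hashable]) -> list[str]:
    """Option (1): only inspect keys that are already str (hypothetical helper)."""
    return [name for name in names if isinstance(name, str) and "/" in name]


def find_path_like_names_via_str(names: Iterable[Hashable]) -> list[Hashable]:
    """Option (2): project every key onto str before checking (hypothetical helper)."""
    return [name for name in names if "/" in str(name)]


# Illustrative variable names: two str keys, one path object, one int.
names = ["temperature", "group/pressure", PurePosixPath("/"), 42]

# Option (1) ignores the non-str keys entirely.
print(find_path_like_names(names))

# Option (2) also flags the path object, because str(PurePosixPath("/")) is "/".
print(find_path_like_names_via_str(names))
```

Under (1), the path object passes through unchecked; under (2), it is flagged, but only because its `str()` form happens to contain a slash, which is the assumption (1) avoids making.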

etienneschalk, Feb 17 '24 12:02

@etienneschalk sorry for missing this - do you want to re-submit this PR to xarray upstream?

TomNicholas, Aug 13 '24 15:08