datatree icon indicating copy to clipboard operation
datatree copied to clipboard

actually allow `DataTree` objects as values in `from_dict`

Open keewis opened this issue 3 years ago • 1 comments

The docstring of DataTree.from_dict claims that it allows having DataTree objects as values, but both the type hints and the code disagree (the code fails because DataTree(DataTree()) does not work).

The implementation I chose in this PR is to special-case DataTree objects to copy and orphan the node instead of passing it to the DataTree constructor.

Apparently this does not please mypy because while DataTree has a copy method, that is inherited from Dataset. The copy is only necessary because orphan works in-place (but then again you'd probably need copy to implement a side-effect free orphan).

  • [x] Tests added
  • [ ] Passes pre-commit run --all-files
  • [ ] New functions/methods are listed in api.rst
  • [ ] Changes are summarized in docs/source/whats-new.rst

For reference, I'm currently working around this limitation using

import posixpath

def normalize_tree_dict(mapping):
    def _normalize(mapping):
        for path, item in mapping.items():
            if not isinstance(item, datatree.DataTree):
                yield path, item
            else:
                root = item.copy()
                root.orphan()
                yield from ((posixpath.join(path, node.path), node.ds) for node in root.subtree)
            
    return dict(_normalize(mapping))

d = {"/path/to/node1": node1, "/path/to/node2": node2}
datatree.DataTree.from_dict(normalize_tree_dict(d))

keewis avatar Nov 09 '22 16:11 keewis

Thanks for catching this @keewis ! Looks like I forgot to write a unit test for this intended behaviour and so never implemented it :sweat_smile: I think your solution is neat though.

Apparently this does not please mypy because while DataTree has a copy method, that is inherited from Dataset

So we could just explicitly write out the copy method on DataTree? I think I originally had that in some branch but it caused a recursion problem, which was why I ended up just applying Dataset.copy with map_over_subtree. It would be better to have the method written out explicitly though.

TomNicholas avatar Nov 09 '22 16:11 TomNicholas