Bug with relative_to method
This should work:
In [1]: from datatree import DataTree
In [2]: dt = DataTree.from_dict({"a": None, "a/b": None, "c": None})
In [3]: dt
Out[3]:
DataTree('None', parent=None)
├── DataTree('a')
│ └── DataTree('b')
└── DataTree('c')
In [4]: dt["c"]
Out[4]: DataTree('c', parent="None")
In [5]: c = dt["c"]
In [6]: c.relative_to(dt)
---------------------------------------------------------------------------
TreeIsomorphismError Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 c.relative_to(dt)
File ~/Documents/Work/Code/datatree/datatree/treenode.py:554, in NamedNode.relative_to(self, other)
549 raise ValueError(
550 "Cannot find relative path because nodes do not lie within the same tree"
551 )
553 this_path = NodePath(self.path)
--> 554 if other in self.lineage:
555 return str(this_path.relative_to(other.path))
556 else:
File ~/Documents/Work/Code/datatree/datatree/mapping.py:171, in map_over_subtree.<locals>._map_over_subtree(*args, **kwargs)
167 raise TypeError("Must pass at least one tree object")
169 for other_tree in other_trees:
170 # isomorphism is transitive so this is enough to guarantee all trees are mutually isomorphic
--> 171 check_isomorphic(
172 first_tree, other_tree, require_names_equal=False, check_from_root=False
173 )
175 # Walk all trees simultaneously, applying func to all nodes that lie in same position in different trees
176 # We don't know which arguments are DataTrees so we zip all arguments together as iterables
177 # Store tuples of results in a dict because we don't yet know how many trees we need to rebuild to return
178 out_data_objects = {}
File ~/Documents/Work/Code/datatree/datatree/mapping.py:71, in check_isomorphic(a, b, require_names_equal, check_from_root)
68 diff = diff_treestructure(a, b, require_names_equal=require_names_equal)
70 if diff:
---> 71 raise TreeIsomorphismError("DataTree objects are not isomorphic:\n" + diff)
TreeIsomorphismError: DataTree objects are not isomorphic:
Number of children on node '/c' of the left object: 0
Number of children on node '/' of the right object: 2
In this case it should return "../", following unix-style filesystem path syntax.
The error is weird and I'm having trouble debugging it quickly: it should not be performing any isomorphism check for this operation - I don't know why it's jumping to that part of the code.
I suspect it's something to do with __eq__ being wrapped in ops.py to compare two trees nodewise. That's still odd though as I thought obj in iterable should perform the comparison using is, not ==.
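For reference, a minimal sketch of how membership tests behave for built-in sequences such as lists and tuples: `x in y` is equivalent to `any(x is e or x == e for e in y)`, so identity is tried first and `==` is the fallback whenever the objects are not identical. The `Loud` class below is a hypothetical stand-in for any object with an overridden `__eq__`; it is not part of datatree.

```python
class Loud:
    """Hypothetical stand-in for an object whose __eq__ is overridden."""

    eq_calls = 0  # class-level counter of how often __eq__ was invoked

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        Loud.eq_calls += 1
        raise RuntimeError("__eq__ was called")


a, b = Loud("a"), Loud("b")

# Identity short-circuits: `a` is found in the list without calling __eq__.
found_self = a in [a]
calls_after_identity_hit = Loud.eq_calls

# `b` is not `a`, so the membership test falls back to ==, which raises here.
try:
    b in [a]
    fell_back_to_eq = False
except RuntimeError:
    fell_back_to_eq = True
```

So a membership test only avoids `__eq__` when the very same object is encountered first; any non-identical element before it (or a miss) triggers the overridden comparison.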
There is also a problem with the testing: there is a test for this in the codebase, but it erroneously passes. The current test (test_treenode.py::TestPaths.test_relative_paths) tests a parent class from which DataTree inherits, rather than DataTree itself. Clearly we need to change this test (and possibly others) to test the public object instead.
I fixed a couple of small bugs relating to this (#160), but it has also revealed a bigger design problem: node in tree.lineage tries to check presence using equality, but DataTree objects are non-hashable, and __eq__ compares elementwise and produces boolean arrays, which doesn't work in this case.
I spoke about this with Stephan and Justus a while ago, and I think we came to the conclusion that it's fine to just (1) check that the roots are the same (so that the node is part of the same tree), and (2) check for presence just by looking at names / data.
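A rough sketch of what that two-step check could look like, assuming presence is decided by comparing path strings built from node names. The `Node` class and the `in_lineage` helper are hypothetical and only for illustration; they are not the datatree implementation, and comparing data is left out.

```python
from __future__ import annotations


class Node:
    """Hypothetical minimal tree node, not the datatree class."""

    def __init__(self, name: str | None = None, parent: Node | None = None):
        self.name = name
        self.parent = parent

    @property
    def lineage(self) -> tuple[Node, ...]:
        # This node first, then each ancestor up to and including the root.
        node, out = self, []
        while node is not None:
            out.append(node)
            node = node.parent
        return tuple(out)

    @property
    def root(self) -> Node:
        return self.lineage[-1]

    @property
    def path(self) -> str:
        if self.parent is None:
            return "/"
        names = [n.name for n in reversed(self.lineage[:-1])]
        return "/" + "/".join(names)


def in_lineage(node: Node, other: Node) -> bool:
    # (1) Same tree: the roots must be the very same object.
    if node.root is not other.root:
        return False
    # (2) Presence by path (names only); never evaluates `other == ancestor`.
    return other.path in [n.path for n in node.lineage]


# Same shape as the tree in the report: root with children a (holding b) and c.
root = Node()
a = Node("a", parent=root)
b = Node("b", parent=a)
c = Node("c", parent=root)
```

Because only `is` and string comparison are used, an elementwise `__eq__` on the node class never gets involved.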
I need to come back to this though.