Add a more general interface to manipulate importing/exporting
Exporting and reloading tree data is a common scenario. This feature aims to provide a unified interface for processing different data formats, including (but not limited to):
- JSON: as in #75, #78 and #73
- Graphviz dot format: as in plugins/export_to_dot
- YAML: to be added
Did anything come of this? Is there a way to populate a tree in treelib from JSON? Thanks.
Would a general interface not just mean a couple of:
- to_dict(), to_json(), to_graphviz(), ... instance methods,
- along with from_dict(), from_json(), from_graphviz(), ... classmethods?
Most of these methods exist already, so you'd just have to name them properly. Furthermore they should have the same signature.
Or are you thinking about putting these functions into new modules? For example:
- treelib.save.to_dict(tree, ...), treelib.save.to_json(tree, ...), treelib.save.to_graphviz(tree, ...)
- treelib.load.from_dict(dict_), treelib.load.from_json(json_file), treelib.load.from_graphviz(dot_file)
I could work on that, if you need help.
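To make the loading side concrete, here is a minimal sketch of what a from_dict/from_json-style helper might do. It assumes the nested shape {tag: {"children": [...]}} with leaves stored as bare tag strings, which is my understanding of what to_dict() emits; the names iter_edges and from_json are hypothetical and not part of treelib.

```python
import json
from typing import Any, Iterator, List, Optional, Tuple

Edge = Tuple[str, Optional[str]]


def iter_edges(nested: Any, parent: Optional[str] = None) -> Iterator[Edge]:
    """Yield (tag, parent_tag) pairs in pre-order from the assumed nested shape."""
    if isinstance(nested, str):
        # A leaf is assumed to be stored as a bare tag.
        yield nested, parent
        return
    # An inner node is assumed to be a one-key dict: {tag: {"children": [...]}}.
    for tag, body in nested.items():
        yield tag, parent
        for child in body.get("children", []):
            yield from iter_edges(child, tag)


def from_json(text: str) -> List[Edge]:
    """Hypothetical loader: parse JSON text and flatten it into edge pairs."""
    return list(iter_edges(json.loads(text)))


example = '{"Harry": {"children": [{"Bill": {"children": ["George"]}}, "Jane"]}}'
edges = from_json(example)
```

Because this shape keeps tags but not identifiers, rebuilding a treelib tree from the pairs would need unique tags or freshly generated identifiers.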
I'll drop another exporting function here, just in case somebody wants to do the same. I wanted to convert a tree into a binary tree, using the left-child-right-sibling (LCRS) method. As treelib can't distinguish between left and right children, I used the binarytree package.
from typing import Dict, Optional, Tuple

import binarytree as bt
import treelib as tl


def to_left_child_right_sibling(tree: tl.Tree) -> Tuple[bt.Node, Dict[int, str]]:
    """Converts a treelib.Tree object to a binarytree.

    The binarytree package is used for storing the new LCRS binary tree, as
    treelib trees can't distinguish between left and right children. The
    binarytree.Node class expects numeric node values (identifiers), so the
    tags/labels/names of the nodes are returned in a dictionary.
    """

    def to_lcrs(tree: tl.Tree, root_id: Optional[int] = None) -> bt.Node:
        """Recursively constructs an LCRS tree starting from the node at root_id."""
        if root_id is None:
            root_id = tree.root

        # construct a root node
        root = bt.Node(root_id)

        # if it does not have any children, we return it (recursion end)
        if not tree.children(root_id):
            return root

        # otherwise we recursively construct LCRS trees of every child ...
        sub_trees = [to_lcrs(tree, child_id) for child_id in tree[root_id].fpointer]

        # ... and link them together as right children
        for i in range(1, len(sub_trees)):
            sub_trees[i - 1].right = sub_trees[i]

        # the first LCRS tree is now the left child of our root
        root.left = sub_trees[0] if len(sub_trees) > 0 else None
        return root

    id2name = {i: node.tag for i, node in tree.nodes.items()}
    root = to_lcrs(tree)
    return root, id2name
Is there a function to load JSON data into a tree yet?
There are 3 types of information that should be stored to serialize/deserialize a tree instance: tree information, node information, nodes hierarchy.
More specifically:
- tree identifier
- node "hierarchy" (nodes bpointer/fpointers)
- node base attributes: tag, identifier
- node data (requires constraints since some objects aren't serializable: e.g. a Python set for JSON serialization)

Then in case of inheritance:
- tree node_class in case of a node class inheriting from treelib.Node
- tree other attributes in case of a tree class inheriting from treelib.Tree
- node other attributes in case of a node class inheriting from treelib.Node
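The constraint on node data can be seen with the standard json module alone. The sketch below shows a set failing to serialize and one possible workaround through the default hook of json.dumps; the helper name jsonable and the sample data are made up for illustration.

```python
import json


def jsonable(value):
    """Fallback for json.dumps: turn sets into sorted lists, reject the rest."""
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"{type(value).__name__} is not JSON serializable")


# Hypothetical node payload containing a set.
node_data = {"labels": {"b", "a"}}

try:
    json.dumps(node_data)
    plain_dump_ok = True
except TypeError:
    # json refuses sets unless a fallback is supplied.
    plain_dump_ok = False

encoded = json.dumps(node_data, default=jsonable)
```

Note that this conversion is lossy: loading the text back yields a list, so a matching deserialization rule would be needed to restore the set.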
Without going into the details of a specific output format, an approach allowing inheritance could be to have distinct methods that can be overridden:
- treelib.Tree _serialize_metadata method, serializing tree information (identifier, tree other attributes in case of inheritance)
- treelib.Tree _serialize_hierarchy method, serializing hierarchy (extracted from bpointer/fpointers)
- treelib.Node _serialize_node method, serializing node information (tag, identifier, data etc.)
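A rough sketch of how such overridable methods could fit together is shown below. SketchNode, SketchTree and NamedTree are stand-in classes written for this example, not treelib classes, and the serialize() wrapper and the parent-mapping hierarchy are assumptions about one possible layout.

```python
import json
from typing import Any, Dict, Optional


class SketchNode:
    """Stand-in for a node (not treelib.Node)."""

    def __init__(self, identifier: str, tag: str, data: Any = None):
        self.identifier = identifier
        self.tag = tag
        self.data = data

    def _serialize_node(self) -> Dict[str, Any]:
        # Node information: base attributes plus data.
        return {"identifier": self.identifier, "tag": self.tag, "data": self.data}


class SketchTree:
    """Stand-in for a tree (not treelib.Tree)."""

    def __init__(self, identifier: str):
        self.identifier = identifier
        self.nodes: Dict[str, SketchNode] = {}
        self.parents: Dict[str, Optional[str]] = {}

    def add(self, node: SketchNode, parent: Optional[str] = None) -> None:
        self.nodes[node.identifier] = node
        self.parents[node.identifier] = parent

    def _serialize_metadata(self) -> Dict[str, Any]:
        # Tree information; subclasses extend this with their own attributes.
        return {"identifier": self.identifier}

    def _serialize_hierarchy(self) -> Dict[str, Optional[str]]:
        # Hierarchy stored here as a child -> parent mapping.
        return dict(self.parents)

    def serialize(self) -> Dict[str, Any]:
        return {
            "tree": self._serialize_metadata(),
            "hierarchy": self._serialize_hierarchy(),
            "nodes": [node._serialize_node() for node in self.nodes.values()],
        }


class NamedTree(SketchTree):
    """Subclass with an extra attribute, overriding only the metadata step."""

    def __init__(self, identifier: str, owner: str):
        super().__init__(identifier)
        self.owner = owner

    def _serialize_metadata(self) -> Dict[str, Any]:
        meta = super()._serialize_metadata()
        meta["owner"] = self.owner
        return meta


tree = NamedTree("t1", owner="alice")
tree.add(SketchNode("r", "Root"))
tree.add(SketchNode("c", "Child", data=[1, 2]), parent="r")
payload = tree.serialize()
text = json.dumps(payload)
```

The subclass only touches the metadata step, so the hierarchy and node serialization stay shared with the base class.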
Note: for those not requiring a specific serialization format, consider using python pickle module: https://docs.python.org/3/library/pickle.html
I think it would be appropriate to implement https://github.com/caesar0301/treelib/issues/95 (ability to export to a stream) right away as part of the solution to this issue. @villmow are you still interested in working on that subject, or do you need help?
I didn't know the Graphviz dot format, but from what I understand, I think we shouldn't try to handle it in the same way as the YAML and JSON formats, since it is much less generic.
For json/yaml and such, we could have some kind of common _export method, whose goal would be to provide a serializable python object, and then apply either a JSON or YAML serializer.
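As a sketch of that idea combined with stream output: one function receives the already serializable object and a text stream, and the format is chosen by the serializer that is passed in. The function name export and the sample payload are assumptions for illustration; only json.dump is exercised here.

```python
import io
import json
from typing import Any, Callable, TextIO


def export(
    payload: Any,
    stream: TextIO,
    serializer: Callable[[Any, TextIO], Any] = json.dump,
) -> None:
    """Write an already serializable payload to a stream with the given serializer."""
    serializer(payload, stream)


# Hypothetical output of a common _export step.
payload = {"tree": {"identifier": "t1"}, "hierarchy": {"r": None, "c": "r"}}

buffer = io.StringIO()
export(payload, buffer)  # JSON by default
json_text = buffer.getvalue()

# With PyYAML installed, yaml.safe_dump takes (data, stream) as well,
# so it could be passed as the serializer instead of json.dump.
```

Any open file or socket wrapper with a write() method could stand in for the StringIO buffer.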
@caesar0301 before I go further and implement the json/yaml serialization with stream output, do you have an opinion on this design: https://github.com/caesar0301/treelib/pull/133
Hello, has from_json or anything similar been implemented yet? Since this request is still open, I assume not?
https://anytree.readthedocs.io/en/latest/index.html