Are Node and Edge classes necessary?
The most simplest approach that comes in mind, while thinking of representing Nodes and Edges in a Graph, is via Ruby's inbuilt Hash data structure.
xml_document_1 = Nokogiri::XML::Document.new('randomstring1')
xml_document_2 = Nokogiri::XML::Document.new('randomstring2')
complex_node_1 = Nokogiri::XML::Node.new(xml_document_1, xml_document_1)
complex_node_2 = Nokogiri::XML::Node.new(xml_document_2, xml_document_2)
hash = {}
# NetworkX functionality like
# graph.add_edge(complex_node_1, complex_node_2)
# can be represented by,
hash[complex_node_1][complex_node_2] = 1
If our aim is to follow the Python NetworkX, it shouldn't be. Any object can be a node; and an edge should be able to store the relation between two nodes, which could be an Array consisting two nodes and a Hash of attributes.
However, using this approach, lookup of an edge seems to become a significant burden, since, while we know node1 and node2, we do not always know **attr (ruling out use of the .include? method).
While we could alternatively use Hash to do edges[node1][node2] = 1, that does not store **attr. Maybe we could use edges[node1][node2] = attr instead, where if there are no extra attributes, we pass an empty Hash? That way we can also delete the key-value pair if we need to remove that edge.
Using this method, if one had to remove a node and all connected edges, we would have to run edges.delete(node) and then iterate through the other nodes to see if they are connected to node.
I just went though some of the Python code, they have used a very similar approach, except they also have a separate dictionary of Node attributes. This is an unrelated feature, and can be added later as the project grows without much change to the existing implementation.
@ajeetdsouza - Can you link the Python NetworkX's documentation and code for the Node attributes here?
They explicitly say A node can be any hashable Python object except None. The reason they require a hashable object is again because they are using a dictionary for implementation.
The basic graph code is found here.
IMO the use of classes will help in making multigraphs and digraphs by inherent properties.
Hmm, yes. The Node class (or rather, module) seems required for a use-case like this,
>>> K3 = nx.Graph([(0, 1), (1, 2), (2, 0)])
>>> G.add_node(K3)
>>> G.number_of_nodes()
3
So, let's have separate Node and Edge classes and include them (to access their methods) in the different Graph classes. What say, @rohitner @ajeetdsouza?
That use case can be resolved separately by adding methods to retrieve a list of nodes from a graph, and then calling G.add_nodes_from(K3.get_nodes()) or something similar. Ideally, G.add_node(K3) should be able to create a Graph, where each node is a Graph itself.
Treating each node as graph can be expensive considering the number of methods it will be referenced to. Why not port the code first and then introduce changes?
You're not treating it as a graph, you're treating it as a generic object - so you can essentially add a node of any (hashable) type. In the case mentioned by @athityakumar where you pass a graph to add_node, it should be able to treat even the graph as a generic object and add a node of type Graph.