networkx.rb Are Node and Edge classes necessary?

The most simplest approach that comes in mind, while thinking of representing Nodes and Edges in a Graph, is via Ruby's inbuilt Hash data structure.

xml_document_1 = Nokogiri::XML::Document.new('randomstring1')
xml_document_2 = Nokogiri::XML::Document.new('randomstring2')
complex_node_1 = Nokogiri::XML::Node.new(xml_document_1, xml_document_1)
complex_node_2 = Nokogiri::XML::Node.new(xml_document_2, xml_document_2)

hash = {}
# NetworkX functionality like
# graph.add_edge(complex_node_1, complex_node_2)
# can be represented by,
hash[complex_node_1][complex_node_2] = 1

Nov 14 '17 14:11 athityakumar

If our aim is to follow the Python NetworkX, it shouldn't be. Any object can be a node; and an edge should be able to store the relation between two nodes, which could be an Array consisting two nodes and a Hash of attributes.

However, using this approach, lookup of an edge seems to become a significant burden, since, while we know node1 and node2, we do not always know **attr (ruling out use of the .include? method).

While we could alternatively use Hash to do edges[node1][node2] = 1, that does not store **attr. Maybe we could use edges[node1][node2] = attr instead, where if there are no extra attributes, we pass an empty Hash? That way we can also delete the key-value pair if we need to remove that edge. Using this method, if one had to remove a node and all connected edges, we would have to run edges.delete(node) and then iterate through the other nodes to see if they are connected to node.

Nov 30 '17 19:11 ajeetdsouza

I just went though some of the Python code, they have used a very similar approach, except they also have a separate dictionary of Node attributes. This is an unrelated feature, and can be added later as the project grows without much change to the existing implementation.

Nov 30 '17 19:11 ajeetdsouza

@ajeetdsouza - Can you link the Python NetworkX's documentation and code for the Node attributes here?

Dec 01 '17 00:12 athityakumar

NetworkX doc for add_node()

They explicitly say A node can be any hashable Python object except None. The reason they require a hashable object is again because they are using a dictionary for implementation.

The basic graph code is found here.

Dec 01 '17 03:12 ajeetdsouza

IMO the use of classes will help in making multigraphs and digraphs by inherent properties.

Dec 01 '17 04:12 rohitner

Hmm, yes. The Node class (or rather, module) seems required for a use-case like this,

>>> K3 = nx.Graph([(0, 1), (1, 2), (2, 0)])
>>> G.add_node(K3)
>>> G.number_of_nodes()
3

So, let's have separate Node and Edge classes and include them (to access their methods) in the different Graph classes. What say, @rohitner @ajeetdsouza?

Dec 01 '17 05:12 athityakumar

That use case can be resolved separately by adding methods to retrieve a list of nodes from a graph, and then calling G.add_nodes_from(K3.get_nodes()) or something similar. Ideally, G.add_node(K3) should be able to create a Graph, where each node is a Graph itself.

Dec 01 '17 08:12 ajeetdsouza

Treating each node as graph can be expensive considering the number of methods it will be referenced to. Why not port the code first and then introduce changes?

Dec 01 '17 12:12 rohitner

You're not treating it as a graph, you're treating it as a generic object - so you can essentially add a node of any (hashable) type. In the case mentioned by @athityakumar where you pass a graph to add_node, it should be able to treat even the graph as a generic object and add a node of type Graph.

Dec 01 '17 13:12 ajeetdsouza