Functions returning large lists, like rustworkx.all_simple_paths, should have counterparts that only return list sizes

Open yurivict opened this issue 2 years ago • 1 comment

Information

rustworkx version: 0.13.0
Python version: 3.9
Rust version: 1.71.0
Operating system: FreeBSD 13.2

What is the current behavior?

In many situations where a function returns a large list, there are also use cases that need only the size of the list and not the list itself. For example, rustworkx.all_simple_paths() could have a counterpart, rustworkx.count_all_simple_paths(), that would return only the number of paths.

Or, even better, you could expose a function that lets users decide what to do with each returned value. For example, there could be a function:

def list_all_simple_paths(graph, from_, to, fn, min_depth=None, cutoff=None)

which would call fn (a callback, for example a lambda) for each new path that it discovers.

In C++ notation, such a function could look like this:

void listAllSimplePaths(
           Graph &graph, NodeId from, NodeId to,
           std::function<void(const Path &path)> fn,
           unsigned min_depth = 0, unsigned cutoff = 0
);

It is sufficient to provide just a listing function like this; the user can then decide what to do with each path: either append it to a list or simply count it.
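To illustrate the idea, here is a minimal pure-Python sketch of such a callback-based listing function. It is not rustworkx code: the adjacency-dict graph representation, the helper names, and the reading of min_depth and cutoff as bounds on the number of nodes in a path are all assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, List, Optional

# Assumed graph representation for this sketch: node -> list of successors.
Adjacency = Dict[Hashable, List[Hashable]]


def list_all_simple_paths(
    adj: Adjacency,
    from_: Hashable,
    to: Hashable,
    fn: Callable[[List[Hashable]], None],
    min_depth: Optional[int] = None,
    cutoff: Optional[int] = None,
) -> None:
    """Call fn(path) once for every simple path from from_ to to.

    Hypothetical sketch: min_depth and cutoff are treated as lower and
    upper bounds on the number of nodes in a reported path.
    """
    path = [from_]      # nodes on the current DFS branch, in order
    on_path = {from_}   # same nodes as a set, for O(1) membership tests

    def visit(node: Hashable) -> None:
        # Extending the path would exceed the cutoff, so stop here.
        if cutoff is not None and len(path) >= cutoff:
            return
        for nxt in adj.get(node, ()):
            if nxt in on_path:
                continue  # revisiting a node would make the path non-simple
            path.append(nxt)
            on_path.add(nxt)
            if nxt == to:
                if min_depth is None or len(path) >= min_depth:
                    fn(list(path))  # hand the callback its own copy
            else:
                visit(nxt)
            path.pop()
            on_path.remove(nxt)

    visit(from_)


adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}

# Use 1: only count the paths; no list of paths is ever built.
tally = {"n": 0}

def count_path(_path: List[Hashable]) -> None:
    tally["n"] += 1

list_all_simple_paths(adj, 0, 3, count_path)

# Use 2: collect the paths, which reproduces the list-returning behavior.
collected: List[List[Hashable]] = []
list_all_simple_paths(adj, 0, 3, collected.append)
```

With this shape, counting and collecting differ only in the callback that is passed in, so a single listing function covers both needs.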

What is the expected behavior?

No CPU time should be wasted filling a list that is never used.

Steps to reproduce the problem

See above.

Aug 01 '23 20:08 yurivict

In one of my use cases, I need the length distribution of simple paths. I think another way to do this is to return a Python generator instead of a list. The current behavior could then be reproduced with result = list(all_simple_paths(graph)), but you could also iterate over the paths without storing all of them; the full list grows extremely large extremely quickly.

It seems that NetworkX already does this: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.simple_paths.all_simple_paths.html
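A rough sketch of the generator idea in plain Python, to show how a length distribution can be computed without holding all paths in memory. The function name iter_simple_paths and the adjacency-dict graph are hypothetical and are not part of rustworkx or NetworkX; path length here means the number of nodes in the path.

```python
from collections import Counter
from typing import Dict, Hashable, Iterator, List


def iter_simple_paths(
    adj: Dict[Hashable, List[Hashable]],
    source: Hashable,
    target: Hashable,
) -> Iterator[List[Hashable]]:
    """Lazily yield every simple path from source to target (hypothetical helper)."""
    path = [source]
    on_path = {source}

    def walk(node: Hashable) -> Iterator[List[Hashable]]:
        for nxt in adj.get(node, ()):
            if nxt in on_path:
                continue  # keep the path simple: no repeated nodes
            path.append(nxt)
            on_path.add(nxt)
            if nxt == target:
                yield list(path)  # yield a copy; the working path is reused
            else:
                yield from walk(nxt)
            path.pop()
            on_path.remove(nxt)

    return walk(source)


adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}

# Length distribution, consuming one path at a time.
lengths = Counter(len(p) for p in iter_simple_paths(adj, 0, 3))

# The list-returning behavior is still one call away.
all_paths = list(iter_simple_paths(adj, 0, 3))
```

Only the current path and the counter are alive at any moment in the first use, so memory stays proportional to the longest path rather than to the number of paths.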

Aug 21 '23 10:08 pablolh