pathpyG icon indicating copy to clipboard operation
pathpyG copied to clipboard

Migration Issues Random Walks

Open M-Lampert opened this issue 1 year ago • 1 comments

As far as I am aware, we reused the PathPy3 code for the implementation of the random walks. I noticed the following problems that need to be fixed:

  • [x] Docstrings in the wrong format (fixed in #189)
  • [x] Bug in transition_probabilities(...) (fixed in #189)
  • [ ] Documentation out of date. E.g. the example uses rw.plot(...) which does not work in pathpyG as far as I know.
  • [ ] Bug in HigherOrderRandomWalk.first_order_stationary_state(...): v.relations[-1] throws an exception since each node is an id (tuple or string) and not a node object as in pathpy3.
  • [ ] TODOs in process.py: More than half of the file is commented out and marked as TODO.

There are probably some more problems that I haven't noticed so far, so I propose some additional unit-tests for each file in processes/:

  • [ ] Tests for process.py
  • [ ] Additional tests for random_walk.py
  • [ ] Deprecate sampling.py (see #217 )

M-Lampert avatar Jul 01 '24 13:07 M-Lampert

There is a PyTorch-Geometric-based random walk implementation: https://github.com/rusty1s/pytorch_cluster?tab=readme-ov-file#randomwalk-sampling Current implementation:

>>> import pathpyG as pp
>>> g = pp.Graph.from_edge_list([['a','b'], ['b','c'], ['c','a'], ['c','d'], ['d','a']])
>>> rw = pp.processes.RandomWalk(g)
>>> data = rw.run_experiment(steps=100, runs=100)
>>> %timeit rw.run_experiment(steps=100, runs=100)
123 ms ± 3.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PyG-based random walk:

>>> import torch
>>> from torch_cluster import random_walk

>>> seeds = torch.randint(0, g.N, (100,))
>>> row, col = g.data.edge_index
>>> %timeit random_walk(row, col, seeds, walk_length=100)
161 µs ± 6.58 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

It might make sense to use this implementation instead. Note that this implementation does not allow a restart probability but instead can be a biased random walk with p and q.

M-Lampert avatar Nov 04 '24 09:11 M-Lampert

With #299, we are now planning to make a full redesign of this module, making this issue obsolete. Existing problems will be removed from the latest version of PathpyG with #300 (issue #293)

M-Lampert avatar Nov 07 '25 10:11 M-Lampert