pathpyG icon indicating copy to clipboard operation
pathpyG copied to clipboard

inefficient calculation of temporal paths

Open IngoScholtes opened this issue 1 year ago • 1 comments

The current code to calculate temporal paths in a TemporalGraph suffers from explosive memory usage, which crashes the kernel. Also, despite using the GPU it is not efficient due to the intermediate calculation of all pairs of edges irrespective of their timestamps.

IngoScholtes avatar Apr 22 '24 16:04 IngoScholtes

I have written a new lift_order function that does not suffer from these issues, see notebook temporal_shortest_paths.ipynb in the tutorials folder in the linked branch temporal-paths-mem-efficient.

Side notes:

  • The for loop in the new implementation runs over all unique timestamps, which are typically orders of magnitude less than the number of edges
  • In the worst-case data set sociopatterns_highschool_2013, my code finishes in approx. 7 seconds for delta=300, while the current code takes more than 10 seconds to crash the kernel with an Out Of Memory error
  • I think the new code is much simpler and easier to understand and thus easier to maintain
  • The key idea is to not first create all possible second-order edges and then apply the time filter but rather move a time window over the data and then only create second-order edges for time-respecting paths of lengths two
  • The new implementation uses torch_cartesian_prod to generate all ordered pairs of source and target edge indices within the current time window, which might also be useful to suplify the general lift_order code

IngoScholtes avatar Apr 22 '24 16:04 IngoScholtes