CSRGraph icon indicating copy to clipboard operation
CSRGraph copied to clipboard

Faster sampling with Vose's alias method

Open MaxHalford opened this issue 5 years ago • 2 comments

Hey!

This is not an issue but rather a suggestion :).

From what I understand, you walk on a graph by randomly hopping from one node to another. You do this via a roulette wheel method that runs in a n + log(n) time: n for building the wheel and log(n) for the subsequent binary search. There's an O(1) way to do via Vose's alias method (this is definitely worth a read). I implemented this in Cython here. Naturally this costs a bit of memory, but it might be worth it, I don't know.

Just wanted to make sure you knew about this!

PS: great article on challenging GNNs, that's how I found you :)

MaxHalford avatar Jan 05 '21 23:01 MaxHalford

Hey Max,

Thanks for the comment!

In our case it's n + log(n) but n is the number of edges for a node. So unless we're working on graphs with a huge average degree, considerations like cache locality and efficient implementation matters most for performance (since n < 50 in most cases we see).

That said, I'll keep the issue open so I can look into benchmarking your proposed method against my "naive" method

VHRanger avatar Jan 06 '21 02:01 VHRanger

Just to track improvements to the random walk sampling in this issue, it should be possible to accelerate the node2vec walks with this:

https://louisabraham.github.io/articles/node2vec-sampling.html

VHRanger avatar Feb 17 '21 15:02 VHRanger