causal-learn Accelerate DirectLiNGAM by parallelising causal ordering on GPUs with CUDA

This PR adds an implementation that drastically speeds up DirectLiNGAM and its variants, e.g. VarLiNGAM (up to 32x on a consumer GPU).

The change adds an optional dependency, https://github.com/Viktour19/culingam, which implements custom CUDA kernels for the pairwise likelihood ratio causal ordering method.
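For illustration, a minimal sketch of how an optional GPU dependency is commonly guarded so that the CPU path keeps working when the package is absent. It assumes the package imports under the name `culingam`; the helper names `cuda_backend_available` and `select_backend` are hypothetical and are not taken from this PR.

```python
import importlib.util


def cuda_backend_available(module_name="culingam"):
    """Return True if the optional GPU package is installed.

    `module_name` is an assumed import name, not confirmed by the PR.
    """
    return importlib.util.find_spec(module_name) is not None


def select_backend(prefer_gpu=True, module_name="culingam"):
    """Pick "cuda" only when requested and the optional package is present."""
    if prefer_gpu and cuda_backend_available(module_name):
        return "cuda"
    # Fall back to the existing CPU implementation otherwise.
    return "cpu"
```

With a guard like this, users without CUDA installed never trigger the import and the other algorithms are unaffected.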

The implementation has been tested locally on an NVIDIA RTX 6000 on a Linux machine, but tests on other setups are still needed.

Mar 02 '24 01:03 aknvictor

Thanks, Victor. It looks great!

To make our dependencies as simple as possible, would it be possible to directly incorporate your modification into the causal-learn codebase?
Also, since the code of the LiNGAM-based methods is the same as that in the LiNGAM package, it seems that some correctness issues are still open in the PR there? (thanks @ikeuchi-screen for the review)

Mar 03 '24 20:03 kunwuz

Hi Yujia:

Directly incorporating it would introduce CUDA dependencies that are not needed for the other algorithms, and could make the installation of causal-learn more complex. While it's possible to do so, I'm not sure it's the best option. Yes, the discussion in that PR is relevant, so we may want to wait for it to be resolved before proceeding with this PR. That said, since the issues are related to variance in setup, it may be useful for you to test on your own setup as well.

Mar 03 '24 21:03 aknvictor