FastRP
Parallelise SciPy operation for faster runs
Hi, I noticed that the model uses a sparse matrix dot product. For extremely large matrices, on the order of millions of nodes (I am currently working with 5 million nodes but need to scale to 200+ million), this operation does not run in parallel across multiple CPUs.
Could you suggest how to either:
- parallelise this operation, or
- replace this part of the pipeline with an alternative?
I did a search but unfortunately couldn't find an out-of-the-box solution.
However, I found an implementation here: https://gist.github.com/simbaforrest/36230884d7fe43e3cd5f962dd0ebb662. I haven't tested it, but you could give it a try.
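Another option that might be worth trying is to split the sparse matrix into row blocks and multiply each block separately, since the rows of the product are independent of each other. Below is a minimal sketch of that idea, not the code from the gist or from this repository. The names `blockwise_dot`, `A`, `R`, `n_blocks` and `max_workers` are made up for illustration. It uses a thread pool, and I have not verified that SciPy's sparse kernel releases the GIL, so threads may not give a real speedup; if they don't, the same row partitioning should carry over to a process pool.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import scipy.sparse as sp


def blockwise_dot(A, R, n_blocks=4, max_workers=4):
    """Compute A @ R by multiplying row blocks of A independently.

    A is a sparse matrix (hypothetical stand-in for the adjacency-like
    matrix), R is a dense 2-D array (hypothetical stand-in for the
    projection matrix).
    """
    A = sp.csr_matrix(A)  # CSR makes row slicing cheap
    # Row boundaries of the blocks, e.g. [0, 50, 100, 150, 200]
    bounds = np.linspace(0, A.shape[0], n_blocks + 1, dtype=int)
    out = np.empty(
        (A.shape[0], R.shape[1]),
        dtype=np.result_type(A.dtype, R.dtype),
    )

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Each worker writes to a disjoint slice of the output
        out[lo:hi] = A[lo:hi] @ R

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces evaluation so worker exceptions are raised here
        list(pool.map(work, range(n_blocks)))
    return out


if __name__ == "__main__":
    A = sp.random(1000, 300, density=0.01, format="csr", random_state=0)
    R = np.random.default_rng(0).standard_normal((300, 16))
    print(np.allclose(blockwise_dot(A, R), A @ R))
```

The result should match a plain `A @ R`, so it is easy to check on a small graph before trying it on the 5 million node case. Note that the dense output still has to fit in memory as a whole.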