Parallelise SciPy operation for a faster run

Open shivam1702 opened this issue 5 years ago • 1 comments

Hi, I noticed that the model uses a sparse matrix dot product. For extremely large matrices, on the order of millions of nodes (I'm currently working with 5 million nodes, but need to scale to 200+ million), this operation does NOT run in parallel across multiple CPUs.

Could you let me know how to either:

Parallelise this operation, or
Use an alternative solution for this part of the pipeline?

Aug 19 '20 08:08 shivam1702

I did a search but unfortunately couldn't find an out-of-the-box solution.

However, I found an implementation here: https://gist.github.com/simbaforrest/36230884d7fe43e3cd5f962dd0ebb662. I haven't tested it, but maybe you can give it a try.
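For reference, here is a minimal sketch of one common way to parallelise a sparse dot product: split the left matrix into row blocks, multiply each block independently, and stack the results. This is not the code from the gist above and not this repo's implementation. The function name `parallel_sparse_dot` and the `n_workers` parameter are made up for illustration. It also assumes that SciPy's compiled multiplication routine releases the GIL so that threads can actually overlap; I haven't benchmarked that, so treat any speedup as unverified.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import scipy.sparse as sp


def parallel_sparse_dot(A, B, n_workers=4):
    """Compute A @ B by multiplying row blocks of A in separate threads.

    Hypothetical helper for illustration only. A is converted to CSR;
    B may be a sparse matrix, a dense 2-D array, or a 1-D vector.
    """
    A = sp.csr_matrix(A)  # CSR makes row slicing cheap
    n_rows = A.shape[0]

    # Boundaries of n_workers roughly equal, contiguous row ranges.
    bounds = np.linspace(0, n_rows, n_workers + 1, dtype=int)
    blocks = [A[s:e] for s, e in zip(bounds[:-1], bounds[1:]) if e > s]
    if not blocks:  # A has no rows, nothing to split
        return A @ B

    # Each block product is independent, so the blocks can run concurrently.
    # ASSUMPTION: threads only help if the underlying routine releases the GIL.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda blk: blk @ B, blocks))

    # Stack the partial results back together in row order.
    if sp.issparse(parts[0]):
        return sp.vstack(parts, format="csr")
    return np.concatenate(parts, axis=0)
```

Calling `parallel_sparse_dot(A, B, n_workers=8)` should return the same result as `A @ B`. If threads turn out not to help, the same row-block split could be run with a process pool instead, at the cost of copying `B` to every worker, which may be a problem at the 200+ million node scale.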

Aug 21 '20 00:08 GTmac