Parallelize for-loops using numba.prange

Open arunoruto opened this issue 1 year ago • 3 comments

Amazing work with this framework, and I am really happy to see that you used numba to speed up the computation. I was wondering if there is a reason why the prange function of numba has not been used to parallelize some for-loops. They have a small doc page about prange and its use.

By simply skimming the code, I found some possible places where prange could be used:

If there is interest in adding prange, I could try to make some changes. It would also be nice to have some standard tests, so that the framework can be benchmarked and checked to confirm that the modifications have not altered its output. Do you currently have any data for that?

arunoruto, Jun 28 '24 14:06

I am very interested in speeding things up with prange! I did not know it existed.

Some of the loops may not be amenable to parallelizing because earlier array elements affect later elements.
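A small illustration of that distinction, using made-up loops rather than anything from miepython: a loop is a candidate for prange only if its result does not depend on the order in which iterations run. Running each loop in reverse order is used here as a cheap stand-in for an arbitrary parallel schedule.

```python
import numpy as np

x = np.linspace(1.0, 2.0, 8)
in_order = np.arange(x.size)
reverse_order = in_order[::-1]  # one possible out-of-order schedule


def independent(x, order):
    # out[i] depends only on x[i]: any iteration order gives the same result.
    out = np.empty_like(x)
    for i in order:
        out[i] = 2.0 * x[i] + 1.0
    return out


def recurrence(x, order):
    # out[i] depends on out[i - 1]: only the in-order pass is correct.
    out = np.zeros_like(x)
    for i in order:
        prev = out[i - 1] if i > 0 else 0.0
        out[i] = prev + x[i]
    return out


independent_ok = np.allclose(independent(x, in_order), independent(x, reverse_order))
recurrence_ok = np.allclose(recurrence(x, in_order), recurrence(x, reverse_order))
print(independent_ok, recurrence_ok)
```

The first loop could be switched to prange safely; the second is a recurrence, and would have to stay sequential or be parallelized at an outer level instead.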

I would not worry about mie_mu_with_uniform_cdf() since I would be surprised if anyone is actually using it.

There are tests for accuracy in the tests folder. However, there are no tests for performance. The closest thing is the Jupyter notebook https://miepython.readthedocs.io/en/latest/11_performance.html . It should be straightforward to adapt some of that code into a test harness.
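One possible shape for such a harness, as a rough sketch: `reference`, `candidate`, and `best_time` are hypothetical names, and the formula is a placeholder rather than a miepython routine. It checks that an optimized version reproduces the baseline output and reports best-of-N timings for both.

```python
import time
import numpy as np


def reference(x):
    # Hypothetical baseline implementation (plain Python loop).
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = np.sin(x[i]) ** 2
    return out


def candidate(x):
    # Hypothetical optimized implementation that must match the baseline.
    return np.sin(x) ** 2


def best_time(func, arg, repeats=5):
    # Best-of-N wall-clock time; the minimum is least affected by system noise.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        times.append(time.perf_counter() - start)
    return min(times)


x = np.linspace(0.1, 10.0, 20_000)
outputs_match = np.allclose(reference(x), candidate(x), rtol=1e-10)
t_ref = best_time(reference, x)
t_new = best_time(candidate, x)
print(f"match={outputs_match}  reference={t_ref:.4f}s  candidate={t_new:.4f}s")
```

When the candidate is a Numba-compiled function, calling it once before timing keeps the one-off compilation cost out of the measurement.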

Looking forward to seeing a pull request.

scottprahl, Jun 29 '24 16:06

Hi Scott,

Nice work! I am glad I found your code for Mie calculations; it is very useful.

I have actually done the parallelization for my own needs. Please see the code in the test branch of my fork. I am willing to open a pull request if you find my implementation useful.

The way to do it is to create universal functions using the @guvectorize decorator. As a bonus, the code is vectorized automatically and runs on multiple cores. For example, I can do

mu = np.linspace(0, 1, 1000)
x = np.linspace(1, 2, 500)
S1, S2 = S1_S2(2., x, mu)

The resulting arrays each have shape (500, 1000), and the computation runs in parallel when Numba's target="parallel" option is used.

Cheers,

Andrej

andrej5elin, Feb 21 '25 08:02

This is very interesting. Thanks

One day, I will incorporate your idea. However, I want to get the E-field stuff released first before making everything faster.

scottprahl, Mar 16 '25 21:03