Grid virtual casing
Addressing issue https://github.com/PrincetonUniversity/SPEC/issues/32 with a faster virtual casing implementation. I iteratively increase the resolution to get an estimate of how well the integral has converged, and could adaptively refine the grid until a given epsilon is reached. For now it just prints the error estimate and doesn't terminate if gBnstol isn't reached.
For the rotating ellipse test case at a given accuracy, it is approximately 10x faster than the current implementation (serial performance) and parallelizes trivially across both MPI ranks and OMP threads. Comparing to DCUHRE results at 1e-12 tolerance, the accuracy estimates of the new method seem accurate, and it converges exponentially to the correct result as expected.
Changes to Inputlist
- New option
Lvcgridwas added to choose which virtual casing routine is being used. -
vcNt,vcNzthe resolution parameters for the new integration routine
Possible improvements
- Interpolate the precomputed surface currents when performing the integration, to have an "even finer grid" or possibly use an adaptive integration routine ontop of the interpolated values as @lazersos suggested.
- Possibly reuse the computed surface currents across multiple vc iterations, and only recompute the outer integral with the $1/(r^2)$ term.
@smiet Could you please also have a look at this?
Added further performance improvements (MPI Parallelization in addition to OpenMP).
Checked correctness of datasharing using MPI_Allreduce , the communication of the gBn terms using RlBroadcast is identical to the other casing() routine and wasn't investigated in detail. Since the resutls don't change with MPI I'm quite confident in the correctness.
Logging on 2 Ranks before and after allreduce
Entering precompute loop 1
rank 1 : Pbxyz = 0.0000000000000000 5.0083738809939513 0.0000000000000000 5.0083108465022459
rank 0 : Pbxyz = 5.0083817603806757 0.0000000000000000 5.0083502429341253 0.0000000000000000
rank (after reduction) 0 : Pbxyz = 5.0083817603806757 5.0083738809939513 5.0083502429341253 5.0083108465022459
rank (after reduction) 1 : Pbxyz = 5.0083817603806757 5.0083738809939513 5.0083502429341253 5.0083108465022459
rank 0 : kk = 0 , jk = 1
Both VC implementations exploit stellerator symmetry now to save 1/2 evaluations
I added an example config for the axysymmetric test case to the CI @jonathanschilling , it runs in about half the time and reaches the same result as the reference implementation (within tolerance). The expected speedup is higher the more fourier modes are used, and also with more OpenMP threads.
I think, as far as I can tell without much more effort, that this is ready for merging.
@jloizu Could you maybe also have a look and potentially try out this new feature on some of the free-boundary SPEC verification cases?
I think, as far as I can tell without much more effort, that this is ready for merging.
@jloizu Could you maybe also have a look and potentially try out this new feature on some of the free-boundary SPEC verification cases?
Trying out the Feature should only require adding
Lvcgrid=1 to numericslist. If tolerance isnt reached, increase the resolution in globallist
vcNt = 512
vcNz = 512
should be sufficient in Most cases
Just a bump @smiet , @jloizu to bring this back to attention, is there anything else you need from my side to merge?
I added Erol Balkovic to the reviewing, since he is currently testing free-boundary test cases. I pass the review torch to him and Chris, and then I am guessing it can be merged.
Thanks for the work @missing-user.
I checked your code and have also successfully ran an axisymmetric free-boundary problem input_file. The result matches what we had before and virtual casing calculation bnorml ran much faster. Just one small change would be to remove the macro COMPARECASING which I guess you used for testing.
Other than that, under the powers given by Joaquim to me, this pull request LGTM.