[HOLD] Re-implement distance_array using a batched algorithm and add compiled iterator layer [Core, WIP]
Fixes #3708
UPDATE
This PR is currently on hold pending decisions around batched distance arrays.
Aim:
- Add a batched distance array algorithm for efficient memory use.
- Allow distance routines to accept NumPy arrays or `AtomGroup`s using a compiled iterator.
This is the second large set of changes in the CZI EOSS4 performance track. So far I have only changed `_distance_array` without a box, but it is probably at a good stage to discuss now before moving on to change the rest of `calc_distances.h`.
Changes made in this pull request:
- Introduces the `Iterator` layer to homogenise behaviour between `ndarray`s and `AtomGroup`s at the C++ level with a shared interface.
- Allows distance routines to accept either an `ndarray` or an `AtomGroup` in any position using templates (see the sketch after this list). This stops us having to write a huge number of overloads or use Cython polymorphism, which is pretty annoying. For example, `_calc_dihedrals(a, b, c, d)` would require 2⁴ = 16 overloads for flexibility at every position.
- Introduces a batched distance array algorithm that uses stack-allocated arrays. Benchmarking the C++ with `googlebench` demonstrated this to be very fast with a modest memory footprint. This concept also links in well with the structure in `distopia` and possibly some `CUDA` code.
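As a rough illustration of the template idea, here is a minimal C++ sketch. The class and function names are hypothetical, not the actual code in this PR: the point is that any type exposing the same `load()`/`advance()` interface can be passed in any argument position, so a single template definition replaces the full set of overloads.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical iterator over a contiguous float* buffer (the ndarray case).
struct BufferIterator {
    const float* ptr;
    void load(float* out) const { out[0] = ptr[0]; out[1] = ptr[1]; out[2] = ptr[2]; }
    void advance() { ptr += 3; }
};

// Hypothetical iterator that resolves coordinates through an index table
// (the AtomGroup case, where selected atoms may be non-contiguous).
struct IndexedIterator {
    const float* base;
    const std::int64_t* idx;
    void load(float* out) const {
        const float* p = base + 3 * (*idx);
        out[0] = p[0]; out[1] = p[1]; out[2] = p[2];
    }
    void advance() { ++idx; }
};

// One template instantiates all 2^2 = 4 type combinations for two arguments;
// a four-argument routine like _calc_dihedrals(a, b, c, d) would likewise
// get its 2^4 = 16 combinations from a single definition.
template <typename IterA, typename IterB>
void pairwise_distance(IterA a, IterB b, std::size_t n, double* out) {
    float ca[3], cb[3];
    for (std::size_t i = 0; i < n; ++i, a.advance(), b.advance()) {
        a.load(ca);
        b.load(cb);
        const double dx = ca[0] - cb[0];
        const double dy = ca[1] - cb[1];
        const double dz = ca[2] - cb[2];
        out[i] = std::sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```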
Status:
- [x] Finalise overhang part of batched distance array calculation (currently not working for `ncoords > batchsize`)
- [ ] Change remainder of `_calc_distances.h`
- [x] Finalise docs for iterator API
- [ ] Make sure types are consistent (`int` vs `uint64_t`, etc.)
- [ ] Add tests for `AtomGroup` distance calculations
- [ ] Check performance with benchmarks.
Design Ideas:
We need an `AtomGroup` to behave like a NumPy array at the C++ level. We achieve this using C++ classes that provide a shared interface for iteration over `float*` coordinate arrays, with different implementations for each specific data structure.
These are then wrapped in a thin Cython class. `AtomGroup`s or NumPy arrays are turned into their respective iterators before being passed to the C++ level.
If people like this pattern, other groups can have iterators written very easily. They just need to share an interface with the others.
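To make the extensibility point concrete, here is a hedged sketch (again with hypothetical names) of what adding a new iterator might look like under this pattern: only the shared `load()`/`advance()` interface has to be implemented, and every templated routine then accepts the new type unchanged.

```cpp
#include <cstddef>

// Hypothetical iterator over a strided view of a coordinate buffer.
// Implementing the same load()/advance() interface is all that is needed
// for it to work with the templated distance routines.
struct StridedIterator {
    const float* ptr;
    std::ptrdiff_t stride;  // step between consecutive atoms, in floats
    void load(float* out) const { out[0] = ptr[0]; out[1] = ptr[1]; out[2] = ptr[2]; }
    void advance() { ptr += stride; }
};

// Usage with the pairwise_distance sketch above, mixing iterator types:
//   pairwise_distance(StridedIterator{coords, 6}, BufferIterator{ref}, n, out);
```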
PR Checklist
- [ ] Tests?
- [X] Docs?
- [ ] CHANGELOG updated?
- [X] Issue raised/referenced?
Hello @hmacdope! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
- In the file `package/MDAnalysis/lib/distances.py`:
  - Line 179:80: E501 line too long (151 > 79 characters)
  - Line 236:80: E501 line too long (81 > 79 characters)
- In the file `package/MDAnalysis/lib/util.py`:
  - Line 2063:80: E501 line too long (82 > 79 characters)
  - Line 2072:80: E501 line too long (81 > 79 characters)
  - Line 2073:80: E501 line too long (81 > 79 characters)
  - Line 2079:80: E501 line too long (87 > 79 characters)
  - Line 2089:33: E128 continuation line under-indented for visual indent
  - Line 2122:80: E501 line too long (81 > 79 characters)
  - Line 2126:80: E501 line too long (81 > 79 characters)
- In the file `package/setup.py`:
  - Line 372:80: E501 line too long (91 > 79 characters)
  - Line 453:26: E128 continuation line under-indented for visual indent
  - Line 453:80: E501 line too long (89 > 79 characters)
  - Line 455:80: E501 line too long (80 > 79 characters)
  - Line 458:42: E251 unexpected spaces around keyword / parameter equals
- In the file `testsuite/MDAnalysisTests/lib/test_distances.py`:
  - Line 380:1: W293 blank line contains whitespace
  - Line 382:80: E501 line too long (82 > 79 characters)
  - Line 392:80: E501 line too long (86 > 79 characters)
  - Line 397:1: W293 blank line contains whitespace
  - Line 398:62: E231 missing whitespace after ','
  - Line 399:29: E128 continuation line under-indented for visual indent
  - Line 399:56: E231 missing whitespace after ','
  - Line 400:80: E501 line too long (89 > 79 characters)
  - Line 401:80: E501 line too long (82 > 79 characters)
  - Line 416:1: W293 blank line contains whitespace
  - Line 417:1: W293 blank line contains whitespace
  - Line 418:5: E303 too many blank lines (2)
  - Line 425:80: E501 line too long (86 > 79 characters)
Comment last updated at 2022-06-20 06:34:19 UTC
Codecov Report
Merging #3718 (53c4b4e) into develop (a2f9e94) will decrease coverage by 0.76%. The diff coverage is 79.72%.
```diff
@@            Coverage Diff             @@
##           develop    #3718      +/-   ##
===========================================
- Coverage    94.27%   93.50%   -0.77%
===========================================
  Files          192      188       -4
  Lines        24695    24258     -437
  Branches      3661     3753      +92
===========================================
- Hits         23280    22682     -598
+ Misses        1366     1061     -305
- Partials        49      515     +466
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| package/MDAnalysis/core/group_iterators.pyx | 52.38% <52.38%> (ø) | |
| package/MDAnalysis/lib/distances.py | 96.51% <78.94%> (-1.97%) | :arrow_down: |
| package/MDAnalysis/lib/util.py | 89.43% <96.15%> (-1.41%) | :arrow_down: |
| package/MDAnalysis/lib/c_distances.pyx | 98.11% <100.00%> (-1.89%) | :arrow_down: |
| package/MDAnalysis/lib/c_distances_openmp.pyx | 95.38% <100.00%> (-4.62%) | :arrow_down: |
| ...ckage/MDAnalysis/analysis/encore/confdistmatrix.py | 67.41% <0.00%> (-10.12%) | :arrow_down: |
| package/MDAnalysis/analysis/encore/covariance.py | 84.61% <0.00%> (-9.24%) | :arrow_down: |
| package/MDAnalysis/coordinates/DMS.py | 83.63% <0.00%> (-9.10%) | :arrow_down: |
| package/MDAnalysis/coordinates/chemfiles.py | 88.46% <0.00%> (-8.80%) | :arrow_down: |
| package/MDAnalysis/converters/OpenMM.py | 88.88% <0.00%> (-8.34%) | :arrow_down: |
| ... and 97 more | | |
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a2f9e94...53c4b4e. Read the comment docs.
Some further benchmarking has shown that there doesn't appear to be a large speedup to be gained from using a compiled iterator versus a Python NumPy index call.
Given the large changes proposed in this PR for apparently limited gain, I think I will switch tack and use a call to `.positions`. It's a shame to invest a fair bit into this design and then change tack, but I think we needed to explore it fully to be sure it wasn't the right fit. Regardless, lesson learnt about the speed of NumPy indexing.
The batching of coordinates using stack-allocated memory, as is done in `batched_distances.h`, does however hold some promise, with a 2–8% speedup compared to current develop. It also fits more with the style we had in `distopia`, but it would be a large rewrite of `calc_distances.h`, so it is perhaps best shelved for now.
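For readers who haven't looked at the code, here is a minimal sketch of the batching concept; the tile size, names, and loop structure are illustrative assumptions, not the contents of `batched_distances.h`. Coordinates are copied into small fixed-size stack buffers so the hot inner loops run over cache-resident memory, with `std::min` handling the overhang when the number of coordinates is not a multiple of the batch size.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

constexpr std::size_t BATCHSIZE = 64;  // hypothetical tile size

// Sketch: all-pairs distances between two coordinate sets, computed in
// BATCHSIZE x BATCHSIZE tiles held in stack-allocated buffers.
void batched_distance_array(const float* ref, std::size_t n_ref,
                            const float* conf, std::size_t n_conf,
                            double* out) {
    float tile_a[3 * BATCHSIZE];
    float tile_b[3 * BATCHSIZE];
    for (std::size_t i0 = 0; i0 < n_ref; i0 += BATCHSIZE) {
        const std::size_t ni = std::min(BATCHSIZE, n_ref - i0);   // overhang
        std::copy_n(ref + 3 * i0, 3 * ni, tile_a);
        for (std::size_t j0 = 0; j0 < n_conf; j0 += BATCHSIZE) {
            const std::size_t nj = std::min(BATCHSIZE, n_conf - j0);
            std::copy_n(conf + 3 * j0, 3 * nj, tile_b);
            for (std::size_t i = 0; i < ni; ++i) {
                for (std::size_t j = 0; j < nj; ++j) {
                    const double dx = tile_a[3 * i]     - tile_b[3 * j];
                    const double dy = tile_a[3 * i + 1] - tile_b[3 * j + 1];
                    const double dz = tile_a[3 * i + 2] - tile_b[3 * j + 2];
                    out[(i0 + i) * n_conf + (j0 + j)] =
                        std::sqrt(dx * dx + dy * dy + dz * dz);
                }
            }
        }
    }
}
```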
With that in mind, would people prefer I leave this here and open another PR to address the original issue (#3708), or should I just continue here? @richardjgowers
If there are no objections I will move the solution to the original issue (#3708) to a new PR and rename this PR accordingly. I will do this tomorrow if no objections.
@richardjgowers and I have moved away from this idea, so I am going to close this PR. Feel free to re-open if we want to revive some aspect of it.