Accessible Surface Area calculations
Fixes #2439
Changes made in this Pull Request:
- Added calculation of the accessible surface area using the Shrake-Rupley algorithm (modified #4025).
- Added calculation of relative accessible surface area.
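
For readers unfamiliar with the Shrake-Rupley approach, here is a minimal, standard-library-only sketch of the idea. It is not the code in this PR. The function names (`sphere_points`, `shrake_rupley`), the golden-spiral point placement, and the default probe radius of 1.4 are assumptions chosen for illustration.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spaced points on the unit sphere (golden spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n  # z strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def shrake_rupley(coords, radii, probe=1.4, n_points=100):
    """Per-atom accessible surface area, in squared coordinate units."""
    unit = sphere_points(n_points)
    ext = [r + probe for r in radii]  # probe-expanded radii
    areas = []
    for i, (ci, ri) in enumerate(zip(coords, ext)):
        # Only atoms whose expanded spheres overlap sphere i can bury its points.
        neighbours = [
            j for j, (cj, rj) in enumerate(zip(coords, ext))
            if j != i and math.dist(ci, cj) < ri + rj
        ]
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + ri * ux, ci[1] + ri * uy, ci[2] + ri * uz)
            # A test point is exposed if it lies outside every neighbouring sphere.
            if all(math.dist(p, coords[j]) >= ext[j] for j in neighbours):
                exposed += 1
        # Exposed fraction of test points times the full expanded sphere area.
        areas.append(4.0 * math.pi * ri * ri * exposed / n_points)
    return areas
```

The cost grows with the number of atoms, the number of neighbours per atom, and `n_points`, so lowering `n_points` trades precision for speed.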
PR Checklist
- [x] Tests?
- [x] Docs?
- [x] CHANGELOG updated?
- [x] Issue raised/referenced?
Developers certificate of origin
- [x] I certify that this contribution is covered by the LGPLv2.1+ license as defined in our LICENSE and adheres to the Developer Certificate of Origin.
📚 Documentation preview 📚: https://mdanalysis--4417.org.readthedocs.build/en/4417/
Hello @JureCerar! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:
Comment last updated at 2024-07-09 22:52:39 UTC
Linter Bot Results:
Hi @JureCerar! Thanks for making this PR. We linted your code and found the following:
Some issues were found with the formatting of your code.
| Code Location | Outcome |
|---|---|
| main package | ⚠️ Possible failure |
| testsuite | ⚠️ Possible failure |
Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/9865281726/job/27241908471
Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!
Codecov Report
Attention: Patch coverage is 96.68874% with 5 lines in your changes missing coverage. Please review.
Project coverage is 93.63%. Comparing base (cfda8b7) to head (66b3fb7).
| Files | Patch % | Lines |
|---|---|---|
| package/MDAnalysis/analysis/sasa.py | 96.68% | 0 Missing and 5 partials :warning: |
Additional details and impacted files
Coverage Diff

| | develop | #4417 | +/- |
|---|---|---|---|
| Coverage | 93.61% | 93.63% | +0.02% |
| Files | 171 | 172 | +1 |
| Lines | 21243 | 21394 | +151 |
| Branches | 3934 | 3970 | +36 |
| Hits | 19886 | 20032 | +146 |
| Misses | 898 | 898 | |
| Partials | 459 | 464 | +5 |
It is unfortunate that the MDAKit was not more widely publicized, leading to duplication of effort. @JureCerar, would it make sense for you to contribute the calculation of relative accessible surface area to the MDAKit (assuming it has not yet been implemented)?
@orbeckst I used and modified the main surface calculation code. As far as I know, the code from #4025 is copied from BioPython/SASA, which is under the BSD 3 license. Only _get_sphere and _single_frame are based on BioPython's implementation. I changed the code to fit the MDAnalysis AnalysisBase class and added some tweaks, input value checks, and comments. Everything else is my own code: relative SASA, tests, documentation, etc.
Thank you for the details. BSD 3 would be ok.
Have you compared output and performance of the code here to mdakit-sasa?
What does your code and mdakit-sasa have in common, where do they differ?
I checked the code. The main difference is that mdakit-sasa is a wrapper for the FreeSASA package, so the underlying algorithm is different: FreeSASA uses the Lee-Richards algorithm, whereas this code uses the Shrake-Rupley algorithm.
Performance-wise, I did not test it, but I expect FreeSASA (mdakit-sasa) to be faster, as it is implemented in C. A head-to-head comparison is difficult because the algorithms differ. This implementation finishes a 10-frame trajectory of a ~400-residue protein in about a minute or two, which I think is a reasonable speed. In any case, precision can be lowered if speed is needed.
Output-wise, the result (i.e. the area) is the same regardless of the method or package used.
This PR also implements the relative surface area calculation, which is very useful when calculating protein surface properties. I guess it could also be implemented in mdakit-sasa?
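
To make the relative surface area idea concrete, here is a small sketch under one common definition: each residue's SASA divided by a reference (maximum) SASA for that residue type. The thread does not say how this PR obtains its reference, so the function name `relative_sasa` and the reference values below are hypothetical placeholders, not published data.

```python
def relative_sasa(residue_sasa, reference_sasa):
    """Divide each residue's SASA by the reference SASA of its residue type.

    residue_sasa: list of (resname, area) pairs
    reference_sasa: dict mapping resname -> reference area (> 0)
    """
    result = []
    for resname, area in residue_sasa:
        ref = reference_sasa.get(resname)
        # Residue types without a reference yield None rather than a guess.
        result.append(None if not ref else area / ref)
    return result


# Placeholder reference areas for illustration only (not real values).
reference = {"ALA": 100.0, "GLY": 80.0}
print(relative_sasa([("ALA", 25.0), ("GLY", 80.0), ("XYZ", 10.0)], reference))
# prints [0.25, 1.0, None]
```

A value near 0 indicates a buried residue and a value near 1 a fully exposed one, which is why the relative quantity is convenient for comparing residues of different sizes.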
Just as a side note: I also tried writing a wrapper for BioPython/SASA, but it was very messy and I could not get it to work properly without writing a lot of temporary files.
Hi All.
As mentioned by @JureCerar, mdakit-sasa wraps the FreeSASA implementation in the AnalysisBase class. The kit is very simple, as all the heavy lifting is done by FreeSASA:
- FreeSASA provides both Lee-Richards and Shrake-Rupley, defaulting to the former.
- It includes a classifier with a more comprehensive set of radii that mimics NACCESS.
Regarding performance, it is heavily driven by parametrisation. For Shrake-Rupley, the number of points on the spheres is a main parameter; if you use the Gromacs SASA calculation, the default parameters use very few points. FreeSASA has a built-in nThread implementation, but the kit does not implement parallelisation over multiple frames at the moment.
The reason for switching the PR to a kit initially was to separate the FreeSASA dependency from the core. Let me know if there is something I can help with.
Regards.
Thanks @pegerto and @JureCerar ! Some of the developers are currently discussing how to best move forward. We'll keep you updated. Thank you for your patience!
> the kit does not implement parallelisation over multiple frames at the moment.

This might be very easy once we merge PR #4162.
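
As a rough illustration of what parallelising over frames means, here is a generic split-apply-combine sketch. It is not the interface from PR #4162. The names `split_frames`, `run_chunk`, and `run_parallel` are hypothetical, and a thread pool is used only to keep the example self-contained; CPU-bound work would more likely use processes or another backend.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(n_frames, n_workers):
    """Split frame indices 0..n_frames-1 into up to n_workers contiguous chunks."""
    size, extra = divmod(n_frames, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        stop = start + size + (1 if w < extra else 0)
        if stop > start:
            chunks.append(range(start, stop))
        start = stop
    return chunks


def run_chunk(frames, analyse_frame):
    """Apply the per-frame analysis to every frame in one chunk, in order."""
    return [analyse_frame(f) for f in frames]


def run_parallel(n_frames, analyse_frame, n_workers=4):
    """Analyse chunks concurrently and concatenate results in frame order."""
    chunks = split_frames(n_frames, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: run_chunk(c, analyse_frame), chunks))
    return [result for part in parts for result in part]
```

This works when each frame's result is independent of the others, which is the case for a per-frame surface area calculation.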