
guess_TopologyAttrs guesses incorrect bonds for DMS lipids in MDAnalysis > 2.7.0

Open ricard1997 opened this issue 5 months ago • 5 comments

Expected Behavior

When using guess_TopologyAttrs to generate bonds for DSM lipids, the bonds around atom C3S (or any C*S) should be guessed correctly. For example:

("C3S", "C4S"), ("C3S", "C2S"), ("C3S", "H3S"), ("C3S", "H3T")

Actual Behavior

In MDAnalysis versions > 2.7.0, guess_TopologyAttrs produces incorrect extra bonds, e.g.:

[('C3S', 'C5S'), ('C3S', 'H4T'), ('C3S', 'H4S'), ('C3S', 'C4S'), ('C3S', 'HO3'), ('C3S', 'O3'), ('C3S', 'H3S'), ('C3S', 'H2S'), ('C3S', 'C2S'), ('C3S', 'NF'), ('C3S', 'C1S')]

This only happens for the lipid tail with atoms labeled C*S. The other tail, with atoms labeled C*T, behaves as expected.

Code to Reproduce

```python
import MDAnalysis as mda

# Load structure with DMS lipid (minimal example)
u = mda.Universe("lipid_dms.pdb")
structure = u.select_atoms("resname DSM")

# Guess bonds
structure.guess_TopologyAttrs(to_guess=["bonds"], fudge_factor=0.5)

for b in structure.bonds:
    print(b)
```

Version Information

MDAnalysis: 2.7.0 → works as expected

MDAnalysis: > 2.7.0 → produces incorrect bonds

Python: 3.12

OS: Linux

Additional Notes

It seems that atom names like C3S may be misinterpreted as the element cesium (Cs), which has a much larger van der Waals radius, leading to spurious bonds.

ricard1997, Aug 27 '25 16:08
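
The hypothesis in the Additional Notes can be illustrated with a small, self-contained sketch. This is not MDAnalysis' actual guesser code: `toy_guess_element`, `bond_cutoff`, the element subset, and the radii below are all assumptions made for illustration. The only part taken from the MDAnalysis documentation is the bond criterion itself, where two atoms count as bonded if their distance is below `fudge_factor * (R1 + R2)`.

```python
import re

# Illustrative subset of element symbols (assumption; a real table holds all elements).
KNOWN_ELEMENTS = {"H", "C", "N", "O", "P", "S", "CS"}

# Illustrative van der Waals radii in Angstrom (assumed values, not read from MDAnalysis).
VDW_RADII = {"H": 1.1, "C": 1.7, "N": 1.55, "O": 1.52, "CS": 3.43}


def toy_guess_element(atom_name):
    """Toy name-based guess: drop non-letters, then try the longest prefix first."""
    letters = re.sub(r"[^A-Za-z]", "", atom_name).upper()
    # 'C3S' becomes 'CS', which is tried (and matched) before plain 'C'.
    for end in range(len(letters), 0, -1):
        candidate = letters[:end]
        if candidate in KNOWN_ELEMENTS:
            return candidate
    return ""


def bond_cutoff(elem_a, elem_b, fudge_factor=0.5):
    """Distance below which two atoms count as bonded: f * (R1 + R2)."""
    return fudge_factor * (VDW_RADII[elem_a] + VDW_RADII[elem_b])


for name in ("C3S", "C3T"):
    elem = toy_guess_element(name)
    print(name, elem, round(bond_cutoff(elem, "C"), 3))
```

Under these assumptions, `C3S` is read as cesium while `C3T` falls back to carbon, and the cutoff to a neighbouring carbon grows from 1.7 Å to about 2.57 Å. A second-neighbour carbon at roughly 2.5 Å would then fall inside the cutoff, which is consistent with spurious pairs such as ('C3S', 'C5S') and ('C3S', 'C1S') in the reported output.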

@lilyminium do you know if this issue is related to the switch-over to the new guesser system?

orbeckst, Sep 30 '25 22:09

Quite possibly. @ricard1997 could you please provide the PDB you're using?

lilyminium, Sep 30 '25 22:09

Here I attach the file. It is a .gro file; I changed the extension to .csv so that GitHub would allow me to upload it. Also, I made an error in my statement: the lipid with the problem is DSM, not DMS.

membrane.csv

ricard1997, Oct 01 '25 16:10

I appreciate that this is an annoying case of C*S being converted to CS instead of C - but honestly I think the answer here is that the PDB file should have had elements assigned to it, not that we should have attempted to correctly guess the difference between CS and C in an unclear atom name.

IAlibay, Nov 18 '25 01:11
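
The suggestion above, assigning elements explicitly rather than relying on name-based guessing, could be sketched as follows. The helper `element_from_lipid_name` is hypothetical and rests on an assumption that holds only for naming schemes like the one in this report, where the first letter of each atom name is the element (C, H, N, O, P).

```python
def element_from_lipid_name(atom_name):
    """Return the first alphabetic character of the name as the element symbol.

    Assumption: the naming scheme puts the element first (e.g. 'C3S' is a
    carbon). This is not valid for two-letter elements such as 'CL' or 'NA'.
    """
    for ch in atom_name:
        if ch.isalpha():
            return ch.upper()
    raise ValueError(f"no alphabetic character in atom name {atom_name!r}")


# Atom names taken from the bond lists quoted in this issue.
names = ["C3S", "C4S", "H3S", "H3T", "O3", "HO3", "NF"]
elements = [element_from_lipid_name(n) for n in names]
print(elements)
```

With a real MDAnalysis `Universe`, a per-atom list built this way can be attached with `Universe.add_TopologyAttr("elements", ...)`. Whether bond guessing then uses those elements instead of re-deriving types from atom names may depend on the MDAnalysis version and is not verified here.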

@ricard1997 , which program and force field generated your file?

Do you have a topology file (e.g., PSF or TPR) that you could use?

orbeckst, Nov 19 '25 23:11