Water molecules not recognised as separate molecules
Hello,
I have found an issue when loading a system containing a protein and water molecules from AMBER parm7 and rst7 files. In short, BioSimSpace doesn't recognise the water molecules as water molecules (even if I rename HOH to WAT and change the atom types accordingly). What's even stranger is that all water molecules are recognised as different residues of the same molecule, which I think shouldn't happen if there are no bonded terms between them.
Is there any way to ensure that BioSimSpace will recognise water molecules properly when preparing these files in general?
EDIT: If I convert the system to a gro and top file using ParmEd and load these files back into BioSimSpace, these molecules are now recognised properly. So maybe this is related to the AMBER parser?
Many thanks.
For reference, how were these files generated in the first place? Also, what water model is used?
The AMBER topology parser uses a discoverMolecules function to divide the atoms into molecules by traversing the bond information in the file, i.e. the BONDS_INC_HYDROGEN and BONDS_WITHOUT_HYDROGEN records. If this information is incorrect, then molecules might not be split correctly.
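To illustrate the idea, here is a minimal Python sketch of splitting atoms into molecules by following bonds. It is not the actual discoverMolecules code; the function name `split_into_molecules` and the union-find approach are my own assumptions, chosen only to show why a single wrong bond is enough to merge two molecules.

```python
from collections import defaultdict


def split_into_molecules(n_atoms, bonds):
    """Group atom indices into molecules by following bonds.

    Hypothetical helper (not Sire's implementation): atoms that are
    connected through any chain of bonds end up in the same molecule.
    """
    parent = list(range(n_atoms))

    def find(i):
        # Walk up to the root of the set, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for atom in range(n_atoms):
        groups[find(atom)].append(atom)
    return sorted(groups.values())


# Two waters (O, H, H each) with correct bonding stay separate.
good_bonds = [(0, 1), (0, 2), (3, 4), (3, 5)]
print(split_into_molecules(6, good_bonds))

# One spurious bond between the waters merges them into one molecule.
print(split_into_molecules(6, good_bonds + [(2, 3)]))
```

With this kind of traversal the result depends entirely on the bond list, so any error in the bond records shows up directly as wrongly merged or wrongly split molecules.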
Cheers.
Sorry for not being very helpful, but I made this system long ago and I don't remember how I generated it. I will update if I recall something. The waters are intended to be TIP3P, though.
Is it possible to tell BioSimSpace that certain residues are waters so it treats them accordingly? I have noticed it handles automatic renaming based on the MD engine so I imagine it already has some similar functionality?
Sire does have a bondHunter that can be used to break molecules. However, it's tricky when the molecules already have properties, since these will need to be broken too. Here the bonding is presumably incorrect, so it would need to be rebuilt following the break, with a new bond property added.
We don't care about the naming in AMBER files, so don't treat atoms any differently if they have water labels. We only rename on write since AMBER expects specific naming internally, e.g. when using fast three-point waters.
For now it might just be worth using the GROMACS files generated by ParmEd. I'll try to figure out what's wrong (either in the file, or the parser).
As Lester said, Sire's Amber parser uses the bond information present in the parm7 file to split atoms into molecules. My guess is that the bond information is missing for the system, and so Sire can only assume that they are all part of one molecule.
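If you want to check this yourself, a rough Python sketch like the following can list the bonds stored in a parm7 file. It is not Sire's parser; `read_bonds` is a hypothetical helper, and it assumes the integer fields are whitespace-separated (true unless the values fill the full field width in very large systems). In the parm7 format each bond is stored as three integers: two coordinate-array offsets (atom index times three) and a bond-type index.

```python
def read_bonds(parm7_text):
    """Return 0-based (atom_i, atom_j) pairs from the parm7 bond records.

    Hypothetical helper for inspection only; assumes whitespace-separated
    integer fields in the data lines.
    """
    flags = ("BONDS_INC_HYDROGEN", "BONDS_WITHOUT_HYDROGEN")
    values = {flag: [] for flag in flags}
    current = None

    for line in parm7_text.splitlines():
        if line.startswith("%FLAG"):
            name = line.split()[1]
            # Only collect data for the two bond sections.
            current = name if name in values else None
        elif line.startswith("%"):
            continue  # %FORMAT, %VERSION and %COMMENT lines carry no data
        elif current is not None:
            values[current].extend(int(tok) for tok in line.split())

    bonds = []
    for flag in flags:
        data = values[flag]
        # Triplets: offset of atom i, offset of atom j, bond-type index.
        for k in range(0, len(data) - 2, 3):
            bonds.append((data[k] // 3, data[k + 1] // 3))
    return bonds
```

Calling `read_bonds(open("system.parm7").read())` (the file name is a placeholder) and looking at the pairs that involve the water atoms should show whether the water bonds are absent or whether some pairs link atoms from different waters.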
Sire can automatically generate the bonds and split atoms into molecules using the bondHunter functionality that Lester mentions. I have been working on an updated molecule loader that detects when bonds aren't present, automatically adds bonding, and then splits these "molecules" into individual molecules. We have to be very careful when doing this, as automatic bond detection is based on testing whether atoms are closer than the sum of their covalent radii (plus some chemical checks). This isn't 100% reliable, although it almost always works for water molecules. For now, it is probably best to accept that this is a feature that ParmEd has and BioSimSpace lacks, and one that would be worth including. BioSimSpace will get this feature for free once the ongoing update work in Sire is finished and BioSimSpace has been edited to make use of the new molecule loading functionality.
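As a rough illustration of the distance test, here is a minimal Python sketch. It is not the bondHunter implementation: `guess_bonds`, the small radii table, and the 0.4 Å tolerance are all assumptions made for the example, and the extra chemical checks are left out entirely.

```python
import math

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def guess_bonds(elements, coords, tolerance=0.4):
    """Guess bonds from geometry alone.

    Hypothetical helper: two atoms are considered bonded when their
    separation is at most the sum of their covalent radii plus an
    assumed tolerance. No chemical sanity checks are applied.
    """
    bonds = []
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            cutoff = (COVALENT_RADII[elements[i]]
                      + COVALENT_RADII[elements[j]]
                      + tolerance)
            if math.dist(coords[i], coords[j]) <= cutoff:
                bonds.append((i, j))
    return bonds


# Two TIP3P-like waters (O-H 0.9572 angstrom, H-O-H 104.52 degrees), 3 angstrom apart.
elements = ["O", "H", "H", "O", "H", "H"]
coords = [
    (0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.2400, 0.9266, 0.0),
    (0.0, 0.0, 3.0), (0.9572, 0.0, 3.0), (-0.2400, 0.9266, 3.0),
]
print(guess_bonds(elements, coords))
```

For well-separated waters this recovers the two O-H bonds per molecule and nothing else, but closely packed or distorted structures can produce spurious bonds, which is why the extra checks are needed. The all-pairs loop is quadratic in the number of atoms, so a real implementation would want a neighbour search.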