BioSimSpace icon indicating copy to clipboard operation
BioSimSpace copied to clipboard

Problem when converting Gro/Top to Rst7

Open msuruzhon opened this issue 3 years ago • 12 comments

Hello,

I have run into an issue when trying to load an alchemical system generated by a GROMACS run, squashing it and writing it out as an AMBER rst7 (writing out to GROMACS Gro is fine). This is the script I used for these inputs:

import BioSimSpace.Sandpit.Exscientia as BSS

system = BSS.IO.readMolecules(["gromacs.top", "gromacs_out.gro"])
squashed_system, _ = BSS.Align._squash._squash(system)
BSS.IO.saveMolecules("amber", squashed_system, "rst7")

The most comprehensive error message I managed to get was the following:

Failed to write the file <...> using the parser for fileformat 'RST7'. Errors reported by the parser are:
Errors converting the system to a Amber Rst7 format...
Could not write the float at index 126, value '-1039.54' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ). (call sire.error.get_last_error_details() for more info)

I hope it helps to investigate the issue. Many thanks.

msuruzhon avatar Dec 16 '22 09:12 msuruzhon

Hi there, we still only have write support for GROMACS alchemical topologies (at least ones set up within BioSimSpace), so this is likely the issue. I'll try to debug this afternoon.

Cheers.

lohedges avatar Dec 16 '22 10:12 lohedges

Hmmm, I can't even read the files, which isn't surprising if the topology corresponds to an alchemical system:

import BioSimSpace.Sandpit.Exscientia as BSS
BSS.setVerbose(True)

system = BSS.IO.readMolecules(["gromacs.top", "gromacs_out.gro"])

gives:

OSError: SireError::io_error: There is not enough information in this parser (Gro87( title() = GRoups of Organic Molecules in ACtion for Science, nAtoms() = 6457, nResidues() = 2147, nFrames() = 1, hasCoordinates() = 1, hasVelocities() = 1 )) to start the creation of a new System. You need to use a more detailed input file. (call sire.error.get_last_error_details() for more info)

lohedges avatar Dec 16 '22 11:12 lohedges

The new sire load interface will give you more information about any warnings or errors when reading files. Add show_warnings=True to the load. There is also nascent support for capturing a log during a read.

log = {}
mols = sire.load([filenames], show_warnings=True, log=log)

chryswoods avatar Dec 16 '22 12:12 chryswoods

This what I get when I try to read it

== /Users/chzcjw/Downloads/Archive-2/gromacs.top ==

This file could not be parsed by any of the file parsers! It was recognised as a file of type top,prm7, but all parsers failed to parse this file. The errors from the parsers associated with the suffix top,prm7 are printed below:

*-- Failed to parse '/Users/chzcjw/Downloads/Archive-2/gromacs.top' with parser 'GroTop'.
Cannot find the file 'posre_0001.itp' using GROMACS_PATH = [ /Users/chzcjw/sire.app/share/Sire/gromacs ], current directory '/Users/chzcjw/Downloads/Archive-2'. Please make sure the file exists and is readable within your GROMACS_PATH from the current directory '/Users/chzcjw/Downloads/Archive-2' (e.g. set the GROMACS_PATH environment variable to include the directory that contains 'posre_0001.itp', or copy this file into one of the existing directories [ /Users/chzcjw/sire.app/share/Sire/gromacs ])

*-- Failed to parse '/Users/chzcjw/Downloads/Archive-2/gromacs.top' with parser 'PRM7'.
The file is not recognised as being of the required format.

chryswoods avatar Dec 16 '22 12:12 chryswoods

Ah apologies, I should have attached the itp as well, this should do it now: Archive.zip

msuruzhon avatar Dec 16 '22 13:12 msuruzhon

Okay, I now see the same error. I'll try to work out what term isn't fitting within the AMBER formatting restrictions following conversion.

lohedges avatar Dec 16 '22 13:12 lohedges

It happens if you just use the gro file, i.e. you don't even load the topology. I'm guessing that one of the velocities is overflowing the formatting.

lohedges avatar Dec 16 '22 13:12 lohedges

Yes, that's it:

import BioSimSpace as BSS

s = BSS.IO.readMolecules("gromacs_out.gro")
m = s[0]

# This fails.
BSS.IO.saveMolecules("test", m, "rst7")
OSError: SireError::io_error: Cannot write the (perhaps some of the ) files as the following errors occurred:
Failed to write the file '/home/lester/Downloads/Archive/test.rst7' using the parser for fileformat 'RST7'. Errors reported by the parser are:
Errors converting the system to a Amber Rst7 format...
Could not write the float at index 126, value '-1039.54' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ). (call sire.error.get_last_error_details() for more info)

# Delete the velocity property.
m._sire_object = m._sire_object.edit().removeProperty("velocity").molecule().commit()

# Try again. This now works.
BSS.IO.saveMolecules("test", m, "rst7")
['/home/lester/Downloads/Archive/test.rst7']

Does your _squash function remove the velocity property, or do something to it?

lohedges avatar Dec 16 '22 14:12 lohedges

Also, for a quick hack, you could just use the property_map to make Sire search for an invalid (missing) velocity property key, in which case the velocities would be ignored. For example:

s = BSS.IO.readMolecules(["gromacs.top", "gromacs_out.gro"])
BSS.IO.saveMolecules("test", s, "rst7", property_map={"velocity" : "cheese"})
['/home/lester/Downloads/Archive/test.rst7']

(Obviously this isn't helpful if the velocities are important.)

For dealing with the issue more generally, it might be possible to cap the velocities to some maximum value. I'll also check to see whether the unit conversion between GROMACS and AMBER is correct.

Cheers.

lohedges avatar Dec 16 '22 14:12 lohedges

Thanks @lohedges, I think I should be handling the velocities correctly, so I guess that means nothing out of the ordinary is happening, just that the velocities are too big for the AMBER format? It might be nice to automatically cap these on write out if there is no other way of doing it.

msuruzhon avatar Dec 16 '22 14:12 msuruzhon

I think we need a better error message so that you know the velocities don't fit, and you can choose to cap the velocities (i.e. via a function to cap the velocities before you try to write them to a rst file).

My reasoning is that we would lose information and violate our principle of symmetric read/write if we capped the velocities (the system read back in would have different velocities to that were written out).

chryswoods avatar Dec 16 '22 15:12 chryswoods

Yes, that's a good point. At present this is triggered by the writeFloatData function from amberformat.cpp, so the function doesn't know what property the data corresponds to. This could easily be updated to pass in the property name so that it could be printed alongside the existing error message. A capping function should be easy enough to add in BioSimSpace too.

lohedges avatar Dec 16 '22 15:12 lohedges