BioSimSpace Coupling groups in Gromacs

Is your feature request related to a problem? Please describe. Hi, I tried to use BSS to run MD simulations for a protein-ligand complex using Gromacs. Looking at the configuration file, I noticed that the coupling groups for temperature coupling were not appropriate. In the BSS protocol, it is tc-grps = system tau-t = 2.0 ref-t = 300.00

Instead, it should be tc-grps = Protein_LIG Water_and_ions tau-t = 2.0 2.0 ref-t = 300 300

And this would also require the generation of the appropriate groups in the index file using gmx make_ndx.

I did not check it, but I imagine that this would be an issue also for proteins (apo) in water where the coupling groups should be tc-grps = Protein Non-Protein

Describe the solution you'd like I am not sure how to solve this issue, because it may change from system to system. Maybe an additional argument coupling_groups in the functions for Gromacs protocols would do it (with an inside loop that writes the tau-t and ref-t parameters).

Describe alternatives you've considered Alternatively, one could write a function that generates the index file and tries to smartly "guess" the coupling groups from there. For example, if you have gmx make_ndx -f em.gro -o index.ndx

0 System              : 33506 atoms <\br>
1 Protein              :  2614 atoms
11 non-Protein    : 30892 atoms
13 LIG                  :    22 atoms
15 Water              : 30864 atoms
17 non-Water      :  2642 atoms
18 Ion                  :     6 atoms
19 LIG                  :    22 atoms
20 CL                   :     6 atoms
21 Water_and_ions      : 30870 atoms

The function could be

def get_coupling_groups(index_file):
    if Water_and_ions in index_file:
         group1=Water_and_ions 
    elif Water in index_file:
        group1=Water   
    if Protein in index_file:
        if N_atoms_Protein + N_atoms_group1 == N_atoms_System:
            group2=Protein 
    ...

...But it will be hard this way to figure out all possibilities (e.g. multiple ligands, solvent mixtures,...)

Thanks, Carmen

Sep 07 '22 08:09 cespos

Thanks for this. Do you know if there's a definitive guide for how to best define temperature coupling groups? When writing the GROMACS config we went with a single group for simplicity, which was what was used in some of the tutorials that I followed, and also what was recommended by the developers on their mailing list for systems with a large amount of solvent vs solute, which is what we are typically using.

I'd be happy to improve this as necessary, but agree that it might become complicated to make sure it works reliably for an arbitrary system.

Cheers.

Sep 07 '22 09:09 lohedges

The rule is usually to have a solvent group and a solute group to avoid the "hot solvent/cold solute" problem. There are a number of references that talk about this problem and this is one.

I tried to write a function get_coupling_groups() to define a solute group and a solvent group for GROMACS. It will output the group names and will also automatically generate the index file. Of course it can be improved but should be a good start.

# gromacs utils
from subprocess import Popen, PIPE
import subprocess

def make_ndx(coord_file, output_basename, groups_to_merge=None):
    # write script to generate index file
    make_ndx_script = ['#!/bin/bash', f'gmx make_ndx -f {coord_file} -o index_{output_basename}.ndx<<EOF']
    if groups_to_merge is not None:
        if not isinstance(groups_to_merge, str) and hasattr(groups_to_merge, '__iter__'):
            for g in groups_to_merge:
                if g is not None:
                    make_ndx_script.append(g)
        elif isinstance(groups_to_merge, str):
            make_ndx_script.append(groups_to_merge)
        else:
            raise ValueError('groups_to_merge {groups_to_merge} not valid')
    make_ndx_script.append('q')
    make_ndx_script.append('EOF')
    make_ndx_script.append('')
    f = open('make_ndx.sh', 'w')
    f.write('\n'.join(make_ndx_script) + '\n')
    f.close()
    chmod = subprocess.call('chmod +x make_ndx.sh', shell=True)
    job = subprocess.call('./make_ndx.sh', shell=True)
    if job != 0:
        print("ERROR: Index file could not be generated")
    subprocess.call('rm make_ndx.sh', shell=True)


def get_coupling_groups(coord_file, output_basename='test', 
                        solute_groups = ['Protein', 'DNA', 'RNA', 'Other'],
                       ):
    try:
        p = Popen(['gmx', 'make_ndx', '-f', coord_file, '-o', 'index.ndx'], stdin=PIPE, stdout=PIPE, stderr=PIPE,
              universal_newlines=True)
        output, err = p.communicate()
        rc = p.returncode
    except:
        raise ValueError('Program gmx make_ndx not found. Ensure that the gromacs module is loaded.')
    
    if 'output' not in locals() or len(output) == 0:
        raise ValueError('Program gmx make_ndx not found. Ensure that the gromacs module is loaded.')
        
    groups_lines = []
    flag=0
    for line in output.splitlines():
        if 'Analysing Protein...' in line:
            flag=1
        if flag==1 and ':' in line and 'atoms' in line:
            groups_lines.append(line)

    groups_ID_dict = dict([(s.split()[1],int(s.split()[0])) for s in groups_lines])
    groups_atoms_dict = dict([(s.split()[1],int(s.split()[3])) for s in groups_lines])
        
    #
    found_solute_groups = {}
    found_solvent_groups = {}

    if 'Water_and_ions' in groups_ID_dict.keys():
        found_solvent_groups['Water_and_ions'] = groups_ID_dict['Water_and_ions']
    elif 'Water_and_Ions' in groups_ID_dict.keys():
        found_solvent_groups['Water_and_Ions'] = groups_ID_dict['Water_and_Ions']
    elif 'Water' in groups_ID_dict.keys():
        found_solvent_groups['Water'] = groups_ID_dict['Water']

    for gr_name, gr_id in groups_ID_dict.items():
        if gr_name in solute_groups:
            found_solute_groups[gr_name] = gr_id

    atoms_solute_groups = sum([groups_atoms_dict[solute_groups] for solute_groups in found_solute_groups.keys()])
    atoms_solvent_groups = sum([groups_atoms_dict[group2] for group2 in found_solvent_groups.keys()])

    if not atoms_solvent_groups + atoms_solute_groups == groups_atoms_dict['System']:
        merge_new_solvent_group = ' &! '.join([str(0)] + [str(s) for s in found_solute_groups.values()])
        out_solvent_group = '_&_!'.join([str(s) for s in ['System']+list(found_solute_groups.keys())])
        print(f'WARNING: {list(found_solvent_groups.keys())[0]} group atoms and solute group atoms do not sum up to system atoms')
        print(f'WARNING: Groups not identified as solute will be considered as solvent')
        print(f'WARNING: New solvent group: {solvent_groups}')
    else:
        merge_new_solvent_group = None
        out_solvent_group = list(found_solvent_groups.keys())[0]
        
    if len(found_solute_groups.keys()) > 1:
        merge_new_solute_group = ' | '.join([str(s) for s in found_solute_groups.values()])
        out_solute_group = '_'.join([str(s) for s in found_solute_groups.keys()])
    else:
        merge_new_solute_group=None
        out_solute_group = list(found_solute_groups.keys())[0]
    
    make_ndx(coord_file, output_basename, groups_to_merge=merge_new_solute_group)
    return out_solute_group, out_solvent_group

Sep 07 '22 12:09 cespos