gpu4pyscf icon indicating copy to clipboard operation
gpu4pyscf copied to clipboard

Scanner for QMMMSCF

Open wuchiz opened this issue 11 months ago • 11 comments

I would like to use scanner (start from last result) in QMMM-PBC simulation. However, directly applying itrf.add_mm_charges to the instance created by pyscf.M will results in an error. If I use set_geom_ to update the coordinate of MM atoms and use scanner, the results is different. So how shall I do?

wuchiz avatar Mar 19 '25 08:03 wuchiz

It seems directly passing the density matrices from previous calculation by dm argument to kernel is ok.

wuchiz avatar Mar 19 '25 10:03 wuchiz

@wuchiz Thanks for the feedback. Can you provide the script to reproduce the issue? I didn't completely follow your operations.

wxj6000 avatar Mar 20 '25 20:03 wxj6000

@wxj6000 Take a simple example to explain, I have two conformations mol1 (QM)/coord_mm1(MM) and mol2(QM)/coord_mm2(MM).

mol1 = pyscf.M(atom=atom1, basis='3-21g', max_memory=40000)
mol2 = pyscf.M(atom=atom2, basis='3-21g', max_memory=40000)
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()

coord_mm1 = np.array( [[1,2,-1],[3,4,5]])
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])
mf_GPU = itrf.add_mm_charges(mf_GPU,coord_mm1, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e_scanner = mf_GPU.as_scanner()

I would like to do quantum calculation for conformation 2 from the results of conformation 1 (SCF). For example these two configuration are adjacent frames in simulation. Following the tutorial, scanner can receive the mol2 as input to return qm results following the previous qm calculation result. But mol2 does not include the information of MM atoms. And I also can not include the MM information by applying itrf.add_mm_charges to mol2. That is to say,

itrf.add_mm_charges(mol2,coord_mm1, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)

will results in a error.

So I update the coordinate of MM atoms by set_geom_ and do calculation with mol2.

e_scanner.mm_mol.set_geom_(coord_mm2)
e = e_scanner(mol2)

the returned force and energy will be very different from normal one.

mol2 = pyscf.M(atom=atom2, basis='3-21g', max_memory=40000)
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])
mf_GPU = itrf.add_mm_charges(mf_GPU,coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e = mf_GPU.kernel()

For the time being, I just store the density matrix derived from make_rdm1() in last calculation and pass this to the kernel, that is to say:

mol2 = pyscf.M(atom=atom2, basis='3-21g', max_memory=40000)
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])
mf_GPU = itrf.add_mm_charges(mf_GPU,coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e = mf_GPU.kernel(dm0=dm)

where dm is density matrix of conformation 1.

wuchiz avatar Mar 21 '25 07:03 wuchiz

By the way, If my simulation program can return a cupy array containing coordinates in gpu, can gpu4pyscf take this as input to avoid the data transfer between cpu and gpu?

wuchiz avatar Mar 21 '25 07:03 wuchiz

@wuchiz I found two issues in your script. First

itrf.add_mm_charges(mol2,coord_mm1, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)

does not work because add_mm_charges takes mf object in the first argument, not mol object. Second, Bohr unit is used internally in qmmm.pbc. You will need to convert it from Angstrom to Bohr when calling mm_mol.set_geom_. (cc @MoleOrbitalHybridAnalyst , mm_mol.unit is hardcoded now. There are some compatibility issues when calling functions derived from cell or mole.)

Here is the script after the corrections.

import numpy as np
import cupy
import pyscf
from pyscf.lib import param
from gpu4pyscf.scf import hf
from gpu4pyscf.qmmm.pbc import itrf, mm_mole
from gpu4pyscf.dft import rks

atom ='''
O       0.0000000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

atom1 ='''
O       0.0100000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

atom2 ='''
O       -0.0100000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

mol = pyscf.M(atom=atom, basis='3-21g', max_memory=40000)
mol1 = pyscf.M(atom=atom1, basis='3-21g', max_memory=40000)
mol2 = pyscf.M(atom=atom2, basis='3-21g', max_memory=40000)
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()

coord_mm1 = np.array( [[1,2,-1],[3,4,5]])
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])

# Method #1
mf_GPU = itrf.add_mm_charges(mf_GPU, coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e_scanner = mf_GPU.as_scanner()
e_scanner.mm_mol.set_geom_(coord_mm2/param.BOHR)
e = e_scanner(mol2)

# Method #2
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()
e_scanner = mf_GPU.as_scanner()

# Update MM atoms and charges
e_scanner_with_mm = itrf.add_mm_charges(e_scanner, coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e = e_scanner_with_mm(mol2)

# Check results
mf_GPU = rks.RKS(mol2,xc='b3lyp').density_fit()
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])
mf_GPU = itrf.add_mm_charges(mf_GPU, coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e = mf_GPU.kernel()

wxj6000 avatar Mar 21 '25 21:03 wxj6000

By the way, If my simulation program can return a cupy array containing coordinates in gpu, can gpu4pyscf take this as input to avoid the data transfer between cpu and gpu?

GPU4PySCF does not take CuPy arrays for the atomic coordinates. The latency of these data transfers between CPU and GPU is almost negligible compared with SCF calculations. If you find that the cost of data transfer is significant, please let us know.

wxj6000 avatar Mar 21 '25 21:03 wxj6000

I recommend building mol and mf every step instead of using the scanner because the cutoff for multipoles can be updated on the fly.

Here is an example https://github.com/MoleOrbitalHybridAnalyst/qmmm_archive/blob/main/33241wat/energy_conservation/pbe/nve/qmmm_efv.py

I recommend building mol and mf every step instead of using the scanner because the cutoff for multipoles can be updated on the fly.

OK. If the scanner is not recommended, I will disable it in the future.

And I feel that a simple example of MD simulations with QM/MM module is quite useful.

wxj6000 avatar Mar 22 '25 02:03 wxj6000

@wxj6000 @MoleOrbitalHybridAnalyst. Thank you so much for your help! I have already integrated QMMM-PBC into Sponge (https://spongemm.cn/zh/home) successfully following the example. It's a quite useful example for me.

For the methods of using scanner provided by @wxj6000, I slightly changed the code finding that simply modifying the mm coordinates by set_geom_ still resulted in wrong results (method 1). But applying add_mm_charge to original scanner worked well. It seems that some important quantities in QMMM class will not be updated if just changing MM coordinate by set_geom_.

import numpy as np
import cupy
import pyscf
from pyscf.lib import param
from gpu4pyscf.scf import hf
from gpu4pyscf.qmmm.pbc import itrf, mm_mole
from gpu4pyscf.dft import rks

atom ='''
O       0.0000000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

atom1 ='''
O       0.0100000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

atom2 ='''
O       -0.0100000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

mol = pyscf.M(atom=atom, basis='3-21g', max_memory=40000)
mol1 = pyscf.M(atom=atom1, basis='3-21g', max_memory=40000)
mol2 = pyscf.M(atom=atom2, basis='3-21g', max_memory=40000)
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()

coord_mm1 = np.array( [[1,2,-1],[3,4,5]])
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])


# Method #1
mf_GPU = itrf.add_mm_charges(mf_GPU, coord_mm1, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e_scanner = mf_GPU.as_scanner()
e1 = e_scanner(mol1)
e_scanner.mm_mol.set_geom_(coord_mm2/param.BOHR)
e2_1 = e_scanner(mol2)

# Method #2
mf_GPU = rks.RKS(mol,xc='b3lyp').density_fit()
mf_GPU = itrf.add_mm_charges(mf_GPU, coord_mm1, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e_scanner = mf_GPU.as_scanner()
e1 = e_scanner(mol1)

# Update MM atoms and charges
e_scanner_with_mm = itrf.add_mm_charges(e_scanner, coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e2_2 = e_scanner_with_mm(mol2)

# Check results
mf_GPU = rks.RKS(mol2,xc='b3lyp').density_fit()
coord_mm2 = np.array( [[1,2.5,-1],[3,4,5]])
mf_GPU = itrf.add_mm_charges(mf_GPU, coord_mm2, np.eye(3)*12, np.array([-0.8,0.8]), [0.8,1.2], rcut_ewald=8, rcut_hcore=6)
e2_ref = mf_GPU.kernel()

wuchiz avatar Mar 28 '25 09:03 wuchiz

@wuchiz Great! set_geom_ will be disabled in the future. See https://github.com/pyscf/gpu4pyscf/pull/359

wxj6000 avatar Mar 28 '25 15:03 wxj6000