[BUG] DP potential in LAMMPS cannot be cleared from RAM
Bug summary
Hi~! I want to run a long series (maybe 10000+) of optimization calculations in LAMMPS with the DP potential to investigate the structure evolution.
The procedure looks like:
optimization > change the structure (change the volume, move some atoms, ...) > optimization
LAMMPS requires a `clear` every time before we `read_data` the new structure.
LAMMPS stops after a few hundred of these simulations because it runs out of memory.
I monitored the calculation and found that RAM usage increased by several GB each time a new simulation started. This problem did not happen when I used an empirical potential.
So I suspect that the `clear` command in LAMMPS cannot remove the DP potential from RAM.
A simplified version of the LAMMPS input files is attached.
If the potential cannot be cleared from RAM, can we instead keep using the potential defined in the previous step after changing the structure? The sketch below shows what I have in mind.
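For example (an untested sketch; it assumes the simulation box can stay the same between structures, and `new_structure.data` is a placeholder file name):

```
# keep the already-loaded DP potential and replace only the atoms
delete_atoms group all                    # remove the old atoms, keep the box and pair_style
read_data new_structure.data add merge    # read the new atoms into the existing box
minimize 1.0e-7 1.0e-10 1000 10000
```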
best wishes!
OOM_dp_lammps.zip
DeePMD-kit Version
2.1.3
TensorFlow Version
2.9
LAMMPS Version
23Jun2022 (Built-in mode)
How did you download the software?
Built from source
Input Files, Running Commands, Error Log, etc.
LAMMPS input:
```
label loop
variable i loop 200000
clear

units metal
boundary p p p
atom_style atomic
neighbor 2.0 bin
neigh_modify every 1
compute tmp all pe
read_data carbon.data

#pair_style airebo 3.0
#pair_coeff * * CH.airebo C
pair_style deepmd test.pb
pair_coeff * *

thermo 1
thermo_style custom step pe temp c_tmp
minimize 1.0e-7 1.0e-10 1000 10000

next i
jump SELF loop
label break
quit
```
Steps to Reproduce
Run LAMMPS with the DP potential; after a few hundred loops there will be an OOM error.
Further Information, Files, and Links
No response
`session->Close();` may need to be called explicitly.
Hi jinzhe!
Sorry, I couldn't figure out where to call `session->Close()`.
Do you mean that I need to close and restart a new LAMMPS session for each simulation? That is actually what I did in the first place, because LAMMPS initialization is much faster with the empirical potential.
But this workflow consists of 100000+ very short simulations; if I initialize LAMMPS 100000 times with the DP potential, it will take a very long time.
So I hope to do all the simulations in one run, with a procedure like: initialize LAMMPS > [run > call Python to modify the structure] x 10000+ > next run; see the sketch below.
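A rough, untested sketch of that procedure (`modify_structure.py` and `next_structure.data` are placeholder names; it assumes the box does not need to change between steps):

```
units metal
boundary p p p
atom_style atomic
read_data carbon.data
pair_style deepmd test.pb                  # load the DP model only once
pair_coeff * *

label loop
variable i loop 10000
minimize 1.0e-7 1.0e-10 1000 10000
write_data current_structure.data          # dump the current structure for the script
shell python modify_structure.py           # should write next_structure.data
delete_atoms group all                     # remove old atoms without clearing the potential
read_data next_structure.data add merge    # swap in the modified structure
next i
jump SELF loop
```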
But the RAM problem won't allow me to run that many loops.
best wishes!
It looks like TF still has a memory-leak issue that they do not want to resolve, so there is no way to clear the memory.
I'll close this issue since most memory leaks happen in TensorFlow, and it seems that TensorFlow doesn't want to resolve this issue.