
[BUG] dp potential in lammps cannot be cleared from RAM

Open Johnsyisme opened this issue 3 years ago • 3 comments

Bug summary

Hi! I want to run a long series (maybe 10000+) of optimization calculations in LAMMPS with the dp potential to investigate structure evolution.

The procedure is: optimization > change the structure (change the volume, move some atoms, ...) > optimization. LAMMPS requires a `clear` every time before running `read_data` on the new structure.

LAMMPS stops after a few hundred MD simulations with an out-of-memory (OOM) error.

I monitored the calculation and found that the RAM in use increased by several GB each time a new MD run started. This problem did not happen when I used the empirical potential.
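The per-run growth described above can be logged instead of watched by hand. The following is a minimal sketch, assuming LAMMPS is driven in-process through its Python module on a Unix-like system (the standard `resource` module is not available on Windows). The names `peak_rss_mib`, `log_peak_growth`, and the `step` callback are hypothetical and not part of DeePMD-kit or LAMMPS.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kibibytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_peak_growth(step, n_iterations):
    """Call step(i) repeatedly and record the peak RSS after each call.

    step is a hypothetical callback that performs one
    clear / read_data / minimize cycle.
    """
    history = []
    for i in range(n_iterations):
        step(i)
        history.append(peak_rss_mib())
    return history
```

A history that keeps climbing by a similar amount per iteration with `pair_style deepmd`, but stays flat with the empirical potential, would match the behavior reported here. Because the value is a peak, it never decreases, so it shows growth but not memory that is later released.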

So I think the `clear` command in LAMMPS may not be able to remove the dp potential from RAM.

Simplified LAMMPS input files are attached.

If the potential cannot be cleared from RAM, can we reuse the potential defined in the previous step after changing the structure?

Best wishes!

20220721145415 OOM_dp_lammps.zip

DeePMD-kit Version

2.1.3

TensorFlow Version

2.9

LAMMPS Version

23Jun2022 (Built-in mode)

How did you download the software?

Built from source

Input Files, Running Commands, Error Log, etc.

LAMMPS input:

```
label loop
variable i loop 200000

clear

units metal
boundary p p p
atom_style atomic
neighbor 2.0 bin
neigh_modify every 1
compute tmp all pe
read_data carbon.data

#pair_style airebo 3.0
#pair_coeff * * CH.airebo C

pair_style deepmd test.pb
pair_coeff * *

thermo 1
thermo_style custom step pe temp c_tmp
minimize 1.0e-7 1.0e-10 1000 10000

next i
jump SELF loop
label break
quit
```

Steps to Reproduce

Run LAMMPS with the dp potential. After a few hundred loop iterations, an OOM error occurs.

Further Information, Files, and Links

No response

Johnsyisme, Jul 21 '22 06:07

session->Close(); may need to be called explicitly.

njzjz, Jul 21 '22 19:07

Hi Jinzhe! Sorry, I couldn't figure out where to call `session->Close()`.

Do you mean that I need to close LAMMPS and start a new session for each MD simulation? That is actually what I did in the first place, because LAMMPS initialization is much faster when I use the empirical potential.

But this simulation consists of 100000+ very short MD simulations. If I initialize LAMMPS 100000 times with the dp potential, it will take a lot of time.

So I hope to do all the simulations in one session, with a procedure like: initialize LAMMPS > [MD > call Python to modify the structure] x 10000+ > next MD

But the RAM problem does not allow me to run that many loops.
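The single-session procedure above could look roughly like the following. This is only a sketch, not something confirmed in this thread: it assumes the number of atoms and their types stay fixed between steps, so the structure can be edited in place and neither `clear` nor a second `pair_style deepmd` is needed. `run_series`, `SETUP`, and the `modify_positions` callback are hypothetical names. `command`, `gather_atoms`, and `scatter_atoms` are methods of the LAMMPS Python module, and `atom_modify map yes` is added because `scatter_atoms` needs an atom map. Whether this actually avoids the memory growth has not been tested here.

```python
# One-time setup, adapted from the input script in this issue.
# "atom_modify map yes" is an addition needed by scatter_atoms.
SETUP = [
    "units metal",
    "boundary p p p",
    "atom_style atomic",
    "atom_modify map yes",
    "read_data carbon.data",
    "pair_style deepmd test.pb",
    "pair_coeff * *",
]


def run_series(lmp, n_iterations, modify_positions):
    """Minimize repeatedly in one LAMMPS session without calling clear.

    lmp              -- object exposing the LAMMPS Python interface
    modify_positions -- hypothetical callback(x, i) that edits the flat
                        coordinate array x in place for iteration i
    """
    for line in SETUP:
        lmp.command(line)  # the dp model is loaded only once, here

    for i in range(n_iterations):
        lmp.command("minimize 1.0e-7 1.0e-10 1000 10000")
        # Flat array [x1, y1, z1, x2, ...] ordered by atom ID (1 = double).
        x = lmp.gather_atoms("x", 1, 3)
        modify_positions(x, i)
        # Push the edited coordinates back for the next minimization.
        lmp.scatter_atoms("x", 1, 3, x)
```

With a real installation this would be driven by `from lammps import lammps` followed by `run_series(lammps(), 10000, my_modifier)`, where `my_modifier` is your own function. A volume change would need an extra `change_box` command inside the loop, and a change in atom count would still require re-reading the data file.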

best wishes!

Johnsyisme, Jul 22 '22 03:07

It looks like TF still has a memory leak and its developers do not want to resolve it, so there is no way to clear the memory.
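Memory leaked inside a process is returned to the operating system when that process exits, so one possible compromise (not proposed in this thread) is to run the loop in chunks, each in a fresh LAMMPS process. The model is then loaded once per chunk rather than once per iteration. The sketch below assumes a LAMMPS executable named `lmp` and an input script `in.chunk` that loops from `${start}` to `${stop}`. Both names are placeholders, while `-in` and `-var` are standard LAMMPS command-line switches.

```python
import subprocess


def chunk_ranges(total, chunk_size):
    """Split range(total) into consecutive (start, stop) pairs."""
    return [(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


def lammps_command(start, stop):
    """Build the command line for one chunk.

    Hypothetical: "lmp" and "in.chunk" are placeholders for your own
    executable and input script.
    """
    return ["lmp", "-in", "in.chunk",
            "-var", "start", str(start),
            "-var", "stop", str(stop)]


def run_in_fresh_processes(total, chunk_size, make_command=lammps_command):
    """Run each chunk in its own process and return the number of chunks.

    Memory leaked during a chunk is released when its process exits.
    """
    chunks = chunk_ranges(total, chunk_size)
    for start, stop in chunks:
        # check=True stops the series if one chunk fails.
        subprocess.run(make_command(start, stop), check=True)
    return len(chunks)
```

The chunk size trades initialization time against peak memory: it should stay below the few hundred iterations after which the OOM error was reported.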

njzjz, Jul 24 '22 20:07

I'll close this issue since most memory leaks happen in TensorFlow, and it seems that TensorFlow doesn't want to resolve this issue.

njzjz, Oct 18 '23 18:10