
How to restart training from NN in last iteration ?

Open Xi-yuanWang opened this issue 5 years ago • 8 comments

dpgen trains the neural network from scratch in every iteration. I wonder how to restart training from the train stage of the last iteration.

Xi-yuanWang avatar Nov 02 '20 04:11 Xi-yuanWang

Modify record.dpgen in your directory.

njzjz avatar Nov 02 '20 17:11 njzjz

You can add the following keys (for example) to the training section of param.json. In this case, starting from iteration 10 (`training_reuse_iter`), your training will restart from the NN model of the previous iteration (iteration 9):

```json
"training_reuse_iter": 10,
"training_reuse_old_ratio": 0.2,
"training_reuse_start_lr": 1e-4,
"training_reuse_stop_batch": 200000,
"training_reuse_start_pref_e": 0.1,
"training_reuse_start_pref_f": 100
```

Manyi-Yang avatar Nov 03 '20 11:11 Manyi-Yang

Thanks, but it seems that I have to modify the param.json for every new iteration. Is there a more automatic way?

Xi-yuanWang avatar Nov 04 '20 03:11 Xi-yuanWang

No, you don't need to modify param.json for every new iteration. `"training_reuse_iter": 10` means that from iteration 10 onward, training will always restart from the NN model of the latest iteration.

Manyi-Yang avatar Nov 05 '20 10:11 Manyi-Yang

Thanks, then what does "training_reuse_old_ratio" mean?

Xi-yuanWang avatar Nov 05 '20 14:11 Xi-yuanWang

Since your training restarts from the old model, which was already trained on structures generated in former iterations, the new training should pay more attention to the new configurations. You can therefore use this option to add only a portion of the structures from former iterations to the new training set.
`"training_reuse_old_ratio": 0.2` means that in the new training set, only 20% of the structures come from old iterations.

Manyi-Yang avatar Nov 06 '20 09:11 Manyi-Yang

Thanks, but when restarting from the last iteration, DeePMD raises an error: "probability doesn't sum to 1". How should I deal with it?

Xi-yuanWang avatar Nov 09 '20 00:11 Xi-yuanWang
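For context on the error above: sampling probabilities over the training systems must form a valid distribution, and accumulated floating-point ratios can narrowly miss summing to 1. The snippet below is a generic illustration of that failure mode and of renormalization as a fix, not the actual check inside DeePMD:

```python
# Intended to sum to 1.0, but each 0.1 carries binary rounding error,
# so the accumulated sum may differ from 1.0 by a few ulps.
probs = [0.1] * 3 + [0.7]

# Renormalizing restores a valid probability distribution.
total = sum(probs)
probs = [p / total for p in probs]
assert abs(sum(probs) - 1.0) < 1e-12
```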

This error no longer occurs when restarting training from the train stage of the last iteration. If your problem hasn't been solved yet, could you provide your data so we can reproduce it?

HuangJiameng avatar Jul 12 '22 08:07 HuangJiameng