nmt
Checkpoints may be lost when the number of metrics is larger than two
The number of checkpoints that are kept is limited by max_to_keep, which is set in model.py as follows:
self.saver = tf.train.Saver(
    tf.global_variables(), max_to_keep=hparams.num_keep_ckpts)
The number of metrics, however, is not limited, and the same saver is used for every metric in train.py as follows:
if save_on_best and scores[metric] > getattr(hparams, best_metric_label):
  setattr(hparams, best_metric_label, scores[metric])
  model.saver.save(
      sess,
      os.path.join(
          getattr(hparams, best_metric_label + "_dir"), "translate.ckpt"),
      global_step=model.global_step)
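Because a single saver writes both the regular checkpoints and the best checkpoint for each metric, all of these files count against the same max_to_keep limit, and the oldest ones are deleted first.
If the number of metrics is larger than two, a saved best checkpoint may be lost.
If the number of metrics is larger than num_keep_ckpts, a saved best checkpoint is certain to be lost.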
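
To make the failure mode concrete, here is a minimal sketch that models the rotation with a plain queue. It does not call TensorFlow and is not the Saver's actual implementation. RotatingSaver, simulate, the metric names and the directory names are all hypothetical, and the sketch assumes the saver deletes the oldest file it wrote, regardless of directory, once more than max_to_keep files exist.

```python
from collections import deque


class RotatingSaver:
    """Toy stand-in (hypothetical) for a saver that keeps only the newest files."""

    def __init__(self, max_to_keep):
        self.max_to_keep = max_to_keep
        self.kept = deque()   # paths still on disk, oldest first
        self.deleted = []     # paths removed by rotation, in deletion order

    def save(self, path):
        # Re-saving an existing path moves it to the newest position.
        if path in self.kept:
            self.kept.remove(path)
        self.kept.append(path)
        # Assumption: one shared limit, no matter which directory is written.
        while len(self.kept) > self.max_to_keep:
            self.deleted.append(self.kept.popleft())


def simulate(max_to_keep, metrics, regular_saves_after_best):
    """Every metric improves at step 1000, then only regular saves happen."""
    saver = RotatingSaver(max_to_keep)
    for metric in metrics:
        saver.save("best_%s/translate.ckpt-1000" % metric)
    for i in range(regular_saves_after_best):
        saver.save("out/translate.ckpt-%d" % (2000 + i * 1000))
    return saver


# Three metrics, limit of 5: the third regular save evicts a best checkpoint.
may_lose = simulate(5, ["bleu", "rouge", "accuracy"], 3)
print(may_lose.deleted)  # ['best_bleu/translate.ckpt-1000']

# More metrics than the limit: a best checkpoint is lost immediately.
must_lose = simulate(2, ["bleu", "rouge", "accuracy"], 0)
print(must_lose.deleted)  # ['best_bleu/translate.ckpt-1000']
```

In this model, one way to avoid the problem would be a separate saver per best-metric directory, so that regular checkpoints never push a best checkpoint out of the queue. Whether that fits the nmt code base is a design decision for the maintainers.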