yws

Results: 4 comments by yws

> I am not sure whether it is a bug in the code: Deleting older checkpoint [/home/azureuser/cloudfiles/code/Users/jbing/code/dolly/local_output_dir_0325/checkpoint-1400] due to args.save_total_limit
>
> My latest checkpoint is checkpoint-1400 and the one...

> I am retrying by changing the following parameters in the trainer code: save_total_limit=3, load_best_model_at_end=False,
>
> I think it might be a bug in transformers.
>
> @yinwangsong Do...

> > "and for practical use, you can use the accumulation attention scores obtained from the whole prefilling stage"
>
> Did you use scores from prefilling stage for any...