transformers

Two tfevent files are being generated for each run of trainer

Open XanderWA opened this issue 2 years ago • 4 comments

System Info

For each run of the trainer, two tfevents files are generated. The directory looks like this:

/runs
--Feb27_09-46-42_...
----events.out.tfevents....0
----/1677491207.0429652
------events.out.tfevents....1

When reading these files with TensorBoard, I don't get any output from the .1 file. How can I get rid of it (these files clutter my TensorBoard), or get actual data from it?

Thanks in advance

@sgugger

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Simply starting any training.

Expected behavior

One tfevents file or a valid output from the .1 file.

XanderWA avatar Feb 27 '23 11:02 XanderWA

I have no idea. Let us know if you find the reason/how to fix it!

sgugger avatar Feb 28 '23 12:02 sgugger

I also face this issue when adding the TensorBoard callback in examples/language_modeling/run_mlm.py.

Ch4osMy7h avatar Mar 02 '23 03:03 Ch4osMy7h

I face the same issue:

  • transformers: 4.26.1
  • tensorboard: 2.12.0

Sanster avatar Mar 02 '23 03:03 Sanster

I also face this issue. As a workaround, I use this command to delete the duplicated directory, in case somebody is really annoyed by this.

 find . -type d -name "*[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9][0-9][0-9][0-9]" -exec rm -rv {} \;
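The same cleanup can be sketched in Python, which may help where `find` is unavailable or a dry run is wanted first. This is only a sketch under one assumption: the extra directory is named like a Unix timestamp float, as in the `1677491207.0429652` example from the original report. Note that the glob above expects exactly six digits after the dot, while that example has seven, so the sketch accepts any number of fractional digits. The names `TIMESTAMP_DIR`, `find_timestamp_dirs`, and `remove_timestamp_dirs` are hypothetical helpers, not part of transformers or TensorBoard.

```python
import re
import shutil
from pathlib import Path

# Assumption: the extra directory is named like a Unix timestamp float,
# e.g. "1677491207.0429652" (10 digits, a dot, then one or more digits).
TIMESTAMP_DIR = re.compile(r"^\d{10}\.\d+$")


def find_timestamp_dirs(root):
    """Return all timestamp-named directories under root, sorted by path."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_dir() and TIMESTAMP_DIR.match(p.name)
    )


def remove_timestamp_dirs(root, dry_run=True):
    """Delete the matching directories; with dry_run=True only report them."""
    found = find_timestamp_dirs(root)
    for path in found:
        # A nested match may already be gone if its parent was removed first.
        if not dry_run and path.exists():
            shutil.rmtree(path)
    return found
```

Calling `remove_timestamp_dirs("runs")` only lists what would be deleted; passing `dry_run=False` actually removes the directories, so it is worth checking the list first.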

hwijeen avatar Mar 13 '23 19:03 hwijeen

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 07 '23 15:04 github-actions[bot]