fairseq-tensorboard
NOTE: the functionality in this library is already present in fairseq since commit 257a3b8 (included in fairseq release 0.6.3).
This is a small utility to monitor fairseq training in tensorboard.
It is not a fork of fairseq, but just a small class that extends its functionality with tensorboard logging.
Installation and Usage
You just need to clone fairseq-tensorboard, install its only direct dependency apart from fairseq itself (tensorboardX), and launch fairseq's train.py, specifying monitored_translation as the task:
pip install tensorboardX
git clone https://github.com/noe/fairseq-tensorboard.git
python fairseq/train.py \
--user-dir ./fairseq-tensorboard/fstb \
--task monitored_translation [...]
Features
- Logs fairseq training and validation losses.
- Saves sys.argv and fairseq's args along with the model, for traceability.
- Allows plotting training and validation losses in the same plot.
- Supports multi-GPU training.
FAQ
Why should I use fstb?
Because it allows you to visually diagnose your losses!
Instead of training and validation losses in separate plots, you get both curves overlaid in the same tensorboard plot, which makes divergence between them easy to spot at a glance.
How can fairseq load fstb?
You have to provide fairseq with the command-line argument --user-dir pointing to
the path of fstb. This instructs fairseq to load the fstb code, which
registers the task monitored_translation.
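Under the hood, --user-dir works through fairseq's task registry: importing the user module runs a registration decorator that stores the task class in a global dictionary keyed by task name, so --task can later look it up by string. Here is a minimal, self-contained sketch of that decorator-registry pattern in plain Python; the names TASK_REGISTRY and register_task mirror fairseq's, but the task classes below are hypothetical stand-ins, not fstb's actual code:

```python
# Sketch of the decorator-based task registry behind fairseq's --user-dir.
# TASK_REGISTRY / register_task mirror fairseq's names; the classes are
# illustrative stand-ins, not the real fairseq or fstb implementations.

TASK_REGISTRY = {}

def register_task(name):
    """Decorator that registers a task class under the given name."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"task {name!r} already registered")
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("translation")
class TranslationTask:
    """Stand-in for fairseq's built-in translation task."""

@register_task("monitored_translation")
class MonitoredTranslationTask(TranslationTask):
    """Stand-in for fstb's task: translation plus tensorboard logging."""

# Importing the user module executed the decorators above, so the task
# named on the command line can now be resolved to a class:
task_cls = TASK_REGISTRY["monitored_translation"]
print(task_cls.__name__)  # -> MonitoredTranslationTask
```

Because the extension is just a registered subclass, fstb can reuse all of the translation task's behaviour and only add logging on top, which is why it does not need to fork fairseq.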
Does fstb work with multi-GPU training?
Yes, it has been tested with single-node multi-GPU training. Only the first worker process logs to tensorboard. The behaviour of the remaining workers is unaltered.
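Logging only from the first worker is a standard pattern in multi-GPU training: each process checks its rank, and only rank 0 creates a real writer while the others get a no-op stand-in. A minimal sketch of that guard, assuming a distributed_rank value like the one fairseq passes to workers (the NullWriter and RecordingWriter classes here are illustrative, not fstb's actual code):

```python
# Sketch of rank-0-only logging for multi-GPU training: the primary
# worker gets a real writer, all other workers get a no-op stand-in.
# NullWriter / RecordingWriter are illustrative names, not fstb's API.

class NullWriter:
    """No-op writer used by non-primary workers."""
    def add_scalar(self, tag, value, step):
        pass  # swallow the call; nothing is logged

class RecordingWriter:
    """Stand-in for a tensorboardX SummaryWriter on the primary worker."""
    def __init__(self):
        self.events = []
    def add_scalar(self, tag, value, step):
        self.events.append((tag, value, step))

def make_writer(distributed_rank):
    # Only worker 0 logs to tensorboard; the others behave as usual.
    return RecordingWriter() if distributed_rank == 0 else NullWriter()

# Worker 0 records the loss; worker 1's identical call is a no-op.
w0, w1 = make_writer(0), make_writer(1)
w0.add_scalar("train/loss", 4.2, step=100)
w1.add_scalar("train/loss", 4.3, step=100)
print(len(w0.events))  # -> 1
```

Because the non-primary workers receive an object with the same interface, no training code needs to branch on rank at each logging call, which is why their behaviour is unaltered.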