fairseq2
Main w2v2 pretraining
What does this PR do? Please describe:
Merges the MMS team's working fs2 branch into fs2 main. The PR contains:
- dataloading implementations for w2v2 pretraining and ASR finetuning
- additional `SpeechReadOptions` fields
- LLM-ASR model and decoding implementation
- RNN-T ASR model and decoding implementation
- small changes to `load_wav2vec2_asr_trainer()` to support validating on multiple dev sets
- update wandb reporting to report metrics separately for all dev splits
- support specifying `score_metric` to determine which metric to use for dev model selection in ASR
We use this with the main branch of the mms2 repo.
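
For reviewers, the per-split reporting and `score_metric`-based selection listed above can be illustrated with a rough sketch. This is not the code in this PR and does not use fairseq2 APIs: the helper names (`flatten_for_logging`, `selection_score`, `is_better`), the `LOWER_IS_BETTER` set, and the choice to average the metric across dev splits are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Assumption: metrics for which a lower value means a better model.
LOWER_IS_BETTER = {"wer", "cer", "loss"}

MetricsBySplit = Dict[str, Dict[str, float]]


def flatten_for_logging(metrics_by_split: MetricsBySplit) -> Dict[str, float]:
    """Turn {split: {metric: value}} into {"split/metric": value} so that
    each dev split is reported under its own key."""
    return {
        f"{split}/{name}": value
        for split, metrics in metrics_by_split.items()
        for name, value in metrics.items()
    }


def selection_score(metrics_by_split: MetricsBySplit, score_metric: str) -> float:
    """Average the chosen metric over all dev splits (an assumed policy)."""
    values = [metrics[score_metric] for metrics in metrics_by_split.values()]
    return sum(values) / len(values)


def is_better(new: float, best: Optional[float], score_metric: str) -> bool:
    """Compare two scores, respecting whether lower or higher is better."""
    if best is None:
        return True
    if score_metric in LOWER_IS_BETTER:
        return new < best
    return new > best


# Hypothetical dev-split results for one validation step.
dev_metrics = {
    "dev_clean": {"wer": 10.0, "loss": 1.0},
    "dev_other": {"wer": 20.0, "loss": 2.0},
}
flat = flatten_for_logging(dev_metrics)
score = selection_score(dev_metrics, "wer")
```

The flattened dictionary is the shape one would typically pass to a logger so that every dev split gets its own series, while `score` is the single number compared across checkpoints.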