Main w2v2 pretraining

Open artemru opened this issue 1 year ago • 0 comments

What does this PR do? Please describe:

Merges the MMS team's working fs2 branch into fs2 main. The PR contains:

- data-loading implementations for w2v2 pretraining and ASR finetuning
- additional `SpeechReadOptions` fields
- LLM-ASR model and decoding implementation
- RNN-T ASR model and decoding implementation
- small changes to `load_wav2vec2_asr_trainer()` to support validating on multiple dev sets
- updated wandb reporting so that metrics are reported separately for each dev split
- support for specifying `score_metric` to determine which metric is used for dev-set model selection in ASR
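
To illustrate the last three items, here is a minimal sketch of per-split reporting and metric-driven model selection. It is not the code in this PR: the helper names (`flatten_for_logging`, `select_best`), the `LOWER_IS_BETTER` set, the split names, and the numbers are all hypothetical, and averaging the score metric across dev splits is an assumption made only for this example.

```python
from statistics import mean

# Hypothetical set of metrics where a lower value is better (assumption).
LOWER_IS_BETTER = {"wer", "uer", "loss"}


def flatten_for_logging(per_split):
    """Prefix each metric with its split name, e.g. 'dev_clean/wer',
    so every dev split shows up as its own series in a logger."""
    return {
        f"{split}/{name}": value
        for split, metrics in per_split.items()
        for name, value in metrics.items()
    }


def select_best(checkpoints, score_metric):
    """Return the step whose score_metric, averaged over dev splits, is best.

    Averaging over splits is an assumption of this sketch.
    """
    def score(step):
        return mean(m[score_metric] for m in checkpoints[step].values())

    pick = min if score_metric in LOWER_IS_BETTER else max
    return pick(checkpoints, key=score)


# Made-up validation results: step -> split -> metric -> value.
checkpoints = {
    1000: {"dev_clean": {"wer": 12.0, "loss": 1.9},
           "dev_other": {"wer": 25.0, "loss": 2.4}},
    2000: {"dev_clean": {"wer": 10.0, "loss": 2.0},
           "dev_other": {"wer": 21.0, "loss": 2.5}},
}

best_by_wer = select_best(checkpoints, "wer")
best_by_loss = select_best(checkpoints, "loss")
log_payload = flatten_for_logging(checkpoints[1000])
```

With these made-up numbers, selecting by `wer` and selecting by `loss` pick different steps, which is why making the score metric configurable matters when several metrics are tracked across several dev sets.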

We use this with the main branch of the mms2 repo.

Apr 01 '25 09:04 artemru