fairseq2
Adding support for wav2vec2 large models.
What does this PR do? Please describe: Adds support for the wav2vec2 large models trained on LibriSpeech 960h (ls960) and LibriVox 60k (lv60k).
Tested with these simple code snippets:

lv60k:

```python
from fairseq2.models.wav2vec2.setup import load_wav2vec2_model
import torch

model = load_wav2vec2_model("wav2vec2_large_lv60k", device=torch.device("cuda:0"), dtype=torch.float32)
print(model)
```

Output: P1230849615

ls960:

```python
from fairseq2.models.wav2vec2.setup import load_wav2vec2_model
import torch

model = load_wav2vec2_model("wav2vec2_large_ls960", device=torch.device("cuda:0"), dtype=torch.float32)
print(model)
```

Output: P1230849916
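
The two manual checks above differ only in the model name, so they could be folded into one loop. The following is a minimal sketch, not part of this PR: `smoke_load` and `MODEL_NAMES` are hypothetical names, and the loader is passed in as a callable so the helper makes no assumptions about the fairseq2 API beyond the call shape used in the snippets above.

```python
from typing import Any, Callable, Dict, Iterable

# Model names taken from the snippets in this description.
MODEL_NAMES = ("wav2vec2_large_lv60k", "wav2vec2_large_ls960")


def smoke_load(
    loader: Callable[..., Any],
    names: Iterable[str] = MODEL_NAMES,
    **loader_kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: load each named model once and return them by name."""
    models: Dict[str, Any] = {}
    for name in names:
        # Forward device/dtype (or any other keyword) unchanged to the loader.
        models[name] = loader(name, **loader_kwargs)
    return models
```

Assuming fairseq2 is installed, calling `smoke_load(load_wav2vec2_model, device=torch.device("cuda:0"), dtype=torch.float32)` would repeat both checks with the same arguments as above.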
Fixes #{issue number}
Does your PR introduce any breaking changes? If yes, please list them: List of all backwards-incompatible changes.
Check list:
- [ ] Was the content of this PR discussed and approved via a GitHub issue? (no need for typos or documentation improvements)
- [ ] Did you read the contributor guideline?
- [ ] Did you make sure that your PR does only one thing instead of bundling different changes together?
- [ ] Did you make sure to update the documentation with your changes? (if necessary)
- [ ] Did you write any new necessary tests?
- [ ] Did you verify new and existing tests pass locally with your changes?
- [ ] Did you update the CHANGELOG? (no need for typos, documentation, or minor internal changes)