
Adding support for wav2vec2 large models.

Open kauterry opened this issue 1 year ago • 0 comments

What does this PR do? Please describe:
Adds support for wav2vec2 large models trained on LibriSpeech 960h (ls960) and LibriVox 60k (lv60k).

Tested with this simple code snippet:

lv60k:

from fairseq2.models.wav2vec2.setup import load_wav2vec2_model
import torch

model = load_wav2vec2_model("wav2vec2_large_lv60k", device=torch.device("cuda:0"), dtype=torch.float32)
print(model)

Output: P1230849615

ls960:

from fairseq2.models.wav2vec2.setup import load_wav2vec2_model
import torch

model = load_wav2vec2_model("wav2vec2_large_ls960", device=torch.device("cuda:0"), dtype=torch.float32)
print(model)

Output: P1230849916
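The two manual checks above could also be folded into one loop. The following is only a sketch: `smoke_test` and `MODEL_NAMES` are hypothetical names introduced for illustration and are not part of fairseq2. The loader is passed in as an argument, so the helper makes no assumption about the loader's signature beyond "model name first, keyword arguments after", which matches how `load_wav2vec2_model` is called in the snippets above.

```python
from typing import Any, Callable, Dict, Iterable

# Model names taken from the two snippets above.
MODEL_NAMES = ("wav2vec2_large_lv60k", "wav2vec2_large_ls960")


def smoke_test(
    loader: Callable[..., Any],
    names: Iterable[str] = MODEL_NAMES,
    **kwargs: Any,
) -> Dict[str, str]:
    """Call loader(name, **kwargs) for each name and record the outcome.

    Hypothetical helper: returns a dict mapping each model name to "ok"
    or to "failed: <error message>".
    """
    results: Dict[str, str] = {}
    for name in names:
        try:
            loader(name, **kwargs)
            results[name] = "ok"
        except Exception as exc:
            # Record the failure and keep going so every model is checked.
            results[name] = f"failed: {exc}"
    return results
```

With fairseq2 installed and a GPU available, this would presumably be called as `smoke_test(load_wav2vec2_model, device=torch.device("cuda:0"), dtype=torch.float32)`, reusing the arguments from the snippets above.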

Fixes #{issue number}

Does your PR introduce any breaking changes? If yes, please list them: List of all backwards-incompatible changes.

Check list:

  • [ ] Was the content of this PR discussed and approved via a GitHub issue? (no need for typos or documentation improvements)
  • [ ] Did you read the contributor guideline?
  • [ ] Did you make sure that your PR does only one thing instead of bundling different changes together?
  • [ ] Did you make sure to update the documentation with your changes? (if necessary)
  • [ ] Did you write any new necessary tests?
  • [ ] Did you verify new and existing tests pass locally with your changes?
  • [ ] Did you update the CHANGELOG? (no need for typos, documentation, or minor internal changes)

kauterry, May 04 '24 21:05