About HUBERT_ASR_BASE pipeline

Open LYPinASR opened this issue 2 years ago • 3 comments

🚀 The feature

Hello!

I want to use the finetuned HuBERT_base model. However, torchaudio.pipelines only provides HUBERT_ASR_LARGE and HUBERT_ASR_XLARGE. What should I do to get a HUBERT_ASR_BASE model?

Motivation, pitch

the finetuned HuBERT_base model

Alternatives

No response

Additional context

No response

LYPinASR avatar Apr 03 '23 08:04 LYPinASR

Hi @LYPinASR, there is no HUBERT_ASR_BASE model because fairseq didn't release the model weights. What is your use case for the Base model, and which dataset (10 minutes, 1 hour, 10 hours, or 960 hours) do you want the Base model to be finetuned on?

nateanl avatar Apr 03 '23 20:04 nateanl

Thanks for your reply! I want to use the base model before and after finetuning to compare the output representations of each layer. As for the dataset, 10h is perfect. Best!

LYPinASR avatar Apr 03 '23 22:04 LYPinASR

I see. Thanks for sharing the context. I think you can run the finetuning script (https://github.com/pytorch/audio/blob/main/examples/hubert/finetune.py), loading the HUBERT_BASE model weights as the starting point. The finetuning recipe is designed for exactly the 10-hour subset of the Libri-Light dataset, and it only requires 1 GPU.

nateanl avatar Apr 04 '23 19:04 nateanl