asusdisciple
I stumbled upon this thread after benchmarking insanely-fast-whisper, seamless-m4t-v2, faster-whisper, and the Hugging Face implementation of Whisper based on the Transformers pipeline with BetterTransformer. I found a bug related to this...
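For context, my benchmarks all used a harness of roughly this shape. The backend here is a stand-in lambda, not any of the actual model calls; in practice the `transcribe` argument would wrap e.g. faster-whisper's `WhisperModel.transcribe` or a Transformers ASR pipeline call:

```python
import time

def benchmark(name, transcribe, audio, runs=3):
    """Time a transcription backend over several runs (hypothetical harness)."""
    # Warm-up call so model loading / kernel compilation is not timed.
    transcribe(audio)
    start = time.perf_counter()
    for _ in range(runs):
        transcribe(audio)
    elapsed = (time.perf_counter() - start) / runs
    return {"backend": name, "mean_seconds": elapsed}

# Stand-in backend so the harness itself is runnable without any models.
result = benchmark("dummy", lambda audio: audio.upper(), "some audio")
print(result)
```

The warm-up run matters: the first call is often dominated by one-time setup and would otherwise skew the mean.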
Yeah, I found a high-level explanation, but to my knowledge it is still necessary to train all the models, because the codebook apparently does not exist anymore. I do...
Ah, I see, I found the repo, I think. I have access to a few A100s and will try to train and/or fine-tune the Tortoise model on my native...
Got the same error when trying to compute a confusion matrix in a callback, when I call `metric.plot()`:
```
def on_test_epoch_end(self) -> None:
    metric = MulticlassConfusionMatrix(num_classes=self.num_classes).to("cpu")
    outputs = torch.cat(self.x_test, dim=0).to("cpu")
    ...
```
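For anyone debugging this: what `MulticlassConfusionMatrix` computes is just counts of (true, predicted) class pairs, so it is easy to sanity-check the inputs against a dependency-free version before blaming the plot call. A minimal sketch (plain Python, not the torchmetrics API):

```python
def confusion_matrix(targets, preds, num_classes):
    """matrix[i][j] = number of samples with true class i predicted as class j."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(targets, preds):
        m[t][p] += 1
    return m

# Diagonal entries are correct predictions; off-diagonal entries are confusions.
cm = confusion_matrix([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
```

If this disagrees with what the metric reports, the problem is in how the predictions/targets are gathered across batches, not in the plotting.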
Oh, I see, thanks for the clarification. However, is there any way to set the target language to the source language for ASR? For example, when I do not want...
Also, if I may add: unfortunately, none of the MMS versions include all of the languages in m4t-v2. These languages are not supported by MMS, which makes it kinda hard...
Would also be interested.
I would like to know this as well. How can we set the target language to the source language in M4Tv2? For audio you often don't know the language. Is there...
Thanks for your fast answer! This makes sense in a way. If character length influences the result, my question would be: how does the model behave if the chunk is...
Not really; I use Ubuntu with an A100 and 80 GB of RAM.