Training on a custom dataset
Hello, I was reading the training instructions (and the prepare dataset scripts) and I don't understand how you'd create and use custom datasets with this model.
@korakoe
You can skip lhotse download and lhotse prepare, but you must prepare the lhotse manifest files yourself, like https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/libritts.py#L84
More examples: https://github.com/lhotse-speech/lhotse/tree/master/lhotse/recipes
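For anyone landing here later, this is a rough sketch of what "prepare the manifests yourself" could look like with lhotse's Python API. It is not taken from this repo. It assumes a hypothetical layout where every `name.wav` has a `name.txt` transcript beside it, each file is a single utterance, and file stems are unique. The output file names are placeholders too; check what prefix the later training scripts expect.

```python
from pathlib import Path


def collect_pairs(corpus_dir):
    """Pair each .wav with a same-named .txt transcript (assumed layout)."""
    pairs = []
    for wav in sorted(Path(corpus_dir).rglob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.is_file():  # audio without a transcript is skipped
            pairs.append((wav, txt.read_text(encoding="utf-8").strip()))
    return pairs


def build_manifests(corpus_dir, output_dir, language="English"):
    """Write lhotse recording and supervision manifests for the corpus."""
    # Imported here so collect_pairs() works even without lhotse installed.
    from lhotse import (
        Recording,
        RecordingSet,
        SupervisionSegment,
        SupervisionSet,
        fix_manifests,
        validate_recordings_and_supervisions,
    )

    recordings, supervisions = [], []
    for wav, text in collect_pairs(corpus_dir):
        # Reads the audio header to fill in duration, sampling rate, etc.
        recording = Recording.from_file(wav, recording_id=wav.stem)
        recordings.append(recording)
        # One supervision spanning the whole file (single-utterance assumption).
        supervisions.append(
            SupervisionSegment(
                id=wav.stem,
                recording_id=wav.stem,
                start=0.0,
                duration=recording.duration,
                channel=0,
                text=text,
                language=language,
            )
        )

    recording_set = RecordingSet.from_recordings(recordings)
    supervision_set = SupervisionSet.from_segments(supervisions)
    recording_set, supervision_set = fix_manifests(recording_set, supervision_set)
    validate_recordings_and_supervisions(recording_set, supervision_set)

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Placeholder names, not necessarily what the training scripts look for.
    recording_set.to_file(out / "custom_recordings_train.jsonl.gz")
    supervision_set.to_file(out / "custom_supervisions_train.jsonl.gz")
```

Calling `build_manifests("path/to/corpus", "data/manifests")` should then give you the two manifest files in place of what lhotse prepare would normally produce.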
@lifeiteng this was very helpful. Thanks!
Is this used for transcribing? You can just use Whisper to transcribe most languages. Then convert the result into an HDF5 file plus texts and feed it via the dynamic batch sampler.
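In case it helps, here is a small sketch of the transcription step only, using the openai-whisper package. It assumes you want a `name.txt` written next to every `name.wav`; that layout, the function names, and the "base" model size are my own choices for illustration. The HDF5 conversion and the sampler are not covered here.

```python
from pathlib import Path


def transcribe_corpus(corpus_dir, transcribe, overwrite=False):
    """Write <name>.txt next to each <name>.wav using transcribe(path) -> str."""
    written = []
    for wav in sorted(Path(corpus_dir).rglob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.exists() and not overwrite:
            continue  # keep transcripts that already exist
        txt.write_text(transcribe(str(wav)).strip() + "\n", encoding="utf-8")
        written.append(txt)
    return written


def whisper_transcriber(model_name="base"):
    """Return a path -> text function backed by openai-whisper."""
    import whisper  # pip install openai-whisper (also needs ffmpeg)

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(path)["text"]
```

Usage would be something like `transcribe_corpus("path/to/corpus", whisper_transcriber())`. Passing the transcriber in as a function makes it easy to swap in another ASR model or to review the generated text before training on it.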