bchinnari
Hi, can somebody clarify?
Thanks for the response. I asked that question for the following reason: I am running VAD on a test file as follows ``` python examples/asr/speech_classification/vad_infer.py --config-path="../conf/vad" --config-name="vad_inference_postprocessing.yaml" dataset=test.json ```...
Is this possible? Has anyone observed this?
Ok. Here is what I did. I took a pretrained HF model (https://huggingface.co/vasista22/whisper-hindi-small) and fine-tuned it using my data. Then I converted the checkpoint to faster-whisper format. If I use...
When "word_timestamps=False", the output is as follows: ` Segment(id=1, seek=600, start=0.0, end=6.0, text='सितम्बर 19', tokens=[50364, 45938, 33279, 36158, 48521, 27099, 3941, 105, 25411, 1294], temperature=0.0, avg_logprob=-0.17912933772260492, compression_ratio=0.6857142857142857, no_speech_prob=1.3633834695708693e-14, words=None) `...
Kindly clarify one last thing. This is very important for my work. 1. Is Coqui XTTS-v2 commercially usable? 2. Could you suggest a TTS system that has a voice cloning feature...