Lalaramarya
@tlikhomanenko, I am also facing the same problem. Can you please suggest a solution to this?
Hi @cwx-worst-one, thank you very much for your prompt reply. Could you tell me the minimum amount of data required for fine-tuning on new languages? Currently, I have been training it...
@cwx-worst-one Thank you very much for your prompt reply. Yes, both models support the new languages I am fine-tuning on. Can you tell me the minimum amount of data...
@cwx-worst-one Thank you very much. I will try out the semantic speech tokenizer you suggested with 100k samples.
@cwx-worst-one Hi **I am replicating the model on VoiceAssistant-400K-SLAM-Omni with the following settings:** train_config: {'model_name': 's2s', 'enable_ddp': False, 'enable_deepspeed': False, 'enable_fsdp': False, 'low_cpu_fsdp': False, 'run_validation': True, 'batch_size_training': 1, 'batching_strategy': 'custom',...
@cwx-worst-one Thank you for your reply. **Currently, after changing code_layer=1, inference generates a result, but I am still facing the issue:** Audio token is too long, skip. You can...
@cwx-worst-one Thank you very much, let me try with a group size of 3. Is there an optimal way to decide the number of groups? Why 3, and how does it affect the results? With do_layershift=false, the...
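For anyone else puzzling over what the group size controls, here is a minimal sketch (my own illustration, not the actual SLAM-Omni code) of the usual idea behind grouping: packing g consecutive codec tokens into one decoding step shortens the sequence the decoder must generate by roughly a factor of g, so a larger group size trades decoding steps for more tokens predicted per step. The `pad_id` below is an assumed placeholder.

```python
# Hypothetical illustration of audio-token grouping (not the SLAM-Omni source):
# packing group_size consecutive codec tokens into one decoding step
# reduces the number of autoregressive steps by that factor.

def group_tokens(tokens, group_size, pad_id=0):
    """Pad the token list to a multiple of group_size, then pack into groups."""
    remainder = len(tokens) % group_size
    if remainder:
        tokens = tokens + [pad_id] * (group_size - remainder)
    return [tokens[i:i + group_size] for i in range(0, len(tokens), group_size)]

audio_tokens = list(range(10))          # 10 codec tokens
groups = group_tokens(audio_tokens, 3)  # 4 decoding steps instead of 10
```

With group size 3, a 10-token stream becomes 4 grouped steps (the last one padded), which is why larger groups speed up generation but can cost quality.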
@cwx-worst-one Thank you for your reply. **Could you please tell me what do_sample=false does, and what changes if it is set to true?** **Since you are using scheduler type LambdaLR for the learning rate, is it necessary,...
@cwx-worst-one Thank you. Can you tell me how to generate the semantic tokens with CosyVoice? Which part of the code generates the tokens from the target...
@cwx-worst-one Thank you very much, I will check. Can you tell me one more thing: where in the SLAM s2s code is historical text prompting initialized, and if I...