deyituo comments

Results: 14 comments of deyituo

MelRoformer parameters from paper

@ZFTurbo how do you collect the songs?

example of tts(zero shot) on libritts?

Hmm... I wonder what the difference would be when using a much smaller dataset with fewer speakers?

what dataset you use for vocals_mdx23c model?

The dataset is essential.

feat: support qwen_2_5_omni fine-tuning

Can t2s be supported?

@qthequartermasterman The v0 engine cannot use some optimizations (enable_prefix_caching, enable_chunked_prefill, use_cudagraph, etc.) and is much slower than v1 with prompt tokens. When will this feature be available as a preview or a PR?

[Usage]: embed prompts

@qthequartermasterman will these modifications support these features? https://github.com/vllm-project/vllm/compare/main...qthequartermasterman:vllm:inputs_embeds_in_v1_outline

[Usage]: embed prompts

@qthequartermasterman is there a PR now?

Results of the two hift versions

With identical fine-tuning settings, the mel and f0 fine-tuning curves of cv1 and cv2 differ considerably.

![Image](https://github.com/user-attachments/assets/e84b2136-28f7-435c-9f7c-ac6417ca7e70)

```
# set random seed, so that you may reproduce your result.
__set_seed1: !apply:random.seed [1986]
__set_seed2: !apply:numpy.random.seed [1986]
__set_seed3: !apply:torch.manual_seed [1986]
__set_seed4: !apply:torch.cuda.manual_seed_all [1986]

# fixed params
sample_rate:...
```

deyituo

MelRoformer parameters from paper

example of tts(zero shot) on libritts?

bleeding from other track

what dataset you use for vocals_mdx23c model?

feat: support qwen_2_5_omni fine-tuning

Training and Inference Code

[Usage]: embed prompts

[Usage]: embed prompts

[Usage]: embed prompts

Results of the two hift versions