
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not lim...

Results 4 MooER issues

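As the title says. Data: the training data is aishell. Model: the LLM is Qwen2.5 1.5B and the encoder is paraformer. Training uses 2 GPUs. The model can only be trained for one epoch; running the second epoch fails with the following error: ![Image](https://github.com/user-attachments/assets/081afe81-e975-44fc-9e14-85279e7d14f8) When the output shows "2025-06-09 16:34:19 | INFO | mooer.utils.checpoint_io | checpoint_io.py:10 | Rank 1--> saving model ...", training stalls for a long time with GPU utilization at 100%, then an error is raised and the process exits.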

With w2v_bert2.0 as the encoder, training fails with an array index out-of-bounds error after 300 steps. Is this caused by the length of the training audio? ``` 【2025-07-31 17:12:53】[2025-07-31 17:12:53,090] [INFO] [logging.py:107:log_dist] [Rank 0] step=100, skipped=0, lr=[6.666666666666668e-05], mom=[(0.9, 0.999)] 【2025-07-31 17:12:53】[2025-07-31 17:12:53,092] [INFO] [timer.py:264:stop] epoch=0/micro_step=200/global_step=100, RunningAvgSamplesPerSec=59.76838046856318, CurrSamplesPerSec=58.883498797084, MemAllocated=16.48GB, MaxMemAllocated=21.82GB 【2025-07-31 17:12:53】INFO:root:Training Epoch: 1/2, step...

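Hello, I would like to confirm something with the author. The second training stage adds a TTS task on top of the first stage, where the input is the reply text and the output is the encodec of the corresponding audio. When training the second stage, does the part shown in the figure below also need to be included? I have the same question about the third stage: is the input only the audio adapter, and the output the encodec of the reply audio? Does the second-stage part still need to be included?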

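After replacing prompt_wav in mooer-omni, the voice timbre fails to switch. What could be causing this? Which open-source models are the speaker encoder and vocoder models based on, and can users retrain them?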