SenseVoice fine-tuning: error when loading data
Notice: To help us resolve issues more efficiently, please follow the template when raising an issue and fill in the details.
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
Code
What have you tried?
What's your environment?
- OS (e.g., Linux):
- FunASR Version (e.g., 1.0.0):
- ModelScope Version (e.g., 1.11.0):
- PyTorch Version (e.g., 2.0.0):
- How you installed funasr (pip, source):
- Python version:
- GPU (e.g., V100M32)
- CUDA/cuDNN version (e.g., cuda11.7):
- Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
- Any other relevant information:
Have you solved this?
Please tell me how to solve this; I ran into the same problem. @LauraGPT
Hi, have you solved this?
Has anyone managed to solve this problem?
Mark. I ran into the same problem.
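Both the data I generated with sensevoice2jsonl and the mock data I built myself triggered the "data is empty" error. After checking the configuration, I found I had overlooked the dataset_conf.batch_type="token" and dataset_conf.batch_size settings under data_conf in finetune.sh. I had wrongly assumed that batches were formed by sample count. Increasing batch_size fixed the problem.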
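To illustrate why a small batch_size can leave no data when batch_type="token", here is a minimal sketch. It is a simplified model and not FunASR's actual sampler: `make_token_batches` is a hypothetical helper, the lengths are made-up values, and the padded-cost rule (longest sample times number of samples) is an assumption about how a token budget is typically enforced.

```python
from typing import List


def make_token_batches(lengths: List[int], batch_size: int) -> List[List[int]]:
    """Group sample indices so each batch's padded size stays within batch_size tokens.

    Hypothetical helper for illustration only; not FunASR's sampler.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    max_len = 0
    for idx, length in enumerate(lengths):
        if length > batch_size:
            # A single sample larger than the whole budget can never fit; skip it.
            continue
        new_max = max(max_len, length)
        # Padded cost: every sample in a batch is padded to the longest one.
        if current and new_max * (len(current) + 1) > batch_size:
            batches.append(current)
            current = []
            new_max = length
        current.append(idx)
        max_len = new_max
    if current:
        batches.append(current)
    return batches


# Made-up per-sample lengths, in the same unit as the token budget.
lengths = [120, 300, 250, 90]

# batch_size misread as "4 samples": every sample exceeds the budget, nothing is left.
print(make_token_batches(lengths, batch_size=4))

# batch_size large enough to hold the samples as a token budget.
print(make_token_batches(lengths, batch_size=1000))
```

In this sketch the first call returns an empty list, which mirrors the "data is empty" symptom, while the second call yields usable batches once the budget exceeds the longest sample.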
That worked!