hjc3613
Trained on 30,000,000 parallel English and Chinese sentences; after 100,000 training steps, the prediction result:  content of config.yaml: # wmt14_en_de.yaml save_data: data/wmt/run/example # Corpus opts: data: corpus_1: path_src: corpus/corpus_1/en-zh.en.bpe...
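For context, here is a minimal sketch of what such an OpenNMT-py data configuration might look like when written out from Python; apart from `save_data`, `path_src`, and the 100,000-step count taken from the report above, the corpus paths, vocabulary files, and step settings are assumptions, not values from the original issue:

```python
# Hypothetical reconstruction of a minimal OpenNMT-py YAML config for an
# English-Chinese corpus; all paths marked "assumed" are placeholders.
import yaml

config = {
    "save_data": "data/wmt/run/example",                # from the report
    "src_vocab": "data/wmt/run/example.vocab.src",      # assumed location
    "tgt_vocab": "data/wmt/run/example.vocab.tgt",      # assumed location
    "data": {
        "corpus_1": {
            "path_src": "corpus/corpus_1/en-zh.en.bpe",  # from the report
            "path_tgt": "corpus/corpus_1/en-zh.zh.bpe",  # assumed counterpart
        },
        "valid": {
            "path_src": "corpus/valid/valid.en.bpe",     # assumed
            "path_tgt": "corpus/valid/valid.zh.bpe",     # assumed
        },
    },
    "save_model": "data/wmt/run/model",   # assumed
    "train_steps": 100000,                # matches the 100,000 steps mentioned
    "valid_steps": 5000,                  # assumed
}

with open("config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```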
Hello, could you provide a download link for the NYT10 dataset in JSON format? The files downloaded from the official site are all .pb (protobuf) files.
After fine-tuning, neither an adapter_config.json nor a config.json file was generated, so merging fails with the error: Can't find adapter_config.json - [√] **Base model**: Alpaca-Plus - [√] **Operating system**: Linux - [√] **Issue category**: model conversion and merging / model training and fine-tuning / - [√] (Required) Since the related dependencies are updated frequently, please make sure you have followed the steps in the [Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki) - [√] (Required) I have read the [FAQ section](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题) and searched the issues without finding a similar problem or solution - [...
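For reference, adapter_config.json is normally written when the PEFT-wrapped model itself is saved; below is a minimal sketch assuming a LoRA setup with Hugging Face peft (the model path, target modules, and output directory are illustrative placeholders, not taken from the original script):

```python
# Minimal sketch: saving a LoRA adapter with Hugging Face peft normally
# produces adapter_config.json alongside the adapter weights.
# Model path, target modules, and output path are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("path/to/base_model")
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed for a LLaMA-style model
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)

# ... training would happen here ...

# save_pretrained on the PEFT-wrapped model writes adapter_config.json and
# the adapter weights; the merge step looks for exactly that file.
model.save_pretrained("output/sft_lora_model")
```

If the training script only saves the base model's state dict, that file never gets written, which matches the merge error reported above.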
### Detailed problem description Running run_sft.sh raises an error:  Note: I am using DeepSpeed stage 3, with the following configuration:  The content of run_sft.sh is:  The only difference from the original run_sft.sh is that I removed peft_path so that the model is re-initialized; everything else is essentially unchanged. ### Required checklist (of the first three items, keep only the one you are asking about) - [√] **Base model**: Alpaca-Plus - [√] **Operating system**: Linux - [√] **Issue category**: model training and fine-tuning - [√]...
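For comparison, here is a minimal sketch of a ZeRO stage-3 DeepSpeed configuration of the kind typically passed to the Hugging Face Trainer via `--deepspeed`; the batch sizes and offload settings are assumptions, not the values from the original run:

```python
# Hypothetical ZeRO stage-3 configuration; write it to ds_zero3.json and pass
# it to the launcher. Batch sizes and offload choices are assumptions.
import json

ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},   # assumed: CPU offload
        "offload_param": {"device": "cpu"},       # assumed: CPU offload
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "gradient_accumulation_steps": 8,      # assumed
    "train_micro_batch_size_per_gpu": 1,   # assumed
    "gradient_clipping": 1.0,
}

with open("ds_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```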
### System Info Dockerfile:  ### Information - [ ] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug When freezing the top...
**Training a 70B model costs too much GPU memory** When training a 70B+ model, the GPU memory cost is very large. As mentioned in the DeepSpeed documentation (deepspeed-readthedocs-io-en-stable.pdf), the total memory was about...
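As a rough back-of-the-envelope check, the ZeRO paper's accounting for mixed-precision Adam puts the model states at about 16 bytes per parameter (fp16 weights, fp16 gradients, and fp32 optimizer states), which ZeRO-3 partitions across GPUs. The sketch below works that arithmetic out for a 70B-parameter model; the GPU counts are assumptions, and activations, buffers, and fragmentation are not included:

```python
# Rough ZeRO-3 model-state memory estimate for a 70B-parameter model,
# using the ZeRO paper's 2 + 2 + 12 = 16 bytes/parameter breakdown
# (fp16 weights + fp16 grads + fp32 Adam states). Activations, buffers,
# and fragmentation are excluded; the GPU counts below are assumptions.
params = 70e9
bytes_per_param = 2 + 2 + 12
total_model_states_gb = params * bytes_per_param / 1024**3

for num_gpus in (8, 16, 64):
    per_gpu_gb = total_model_states_gb / num_gpus   # ZeRO-3 shards all three states
    print(f"{num_gpus:>3} GPUs: ~{per_gpu_gb:,.0f} GB of model states per GPU")

# On 8 GPUs this is roughly 130 GB per GPU before activations, which is why
# CPU/NVMe offload or a larger cluster is typically needed at the 70B scale.
```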
Could you please provide details about the implementation of the TRICE algorithm? The code in https://github.com/google-research/cascades/blob/main/cascades/examples/notebooks/trice.ipynb has a few unimplemented functions, and I want to refer back to the original implementation. Thanks...
## Is your feature request related to a problem? Please describe. (A clear and concise description of what the problem is.) When I generate 20 s of audio, the time cost is...
Notice: In order to resolve issues more efficiently, please raise the issue following the template. (Note: to resolve your issue more efficiently, please ask according to the template and add details.) ## ❓ Questions and Help ### Before asking: 1. search the issues. 2. search the...