smartliuhw
How can I launch stage-2 training without using axolotl? Can I just comment out these lines in the [train_code](https://github.com/FasterDecoding/Medusa/blob/main/medusa/train/train_legacy.py) and save the whole model? https://github.com/FasterDecoding/Medusa/blob/5e980538695096e7e372c1e27a6bcf142bfeab11/medusa/train/train_legacy.py#L346-L348
I want to evaluate the three game tasks, but after configuring ``start_task.yaml`` it keeps reporting that the task does not exist. How can I resolve this? Error: start_task.yaml ```yaml definition: import: tasks/task_assembly.yaml start: # dbbench-std: 5 # os-std: 5 cg-std: 5 alfworld-std: 5 ltp-std: 5 ``` default.yaml ```yaml import: definition.yaml concurrency: task: # dbbench-std: 5...
### Feature request The paper mentions that the profile data will be made public; I would like to obtain it and study it. ### Motivation For academic research. ### Your contribution None at the moment.
I was training Gemma-2B and Gemma-7B using `SFTTrainer` with `packing=True` set. Gemma-2B's loss was quite normal, but Gemma-7B's was abnormally high. I'm not sure why this would happen, since...
### System Info trl-0.8.6 transformers-4.41.2 ### Information - [ ] The official example scripts - [X] My own modified scripts ### Tasks - [ ] An officially supported task in...