xd2333

Results: 3 issues by xd2333

Hi unslothai, I get different inference results when using Unsloth. I've tested qwen1.5-chat and tinyllama-chat and hit the same issue: generation with Unsloth always produces worse results than with Transformers...

currently fixing

### Reminder
- [X] I have read the README and searched the existing issues.

### System Info
transformers>=4.43.0

### Reproduction
(This setup is for fine-tuning on a single GPU or in a low-VRAM environment. The symptom is that VRAM does not overflow at the start of fine-tuning, but gradually accumulates over training steps until OOM.)

After much trial and error, I found today that the following line must be added to the config file:

```yaml
torch_empty_cache_steps: 1 # any value that evenly divides global_step
```

The newly added transformers parameter `torch_empty_cache_steps` does not clear the cache automatically when left unset, so peak VRAM keeps accumulating during fine-tuning until OOM: https://github.com/huggingface/transformers/blob/68049b17a6bb4c9b0d499e9e77121effa2f5a6c0/src/transformers/training_args.py#L876...
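For reference, a minimal sketch of the periodic cache-clearing behavior that `torch_empty_cache_steps` enables. The function name and loop here are hypothetical simplifications, not the actual Trainer internals; the real logic lives in transformers' `Trainer` training loop.

```python
def should_empty_cache(step: int, torch_empty_cache_steps) -> bool:
    """Return True when the trainer should call torch.cuda.empty_cache().

    Mirrors the documented semantics: if the option is unset (None), the
    cache is never cleared; otherwise it is cleared every N global steps.
    """
    return torch_empty_cache_steps is not None and step % torch_empty_cache_steps == 0

# With the option unset, no step ever triggers a cache clear,
# which is how cached VRAM can accumulate until OOM.
assert not any(should_empty_cache(s, None) for s in range(1, 100))

# With torch_empty_cache_steps=4, steps 4 and 8 (of the first 10) would clear.
cleared = [s for s in range(1, 11) if should_empty_cache(s, 4)]
print(cleared)  # → [4, 8]
```

Clearing the cache every step (the `1` in the issue's YAML) trades a small amount of throughput for a lower peak-VRAM footprint.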

pending

Version: installed from the code repository.
Error:
```
Loading the LoRA adapter from ./checkpoint/Qwen2.5-1.5B-rm
Traceback (most recent call last):
  File "/home/c/miniconda3/envs/py310/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/c/miniconda3/envs/py310/lib/python3.10/runpy.py", line 86, in _run_code...
```