sunzhufeng12345
Also, if I change `"stage3_prefetch_bucket_size": "auto"` in stage3.json to `"stage3_prefetch_bucket_size": 15099494`, running fails with the following error:

```
[2024-08-26 10:00:37,155] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-26 10:00:37,222] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-26 10:00:37,235] [INFO] [real_accelerator.py:203:get_accelerator] Setting...
```
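For context on where 15099494 comes from: per the Hugging Face DeepSpeed integration docs, `"auto"` for `stage3_prefetch_bucket_size` is resolved at runtime to `0.9 * hidden_size * hidden_size`. Assuming GLM-4-9B's hidden size of 4096 (an assumption about the checkpoint config), that product is not an integer, which is worth keeping in mind when hard-coding the value. A small sketch of the arithmetic:

```python
# How HF's Trainer resolves "auto" for stage3_prefetch_bucket_size (per HF docs):
#   0.9 * hidden_size ** 2
hidden_size = 4096  # GLM-4-9B's hidden size (assumption about the checkpoint config)

auto_value = 0.9 * hidden_size ** 2
print(auto_value)       # ~15099494.4, a float rather than an int
print(int(auto_value))  # 15099494, the hard-coded value tried above
```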
> I also ran into this:

```
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1855, in forward
    loss = self.module(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in...
```
> The `GLM-4-9B` training code we currently provide requires `transformers==4.33.0`; newer transformers versions may cause errors. To support packing training, please replace the original model's `modeling_chatglm.py` with the [modeling_chatglm.py](https://github.com/THUDM/LongWriter/blob/main/train/patch/modeling_chatglm.py) file provided under `patch/`.

I have already switched to 4.33.0 and replaced modeling_chatglm.py, but I get the following error:

```
[rank7]: Traceback (most recent call last):
[rank7]:   File "/home/hnjj/diskdata/yuanshi/media/szf/llm/glm_longwrite/LongWriter/train/main.py", line 130, in <module>
[rank7]:     train()
[rank7]:   File "/home/hnjj/diskdata/yuanshi/media/szf/llm/glm_longwrite/LongWriter/train/main.py", line 110, in train
[rank7]:     model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path,
[rank7]: ...
```
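One quick way to check whether the patched file is actually the one in use is to ask Python where the model's class was imported from. A minimal sketch (the helper name is my own; demonstrated on a stdlib class since loading GLM-4-9B isn't reproduced here):

```python
import inspect
import json

def source_file_of(obj) -> str:
    """Return the path of the .py file that defines obj's class."""
    return inspect.getfile(type(obj))

# After model = AutoModelForCausalLM.from_pretrained(..., trust_remote_code=True),
# source_file_of(model) shows which modeling_chatglm.py was actually imported.
# Demonstrated here on a stdlib object:
print(source_file_of(json.JSONDecoder()))  # a path ending in json/decoder.py
```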
> It looks like the replacement didn't actually take effect. The [modeling_chatglm.py](https://github.com/THUDM/LongWriter/blob/main/train/patch/modeling_chatglm.py) we use for training does not contain this line: `File "/home/hnjj/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 416, in __init__ [rank7]: self.core_attention = CORE_ATTENTION_CLASSES[config._attn_implementation](config, self.layer_number)`; only the original HF code has it.

I tried it and that's indeed the case: even after replacing the original file, running the train script still uses the old modeling_chatglm.py.
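Since the traceback points at a copy under `~/.cache/huggingface/modules/transformers_modules/`, a likely explanation is that transformers caches remote code there on first load and keeps importing that stale copy even after the file in the model directory is replaced. A minimal sketch (the function and paths are my own, not from the repo) that installs the patch and removes the cached module so it gets re-copied on the next `from_pretrained(..., trust_remote_code=True)` call:

```python
import shutil
from pathlib import Path

# Default location where transformers caches remote code (trust_remote_code=True).
DEFAULT_CACHE = Path.home() / ".cache/huggingface/modules/transformers_modules"

def refresh_remote_code(model_dir: str, patch_file: str, repo_name: str,
                        cache_root: Path = DEFAULT_CACHE) -> None:
    """Copy the patched modeling file into the model dir and drop the stale cache."""
    shutil.copy(patch_file, Path(model_dir) / "modeling_chatglm.py")
    cached = cache_root / repo_name
    if cached.exists():
        shutil.rmtree(cached)  # transformers re-copies the patched file on next load

# Usage (hypothetical paths):
# refresh_remote_code("/path/to/glm-4-9b-chat",
#                     "LongWriter/train/patch/modeling_chatglm.py",
#                     "glm-4-9b-chat")
```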