RuntimeError: mat1 and mat2 shapes cannot be multiplied (1000x768 and 4096x256)
Where exactly does the error occur? Could you share a detailed run log?
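This happens because the base LLM setting was not updated to match. The hidden dimension in use here is 768 (GPT-2's size; LLaMA's is 4096), so check the llm_dim parameter in the main script.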
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1000x768 and 4096x256).
How should I fix this?
0it [00:00, ?it/s]
Traceback (most recent call last):
File "/home/gf-shu/wsb/Time-LLM-main/run_main.py", line 260, in <module>
outputs = model(batch_x, batch_x_mark, dec_inp, batch_y_mark)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1852, in forward
loss = self.module(*inputs, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gf-shu/wsb/Time-LLM-main/models/TimeLLM.py", line 197, in forward
dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
File "/home/gf-shu/wsb/Time-LLM-main/models/TimeLLM.py", line 242, in forecast
enc_out = self.reprogramming_layer(enc_out, source_embeddings, source_embeddings)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gf-shu/wsb/Time-LLM-main/models/TimeLLM.py", line 287, in forward
source_embedding = self.key_projection(source_embedding).view(S, H, -1)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gf-shu/anaconda3/envs/time_llm/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1000x768 and 4096x1024)
Process finished with exit code 1
I am running into the same problem. In my case the shapes are 1000x768 and 4096x1024, and I do not know what to change.
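What is the exact error log? It looks like the base model's dimension does not match the configured one. Note that this setting differs between base models: per the README, it is 4096 for LLaMA and 768 for GPT-2.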
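To make the numbers in the error easier to read, here is a small sketch in plain Python (no PyTorch needed). It assumes the failing layer is an ordinary linear layer, as the traceback suggests (`linear.py`, `F.linear`), so the second shape in the message is the transposed weight, i.e. (in_features, out_features). The helper name `linear_shape_mismatch` is made up for illustration and is not part of Time-LLM or PyTorch.

```python
def linear_shape_mismatch(input_shape, weight_t_shape):
    """Return None if the matmul is valid, else a short diagnosis string.

    input_shape    -- shape of mat1, e.g. (1000, 768)
    weight_t_shape -- shape of mat2 as printed in the error, e.g. (4096, 256),
                      read as (in_features, out_features)
    """
    last_dim = input_shape[-1]
    in_features, _out_features = weight_t_shape
    if last_dim == in_features:
        return None
    return (
        f"input has {last_dim} features but the layer expects {in_features}; "
        f"the layer's input size (llm_dim in this thread) should be {last_dim}"
    )


# Shapes copied from the two errors reported in this thread.
first = linear_shape_mismatch((1000, 768), (4096, 256))
second = linear_shape_mismatch((1000, 768), (4096, 1024))
# What the shapes would look like once the input size matches.
fixed = linear_shape_mismatch((1000, 768), (768, 1024))

print(first)
print(second)
print(fixed)
```

In both reports the inner dimensions are 768 and 4096, which is consistent with a 768-dimensional base model being used while llm_dim is still set to 4096.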
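Yes, it is the same error as the one reported above. I changed the dimension as you suggested and training now runs normally. Thank you very much for your reply!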
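For anyone landing here later, a minimal pre-flight check along these lines could catch the mismatch before training starts. This is only a sketch: the table holds the two values mentioned in this thread (LLaMA 4096, GPT-2 768), and the names `HIDDEN_SIZE`, `check_llm_dim`, and the `llm_model` argument are assumptions for illustration, not code from the repository.

```python
# Hidden sizes as stated in this thread (taken from the README):
# LLaMA uses 4096 and GPT-2 uses 768. Other base models are not covered.
HIDDEN_SIZE = {"LLAMA": 4096, "GPT2": 768}


def check_llm_dim(llm_model, llm_dim):
    """Raise ValueError if llm_dim does not match the base model's hidden size."""
    key = llm_model.upper()
    if key not in HIDDEN_SIZE:
        raise KeyError(f"no hidden size recorded for base model {llm_model!r}")
    expected = HIDDEN_SIZE[key]
    if llm_dim != expected:
        raise ValueError(
            f"llm_dim={llm_dim} does not match {llm_model} "
            f"(expected {expected})"
        )
    return llm_dim


# A GPT-2 base model with llm_dim left at 4096 is rejected,
# which mirrors the situation reported above.
try:
    check_llm_dim("GPT2", 4096)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)

# The matching value passes through unchanged.
accepted = check_llm_dim("GPT2", 768)
```

Calling something like this right after argument parsing would turn the opaque matmul error into a direct message about which setting to change.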