Describe the bug
When running inference I pass --max_length 4096, but it does not seem to have any effect: the run fails with an error saying the length exceeds 4096. As I understand it, rows longer than 4096 tokens should simply be dropped.
```
Traceback (most recent call last):
  File "/mnt/sdc/wzy/ms-swift/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 243, in infer_main
    return SwiftInfer(args).main()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/base.py", line 47, in main
    result = self.run()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 85, in run
    result = self.infer_dataset()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 226, in infer_dataset
    resp_list = self.infer(
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/pt_engine.py", line 542, in infer
    res += self._infer(
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/pt_engine.py", line 468, in _infer
    batched_inputs, error_list = self._batch_encode(
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/infer_engine.py", line 287, in _batch_encode
    batched_inputs.append(future.result())
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/sdc/wzy/ms-swift/swift/llm/template/base.py", line 414, in encode
    raise MaxLengthError(f'Current length of row({length}) is larger'
swift.llm.template.base.MaxLengthError: Current length of row(4810) is larger than the max_length(4096).
```
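
For now I work around it by filtering the dataset myself before calling swift infer. A minimal sketch, assuming the swift-style messages jsonl format; the output filename is a placeholder, and plain-text tokenization only approximates the final length, since the multimodal template also inserts image tokens:

```python
import json

from transformers import AutoTokenizer

MAX_LENGTH = 4096

# Assumes string "content" fields; image tokens added by the
# Qwen2.5-VL template are NOT counted here, so the real encoded
# length can still come out somewhat larger than this estimate.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

with open("our_dataset.jsonl") as fin, open("our_dataset.filtered.jsonl", "w") as fout:
    kept = dropped = 0
    for line in fin:
        row = json.loads(line)
        text = "".join(m["content"] for m in row["messages"])
        if len(tokenizer(text).input_ids) <= MAX_LENGTH:
            fout.write(line)
            kept += 1
        else:
            dropped += 1
print(f"kept {kept}, dropped {dropped}")
```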
Your hardware and system info
```
CUDA_VISIBLE_DEVICES=2,3 swift infer --model Qwen/Qwen2.5-VL-7B-Instruct --val_dataset our_dataset.jsonl --max_length 4096
```
GPU: RTX 4090
Additional context
I ran into this as well: swift.llm.template.base.MaxLengthError: Current length of row(103074) is larger than the max_length(40960). Qwen3-4B does not hit it, but the 8B model does. Does that mean the output length exceeds the limit and is not being truncated?
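
Judging from the traceback above, the error is raised in Template.encode during _batch_encode, i.e. before any generation, so the length being checked should be the encoded input rather than the output. One way to verify that for a single row is to encode its prompt directly. A rough sketch, assuming a text-only row whose messages are stored in sample.json (a placeholder path):

```python
import json

from transformers import AutoTokenizer

# Placeholder path/model id: point these at the row and model that trigger the error.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
with open("sample.json") as f:
    messages = json.load(f)["messages"]

# Length of the formatted prompt before any generation happens.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(f"prompt tokens: {len(input_ids)}")  # > 40960 means the input alone is over the limit
```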