Describe the bug
When running inference I pass --max_length 4096, but it does not seem to have any effect: the run fails with an error saying the length exceeds 4096. As I understand it, rows longer than 4096 tokens should simply be dropped.
```
Traceback (most recent call last):
  File "/mnt/sdc/wzy/ms-swift/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 243, in infer_main
    return SwiftInfer(args).main()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/base.py", line 47, in main
    result = self.run()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 85, in run
    result = self.infer_dataset()
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer.py", line 226, in infer_dataset
    resp_list = self.infer(
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/pt_engine.py", line 542, in infer
    res += self._infer(
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/pt_engine.py", line 468, in _infer
    batched_inputs, error_list = self._batch_encode(
  File "/mnt/sdc/wzy/ms-swift/swift/llm/infer/infer_engine/infer_engine.py", line 287, in _batch_encode
    batched_inputs.append(future.result())
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/wzy/anaconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/sdc/wzy/ms-swift/swift/llm/template/base.py", line 414, in encode
    raise MaxLengthError(f'Current length of row({length}) is larger'
swift.llm.template.base.MaxLengthError: Current length of row(4810) is larger than the max_length(4096).
```
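
For now I work around it by filtering the dataset myself before calling swift infer. A minimal sketch, assuming the swift-style messages jsonl format; the output filename is a placeholder, and plain-text tokenization only approximates the final length, since the multimodal template also inserts image tokens:

```python
import json

from transformers import AutoTokenizer

MAX_LENGTH = 4096

# Assumes string "content" fields; image tokens added by the
# Qwen2.5-VL template are NOT counted here, so the real encoded
# length can still come out somewhat larger than this estimate.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

with open("our_dataset.jsonl") as fin, open("our_dataset.filtered.jsonl", "w") as fout:
    kept = dropped = 0
    for line in fin:
        row = json.loads(line)
        text = "".join(m["content"] for m in row["messages"])
        if len(tokenizer(text).input_ids) <= MAX_LENGTH:
            fout.write(line)
            kept += 1
        else:
            dropped += 1
print(f"kept {kept}, dropped {dropped}")
```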
Your hardware and system info
```
CUDA_VISIBLE_DEVICES=2,3 swift infer --model Qwen/Qwen2.5-VL-7B-Instruct --val_dataset our_dataset.jsonl --max_length 4096
```
GPU: RTX 4090
Additional context
I ran into this as well: swift.llm.template.base.MaxLengthError: Current length of row(103074) is larger than the max_length(40960). Qwen3-4B does not hit it, but the 8B model does. Does that mean the output length exceeds the limit and is not being truncated?
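
Judging from the traceback above, the error is raised in Template.encode during _batch_encode, i.e. before any generation, so the length being checked should be the encoded input rather than the output. One way to verify that for a single row is to encode its prompt directly. A rough sketch, assuming a text-only row whose messages are stored in sample.json (a placeholder path):

```python
import json

from transformers import AutoTokenizer

# Placeholder path/model id: point these at the row and model that trigger the error.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
with open("sample.json") as f:
    messages = json.load(f)["messages"]

# Length of the formatted prompt before any generation happens.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(f"prompt tokens: {len(input_ids)}")  # > 40960 means the input alone is over the limit
```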