Support `BatchFeature` in `LengthGroupedSampler` for Multimodal compatibility
Feature request
I am currently fine-tuning a multimodal model (Qwen2.5-VL) using the official Trainer with `group_by_length` enabled. Training fails during the dataset length inference step in `LengthGroupedSampler`, because the code strictly checks for `dict` or `BatchEncoding`, while multimodal processors often return `BatchFeature`.
Specifically, the following checks raise a ValueError:
https://github.com/huggingface/transformers/blob/471d7ce9abbb3bc1b3bab673367378f9dbc3caac/src/transformers/trainer_pt_utils.py#L471
https://github.com/huggingface/transformers/blob/471d7ce9abbb3bc1b3bab673367378f9dbc3caac/src/transformers/trainer_pt_utils.py#L531
Motivation
As above.
Your contribution
I can open a PR that simply adds `BatchFeature` to the type check.
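A minimal sketch of the proposed fix. The classes and function here are illustrative stand-ins (not the actual `trainer_pt_utils.py` code): in transformers, both `BatchEncoding` and `BatchFeature` are `UserDict` subclasses, so they are not caught by `isinstance(x, dict)` and must be listed explicitly.

```python
from collections import UserDict


class BatchEncoding(UserDict):
    """Stand-in for transformers.BatchEncoding (a UserDict subclass)."""


class BatchFeature(UserDict):
    """Stand-in for transformers.BatchFeature (a UserDict subclass)."""


def infer_lengths(dataset, model_input_name="input_ids"):
    """Illustrative version of LengthGroupedSampler's length inference.

    Proposed change: accept BatchFeature alongside dict and BatchEncoding,
    since multimodal processors (e.g. Qwen2.5-VL's) return BatchFeature.
    """
    if not isinstance(dataset[0], (dict, BatchEncoding, BatchFeature)):
        raise ValueError(
            "Can only automatically infer lengths for datasets whose "
            "examples are dict, BatchEncoding, or BatchFeature."
        )
    return [len(example[model_input_name]) for example in dataset]


# With the extra type in the check, a dataset of BatchFeature examples
# no longer raises:
dataset = [
    BatchFeature({"input_ids": [1, 2, 3]}),
    BatchFeature({"input_ids": [1, 2]}),
]
print(infer_lengths(dataset))  # [3, 2]
```

Without `BatchFeature` in the tuple, the same call would hit the ValueError, which is exactly the failure described above.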