Support `BatchFeature` in `LengthGroupedSampler` for Multimodal compatibility
Feature request
I am currently fine-tuning a multimodal model (Qwen2.5-VL) using the official Trainer with `group_by_length` enabled. Training fails during the dataset length inference step in `LengthGroupedSampler`, because the code strictly checks for `dict` or `BatchEncoding`, while multimodal processors often return `BatchFeature`.
Specifically, the following checks raise a ValueError:
https://github.com/huggingface/transformers/blob/471d7ce9abbb3bc1b3bab673367378f9dbc3caac/src/transformers/trainer_pt_utils.py#L471
https://github.com/huggingface/transformers/blob/471d7ce9abbb3bc1b3bab673367378f9dbc3caac/src/transformers/trainer_pt_utils.py#L531
Motivation
As above.
Your contribution
I can open a PR that simply adds `BatchFeature` to the type check.
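A minimal sketch of the proposed fix. The classes and function here are illustrative stand-ins (not the actual `trainer_pt_utils.py` code): in transformers, both `BatchEncoding` and `BatchFeature` are `UserDict` subclasses, so they are not caught by `isinstance(x, dict)` and must be listed explicitly.

```python
from collections import UserDict


class BatchEncoding(UserDict):
    """Stand-in for transformers.BatchEncoding (a UserDict subclass)."""


class BatchFeature(UserDict):
    """Stand-in for transformers.BatchFeature (a UserDict subclass)."""


def infer_lengths(dataset, model_input_name="input_ids"):
    """Illustrative version of LengthGroupedSampler's length inference.

    Proposed change: accept BatchFeature alongside dict and BatchEncoding,
    since multimodal processors (e.g. Qwen2.5-VL's) return BatchFeature.
    """
    if not isinstance(dataset[0], (dict, BatchEncoding, BatchFeature)):
        raise ValueError(
            "Can only automatically infer lengths for datasets whose "
            "examples are dict, BatchEncoding, or BatchFeature."
        )
    return [len(example[model_input_name]) for example in dataset]


# With the extra type in the check, a dataset of BatchFeature examples
# no longer raises:
dataset = [
    BatchFeature({"input_ids": [1, 2, 3]}),
    BatchFeature({"input_ids": [1, 2]}),
]
print(infer_lengths(dataset))  # [3, 2]
```

Without `BatchFeature` in the tuple, the same call would hit the ValueError, which is exactly the failure described above.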