zerovl issues

Results: 4 issues of zerovl

Support a sampling strategy for multiple training datasets

Proposes the debiased sampling method from [the ZeroVL paper](https://arxiv.org/abs/2112.09331). When training on multiple datasets, debiased sampling improves the accuracy of the CLIP model. It includes a new flag: * --debias-sample,...
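The snippet above is truncated, so the exact behavior of the flag is not shown here. As a rough illustration only, the sketch below assumes that debiased sampling means drawing every batch from a single dataset, with the dataset chosen in proportion to its size. The function name `single_source_batches` and the dataset names are hypothetical and are not taken from the PR.

```python
import random


def single_source_batches(dataset_sizes, batch_size, num_batches, seed=0):
    """Yield (dataset_name, indices) pairs; each batch comes from one dataset.

    Assumption: a dataset is picked with probability proportional to its size,
    then indices are drawn without replacement from that dataset only.
    """
    rng = random.Random(seed)
    names = list(dataset_sizes)
    weights = [dataset_sizes[name] for name in names]
    for _ in range(num_batches):
        # Pick one source dataset for the whole batch.
        name = rng.choices(names, weights=weights, k=1)[0]
        # Never ask for more samples than the dataset holds.
        k = min(batch_size, dataset_sizes[name])
        indices = rng.sample(range(dataset_sizes[name]), k)
        yield name, indices


# Hypothetical dataset names and sizes, for illustration only.
sizes = {"dataset_a": 1000, "dataset_b": 300}
batches = list(single_source_batches(sizes, batch_size=8, num_batches=5))
for name, indices in batches:
    print(name, indices)
```

Because a batch never mixes sources, in-batch negatives for the contrastive loss always come from the same dataset as the positives.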

Support decoupled gradient accumulation

Proposes the decoupled gradient accumulation (DGA) method from [the ZeroVL paper](https://arxiv.org/abs/2112.09331). DGA enables training the CLIP model with a large batch size on a limited number of GPUs (e.g., a batch size of 16,384 with 8 V100 GPUs). It...
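To make the batch-size figures above concrete, the sketch below works through the arithmetic only; it does not implement DGA. The micro-batch size of 128 per GPU is an assumed value chosen for illustration, and `accumulation_steps` is a hypothetical helper, not part of the PR.

```python
def accumulation_steps(global_batch, num_gpus, micro_batch_per_gpu):
    """Return (samples per GPU, accumulation steps per GPU) for one update."""
    per_gpu, remainder = divmod(global_batch, num_gpus)
    if remainder:
        raise ValueError("global batch must divide evenly across GPUs")
    steps, remainder = divmod(per_gpu, micro_batch_per_gpu)
    if remainder:
        raise ValueError("per-GPU batch must divide evenly into micro-batches")
    return per_gpu, steps


# 16,384 samples over 8 GPUs; the micro-batch size of 128 is an assumption.
per_gpu, steps = accumulation_steps(16_384, 8, 128)
print(per_gpu, steps)  # 2048 16
```

Under these assumptions each GPU handles 2,048 samples per optimizer update, split into 16 micro-batches.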

Result of Yi-6B-Chat on the BBH dataset cannot be reproduced

### Reminder - [X] I have searched the GitHub Discussions and issues and have not found anything similar to this. ### Motivation We tested Yi-6B-Chat on BBH and achieved the...

Qwen2-VL-7B inference results do not match the results on the OpenCompass leaderboard

I tested MMStar with the following code: `export exp_name=./Qwen2-VL-7B-Instruct export model_name=Qwen2-VL-7B CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --master_port=25678 --nproc-per-node=8 run_rh2.py --data MMStar --model ${model_name} --model-path $exp_name --verbose` Local result: ![image](https://github.com/user-attachments/assets/c3dbed94-7aa3-429a-a98d-f9ab4d0b85d9) OpenCompass leaderboard result: ![image](https://github.com/user-attachments/assets/76e7ade0-c50b-455f-b694-92d6b64098bc) What could be causing the mismatch? In theory these two results should match, right? 🤔