Guochen Yan comments

Results: 6 comments by Guochen Yan

Question about reproducing test accuracy on citation datasets

Thanks for your reply. Yes, I have checked the issues, but I only found @EdisonLeeeee's experimental results, which (I guess) are based on incorrect code, according to the following comments....

[Bug] on agieval_gen_617738, pyarrow.lib.ArrowInvalid: Could not convert '$5$;$10$' with type str: tried to convert to int64

I **did encounter** the same error when using agieval_mixed. Sorry for the misleading comment, I forgot to remove it.

[Bug] Missing game24 and other datasets

Lots of datasets are missing. I ran into the same problem with 1) FinanceIQ, 2) LawBench, 3) MedBench/DrugCA, and 4) AGIEval/data/v1/. Could you please fix it?

[Bug] Medbench dataset only provides test data, not the entire dataset

Same problem. Have you solved it?

[Bug] Cannot run multi-GPU evaluation

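Same problem. Setting only num_gpus=2 does not make it use two GPUs. Conversely, if I always enable --debug mode, the load is automatically split evenly across the two cards.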

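> > Is anyone still following up on this issue?
> > > Same problem. Setting only num_gpus=2 does not make it use two GPUs. Conversely, if I always enable --debug mode, the load is automatically split evenly across the two cards.
> > hello, I solved this on Ascend cards. If you are running with vllm, multi-card runs work. If you are in hf mode, remember to go to the two hf model.py files under models/. They contain a line, if is_npu_available(), that sets device='npu', and this is what prevents multi-card runs from working. The fix is to comment out that line and add from torch_npu.contrib import transfer_to_npu at the very top.

Thanks. I am also using NPUs, but I have not tried this on an NPU yet. Currently I am running the evaluation on two 3090s (without vllm), since something like mbpp OOMs on a single card. Setting num_gpus=2 has no effect: the model still loads onto the first card until it OOMs. But if I turn on debug mode, it seems to run in parallel across both cards.

BTW, could you share your contact info? I also have some questions and experience with NPUs, and I hope we can exchange notes. My email is [email protected]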