
OpenCompass supports many datasets, but evalscope with the OpenCompass backend does not support some of them

Open weibingo opened this issue 1 year ago • 2 comments

Issue Description

Please briefly describe the issue you encountered.

Tools Used

  • [ ] Native framework
  • [x] OpenCompass backend
  • [ ] VLMEvalKit backend
  • [ ] RAGEval backend
  • [ ] Perf (model inference stress-testing tool)
  • [ ] Arena mode

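OpenCompass officially supports some long-context datasets, such as LEval and LongBench, but evalscope does not. If I download these datasets and put them in the data directory, can I use them directly? OpenCompass already supports them, so did you implement a separate adaptation layer?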

Additional Information

If there is any other relevant information, please provide it here.

weibingo avatar Dec 30 '24 11:12 weibingo

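The reason is that the early OC (OpenCompass) integration went entirely through the OpenAI API interface. That is why evaluation currently works with most inference frameworks, such as vLLM and LMDeploy: you launch an inference service and then evaluate against it. At the time of that integration, only a limited set of benchmarks was supported, and results still had to be verified against the local-inference path to confirm that they align. We plan to try integrating the full set of datasets on top of the current, refactored version of OC, whose OpenAI API support is now fairly mature.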

wangxingjun778 avatar Dec 30 '24 12:12 wangxingjun778
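For readers unfamiliar with the setup described in the reply above, here is a minimal sketch of a task configuration that routes evaluation through an OpenAI-compatible endpoint. The key names (`eval_backend`, `eval_config`, `openai_api_base`, and so on) follow the evalscope documentation as best recalled and should be checked against the installed version; the model name, URL, dataset list, and output directory are placeholders, not values from this thread.

```python
# Sketch of an evalscope task config for the OpenCompass backend.
# Assumption: key names match the installed evalscope version.
task_cfg = {
    "eval_backend": "OpenCompass",
    "eval_config": {
        # Placeholder dataset names; use the names that
        # OpenCompassBackendManager.list_datasets() reports.
        "datasets": ["gsm8k", "ARC_c"],
        "models": [
            {
                # Placeholder model name served by vLLM, LMDeploy, etc.
                "path": "my-served-model",
                # Placeholder URL of the OpenAI-compatible chat endpoint.
                "openai_api_base": "http://127.0.0.1:8000/v1/chat/completions",
                "is_chat": True,
                "batch_size": 16,
            }
        ],
        "work_dir": "outputs/oc_backend_eval",  # placeholder output directory
        "limit": 10,  # evaluate only a few samples per dataset while testing
    },
}

# With evalscope installed, the task would typically be launched like this:
# from evalscope.run import run_task
# run_task(task_cfg=task_cfg)
```

The inference service has to be running before the task starts, because the backend only sends requests to the endpoint and does not load the model itself.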

How can these datasets be supported for now? I added some adaptation code, but it still does not work. In eval_dataset.py I added:

from opencompass.configs.datasets.longbench.longbench import longbench_datasets
from opencompass.configs.datasets.leval.leval import leval_datasets

OpenCompassBackendManager.list_datasets() does list longbench, but the run fails with the following error:

  File "/usr/local/python3/lib/python3.10/site-packages/opencompass/datasets/longbench/longbench_hotpot_qa.py", line 17, in load
    dataset = load_dataset(**kwargs)
  File "/usr/local/python3/lib/python3.10/site-packages/datasets/load.py", line 2074, in load_dataset
    builder_instance = load_dataset_builder(
  File "/usr/local/python3/lib/python3.10/site-packages/datasets/load.py", line 1795, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/usr/local/python3/lib/python3.10/site-packages/datasets/load.py", line 1671, in dataset_module_factory
    raise e1 from None
  File "/usr/local/python3/lib/python3.10/site-packages/datasets/load.py", line 1591, in dataset_module_factory
    raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({e.__class__.__name__})") from e
ConnectionError: Couldn't reach 'THUDM/LongBench' on the Hub (ConnectionError)

weibingo avatar Jan 07 '25 09:01 weibingo
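The traceback above ends in a ConnectionError raised while `load_dataset` tries to fetch 'THUDM/LongBench' from the Hugging Face Hub. This suggests the dataset config was registered but the data could not be downloaded. As a hedged diagnostic sketch, the helper below checks whether the Hub is reachable from the evaluation machine. The function name `hub_reachable` is hypothetical, and it assumes the default endpoint `https://huggingface.co` unless the `HF_ENDPOINT` environment variable is set.

```python
import os
import urllib.error
import urllib.request


def hub_reachable(base_url=None, timeout=5.0):
    """Return True if an HTTP server answers at base_url, else False.

    Hypothetical helper for diagnosis only; it is not part of evalscope,
    OpenCompass, or the datasets library.
    """
    if base_url is None:
        # Assumption: fall back to the public Hub when HF_ENDPOINT is unset.
        base_url = os.environ.get("HF_ENDPOINT", "https://huggingface.co")
    try:
        request = urllib.request.Request(base_url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so the network path works.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return False
```

Calling `hub_reachable()` on the machine that runs the evaluation gives a quick yes or no. A False result would be consistent with the error above and points at network access rather than at the added imports.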