Unable to download the dataset from Hugging Face
I tried to download the dataset with the Hugging Face datasets library:
from datasets import load_dataset
dataset = load_dataset("liminghao1630/API-Bank")
but it raised the following error:
An error occurred while generating the dataset
ValueError: Couldn't cast
input: string
file: string
id: int64
expected_output: string
instruction: string
to
{'input': Value(dtype='string', id=None), 'instruction': Value(dtype='string', id=None), 'output': Value(dtype='string', id=None)}
because column names don't match
I also tried downloading the JSON files directly to my local machine, but when I read them, the input field looks like raw JSON-escaped text rather than a plain text string.
For example, this is the first instruction:
"\nGenerate an API request in the format of [ApiName(key1='value1', key2='value2', ...)] based on the previous dialogue context.\nThe current time is 2039-03-09 18:56:09 Wednesday.\nInput: \nUser: User's utterence\nAI: AI's response\n\nExpected output:\nAPI-Request: [ApiName(key1='value1', key2='value2', ...)]\n\nAPI descriptions:\n"
How do I load the dataset properly?
Thank you for your help.
@XuanRen4470
Yes, please download the files directly and load them with json.
You can refer to evaluator_by_json.py for the loading code.
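For anyone landing here, a minimal sketch of the direct-JSON route, using only the standard library. This is not the code in evaluator_by_json.py: load_records and target_of are hypothetical helper names, and the assumption that each downloaded file is either a single JSON array or JSON Lines is mine. The two target field names (expected_output and output) are taken from the two schemas shown in the traceback.

```python
import json
from pathlib import Path


def load_records(path):
    """Load records from a file holding one JSON array or JSON Lines (assumed layouts)."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # A single top-level object is wrapped so callers always get a list.
    return data if isinstance(data, list) else [data]


def target_of(record):
    """Return the reference answer, whichever of the two field names the file uses."""
    return record.get("expected_output", record.get("output"))
```

Call load_records on the path of a file you downloaded, then read record["instruction"], record["input"], and target_of(record) from each entry. Once parsed, escape sequences such as \n in the raw file become real newlines, so every field is an ordinary Python string.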
Hi Minghao, thanks for the great work! Could you please share some results and examples showing how to evaluate existing models on your benchmark, for example how to obtain the results for GPT-3.5? That would make it easy for others to perform an apples-to-apples comparison on the benchmark.