Rui Lu
Thank you for reaching out with your query. 1. AgentInstruct is a curated dataset used to **fine-tune** the model; if you need to evaluate the model, you can refer to...
Your output suggests a mismatch in the evaluation setup you used. Please make sure you are using the evaluation code from `./AgentBench.old`, as mentioned in the README, not...
As mentioned in https://github.com/THUDM/AgentTuning#held-in-tasks:

> The 6 held-in tasks are selected from [AgentBench](https://github.com/THUDM/AgentBench). However, since AgentBench is still under active development, the results from the latest branch might not fully...