Rui Lu comments

Results: 3 comments by Rui Lu

Can I run AgentInstruct data on the AgentBench?

Thank you for reaching out with your query. 1. AgentInstruct is a curated dataset used to **fine-tune** the model; if you need to evaluate the model, you can refer to...

AgentTuning 7B evaluation on HH does not match the paper's results

Your output suggests a mismatch in your evaluation setup. Please ensure that you're using the evaluation code from `./AgentBench.old` as mentioned in the README, not...

AgentTuning 7B evaluation on HH does not match the paper's results

As mentioned in https://github.com/THUDM/AgentTuning#held-in-tasks:
> The 6 held-in tasks are selected from [AgentBench](https://github.com/THUDM/AgentBench). However, since AgentBench is still under active development, the results from the latest branch might not fully...