AgentBench
Any plans to add new models?
Hi there,
Thank you for the great contributions!
Many new models have been released since the benchmark was published. Do you have any plans to include some of them, such as GPT-4o, Claude-3.5, Llama-3.1 405B, Mistral Large 2, DeepSeek V2, and others? Results for these models would be very valuable to the community!
Thanks
It seems they have added some new models here: https://fm.ai.tsinghua.edu.cn/superbench/#/leaderboard