Tommaso Cerruti comments


Results: 2 comments by Tommaso Cerruti

Plans to add new long-context benchmark like LongBench v2, Babilong, InfiniteBench datasets?

Hi @baberabb @jannalulu, I’d like to help by adding InfiniteBench to the evaluation tasks. I see it’s mentioned in this issue and partially covered in #3256, while BabiLong and LongBench...

Pass down ToolCall.id to the tool function

I can take this. I would go for approach 1 (pass the LLM tool call id directly) as an optional tool_call_id kwarg, injected only if the tool’s signature accepts it....
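
A minimal sketch of what approach 1 could look like, assuming the dispatcher has the tool function, the parsed arguments, and the id of the LLM tool call at hand. The names `accepts_kwarg` and `call_tool` are hypothetical and not part of the project's API; only the `tool_call_id` kwarg name comes from the comment above. Treating a `**kwargs` catch-all as "accepting" the id is an assumption of this sketch.

```python
import inspect
from typing import Any, Callable, Dict


def accepts_kwarg(func: Callable[..., Any], name: str) -> bool:
    """Return True if `func` can receive `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # Assumption: a **kwargs catch-all also counts as accepting the id.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_tool(func: Callable[..., Any], arguments: Dict[str, Any], tool_call_id: str) -> Any:
    """Call `func` with `arguments`, injecting the id only if the signature allows it."""
    kwargs = dict(arguments)  # copy so the caller's dict is not mutated
    if accepts_kwarg(func, "tool_call_id"):
        kwargs["tool_call_id"] = tool_call_id
    return func(**kwargs)


# Existing tools keep working unchanged; tools that opt in receive the id.
def add(a: int, b: int) -> int:
    return a + b


def tagged(a: int, tool_call_id: str = "") -> tuple:
    return (a, tool_call_id)
```

Because the id is injected only when the signature opts in, tools written before the change would not need to be modified.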