
Function calling reflection error

Open inspire-boy opened this issue 6 months ago • 2 comments

What happened?

Path: venv\Lib\site-packages\autogen_agentchat\agents\_assistant_agent.py. A RuntimeError("Reflect on tool use produced no valid text response.") is raised when reflection_result.content is a list of FunctionCall objects instead of a string. This happens when the model makes multiple consecutive tool calls.

For example, with the task prompt: "Please tell me a story about Little Red Riding Hood and draw it in five scenes"

Output of print(reflection_result):

    ------------------ Reflection1 Result ------------------
    finish_reason='function_calls'
    content=[FunctionCall(id='call_30151179', arguments='{"prompt":"some scenery description"}', name='txt2img_tool')]
    usage=RequestUsage(prompt_tokens=0, completion_tokens=0)
    cached=False
    logprobs=None
    thought='scenery 1 result image: img. Generating scenery 2...'
    ------------------ Reflection1 end ------------------

Because reflection_result.content is a FunctionCall list, the runtime error is raised.
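The failure mode can be reproduced in isolation. Below is a rough, simplified sketch of the reflection check; the FunctionCall dataclass and extract_reflection_text function here are illustrative stand-ins, not the library's actual code:

```python
from dataclasses import dataclass


@dataclass
class FunctionCall:
    """Simplified stand-in for autogen_core's FunctionCall, for illustration only."""
    id: str
    arguments: str
    name: str


def extract_reflection_text(content):
    """Sketch of the reflection handling: only a plain-string
    reflection result is accepted as the final text response."""
    if isinstance(content, str):
        return content
    # A list of FunctionCall objects falls through to the error path.
    raise RuntimeError("Reflect on tool use produced no valid text response.")


# A plain string passes:
print(extract_reflection_text("Here is your story."))

# A function-call list (what Gemini 2.5 Pro / Grok 4 can return
# during reflection) raises:
calls = [FunctionCall(id="call_30151179",
                      arguments='{"prompt":"some scenery description"}',
                      name="txt2img_tool")]
try:
    extract_reflection_text(calls)
except RuntimeError as e:
    print(e)
```

This shows why the error only appears with models that keep emitting tool calls during the reflection step: the reflection path expects a terminal string.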

Model client info

The error appears with Gemini-2.5-pro and Grok-4. These more capable models can return both a text string and function calls in the same reflection response, not just a plain string.

Demo code

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.conditions import HandoffTermination, MaxMessageTermination


def txt2img_tool(prompt: str) -> str:
    """
    A tool that returns a mock image URL.
    """
    print(f"Generating image with prompt: {prompt}")
    return "https://www.baidu.com/img/PCtm_d9c8750bed0b3c7d089fa7d55720d6cf.png"

async def main() -> None:
    END_NPRMAL_STR = "</ED>" #finished flag
    text_normal_termination = TextMentionTermination(END_NPRMAL_STR)
    max_msg_termination = MaxMessageTermination(max_messages=12)
    handoff_termination = HandoffTermination(target="user")
    termination = max_msg_termination | text_normal_termination | handoff_termination

    model_client = OpenAIChatCompletionClient(
        model="gemini-2.5-pro",
        base_url="xxx",
        api_key="xxx",
        model_info= {
            "vision": True,
            "function_calling": True,
            "json_output": True,
            "structured_output": True,
            "family": "gemini-2.5-pro"
        }
    )
    # Daily assistant agent
    daily_agent = AssistantAgent(
        name="daily_agent",
        description="A helpful assistant for daily conversations and general questions.",
        system_message="You are a helpful assistant for daily chat.",
        model_client=model_client,
    )

    # Drawing agent
    drawing_agent = AssistantAgent(
        name="drawing_agent",
        description="An agent that can draw images based on prompts.",
        system_message=f"""1. You are an AI drawing assistant. You must use the txt2img_tool to create images.
        2. When you are asked to draw, use the tool and output the resulting image as a markdown image.
        3. When you have completed the user's task, output this string at the end: {END_NPRMAL_STR}""",
        tools=[txt2img_tool],
        reflect_on_tool_use=True,
        model_client=model_client
    )

    group_chat = SelectorGroupChat(
        [daily_agent, drawing_agent],
        termination_condition=termination,
        model_client=model_client,
        allow_repeated_speaker=True
    )

    # Start the conversation
    # Task: "Tell The Peach Blossom Spring in 5 paragraphs, with one image per paragraph."
    await Console(group_chat.run_stream(task="将桃花源记分5个段落讲述,每个段落配图1个"))  # You can change the task here

    await model_client.close()

if __name__ == "__main__":
    asyncio.run(main())

Which packages was the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python 0.6.4

Other library version.

No response

Model used

No response

Model provider

None

Other model provider

No response

Python version

None

.NET version

None

Operating system

None

inspire-boy avatar Jul 19 '25 05:07 inspire-boy

Don't use reflect_on_tool_use in this case, and set max_tool_iterations to some high number like 10.

Looks like Gemini and Grok sometimes don't respect the tool_choice parameter, even though they supposedly implement the OpenAI-compatible endpoint.
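As a sketch, the suggested change applied to the demo's drawing_agent would look something like this (an untested fragment; the system message and tool are assumed to be the same as in the demo above):

```python
# Hypothetical rework of the demo's drawing_agent: let the agent loop on
# tool calls itself instead of reflecting on the tool result.
drawing_agent = AssistantAgent(
    name="drawing_agent",
    description="An agent that can draw images based on prompts.",
    system_message="...",          # same system message as in the demo
    tools=[txt2img_tool],
    model_client=model_client,
    # reflect_on_tool_use=True,    # removed: this triggers the reflection path
    max_tool_iterations=10,        # allow several consecutive tool calls
)
```

With max_tool_iterations set, the agent keeps executing tool calls up to the limit, so multi-scene tasks complete without entering the reflection step that raises the error.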

ekzhu avatar Jul 19 '25 07:07 ekzhu

> Don't use reflect_on_tool_use in this case, and set max_tool_iterations to some high number like 10.
>
> Looks like Gemini and Grok sometimes don't respect the tool_choice parameter, even though they supposedly implement the open ai compatible endpoint.

Thanks for the reply! Could this response shape also be supported at the code level? Older versions of these models, such as gemini-pro-05-06 and the now-deprecated Grok-3, worked very reliably with reflection. When a tool call fails, reflection helps by producing friendlier output. I suspect the newer models simply have more response possibilities.

inspire-boy avatar Jul 19 '25 11:07 inspire-boy