OpenAI error code 400 intermittently in group chats
What happened?
I have been experiencing intermittent errors in my group chat executions returning the following error:
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1180, in a_initiate_chat
await self.a_send(msg2send, recipient, silent=silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 800, in a_send
await recipient.a_receive(message, self, request_reply, silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 951, in a_receive
reply = await self.a_generate_reply(sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 2138, in a_generate_reply
final, reply = await reply_func(
^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 1214, in a_run_chat
speaker = await groupchat.a_select_speaker(speaker, self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 574, in a_select_speaker
return await self.a_auto_select_speaker(last_speaker, selector, messages, agents)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 784, in a_auto_select_speaker
result = await checking_agent.a_initiate_chat(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1173, in a_initiate_chat
await self.a_send(msg2send, recipient, request_reply=True, silent=silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 800, in a_send
await recipient.a_receive(message, self, request_reply, silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 951, in a_receive
reply = await self.a_generate_reply(sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 2138, in a_generate_reply
final, reply = await reply_func(
^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1496, in a_generate_oai_reply
return await asyncio.get_event_loop().run_in_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1494, in _generate_oai_reply
return self.generate_oai_reply(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1436, in generate_oai_reply
extracted_response = self._generate_oai_reply_from_client(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1455, in _generate_oai_reply_from_client
response = llm_client.create(
^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/oai/client.py", line 777, in create
response = client.create(params)
^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/oai/client.py", line 342, in create
response = completions.create(**params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/openai/_utils/_utils.py", line 275, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 581, in create
return self._post(
^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/openai/_base_client.py", line 1233, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/openai/_base_client.py", line 922, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/openai/_base_client.py", line 1013, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", 'type': 'invalid_request_error', 'param': 'messages.[2].role', 'code': None}}
This started happening after I upgraded from pyautogen==0.2.26 to autogen-agentchat==0.2.36.
Is there a solution to this? It is making my service very unstable.
What did you expect to happen?
Not to error out intermittently due to bad tool calls
How can we reproduce it (as minimally and precisely as possible)?
I am running a GroupChat instance with a single user proxy agent and a number of conversable agents using async tools, with the speaker selection method set to "auto".
It seems to happen more when running multiple threads executing the group chat at once, but that may just be coincidence.
I am using gpt-4o-mini.
AutoGen version
0.2.36
Which package was this bug in
AgentChat
Model used
gpt-4o-mini
Python version
No response
Operating system
No response
Any additional info you think would be helpful for fixing this bug
No response
Can you provide a script so we can reproduce this? Without knowing how you set up the group chat, it is very hard for us to understand the source of the error.
From your description and stack trace, it seems that the order of messages in the group chat manager might have been scrambled: a tool call should be followed by a tool call result message.
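For concreteness, the ordering the OpenAI Chat Completions API enforces looks like this (a minimal, framework-free sketch; the IDs and content are made up):

```python
# Sketch of the ordering the Chat Completions API requires (IDs made up):
# every assistant message carrying tool_calls must be answered by one
# role='tool' message per tool_call_id before the conversation moves on.
valid_history = [
    {"role": "user", "content": "What is the p99 anomaly probability?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_abc", "type": "function",
         "function": {"name": "create_query_tool", "arguments": "{}"}},
    ]},
    # The response to call_abc: without this message, the API returns 400.
    {"role": "tool", "tool_call_id": "call_abc", "content": "query created"},
]

def unanswered_tool_calls(messages):
    """Return the tool_call ids that never receive a role='tool' response."""
    answered = {m["tool_call_id"] for m in messages if m.get("role") == "tool"}
    issued = {c["id"] for m in messages for c in (m.get("tool_calls") or [])}
    return issued - answered
```

Running `unanswered_tool_calls` over a transcript that triggers the 400 should surface exactly the call IDs the error message lists.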
Hey @ekzhu, thanks for getting back to me. Of course: below is a stripped-down version of my implementation. Everything is defined the way I have it set up; I've just made it all generic. Let me know if you need anything else.
import asyncio
import random
import uuid
from typing import Annotated, List, Callable
from autogen import GroupChat, GroupChatManager, AssistantAgent, UserProxyAgent
from autogen.agentchat.contrib.capabilities import transform_messages, transforms
from fastapi import Depends
from starlette.background import BackgroundTasks
from src.config import Settings, get_settings
from src.context.context_manager import ContextManager
from src.group_chat.utils.agent_utils import get_llm_config
from src.group_chat.workers import models as worker_models
from src.service_a import ServiceA
from src.service_b import ServiceB
from src.service_c import ServiceC
from src.logging.service import LoggingService
from src.logging_config import get_logger
from src.models import LLMModel
logger = get_logger(__name__)
# Example tool
async def example_tool(
param1: Annotated[str, "Description of param1"],
param2: Annotated[int, "Description of param2"]
) -> str:
return f"Processed {param1} with value {param2}"
def initialize_example_tool(tenant_id: str, context: ContextManager):
example_tool.tenant_id = tenant_id
example_tool.context = context
example_tool.description = "This is an example tool that processes a string and an integer."
return example_tool
# Tool initialization
def initialize_all_tools(
tenant_id: str,
context,
service_a: ServiceA,
service_b: ServiceB,
service_c: ServiceC,
) -> List[Callable]:
return [
initialize_example_tool(tenant_id, context),
# Add other tool initializations here
]
# Tool categorization
def get_tools_for_worker_a(all_tools: List[Callable]) -> List[Callable]:
return [tool for tool in all_tools if tool.__name__ in ["tool_a", "tool_b"]]
def get_tools_for_worker_b(all_tools: List[Callable]) -> List[Callable]:
return [tool for tool in all_tools if tool.__name__ in ["tool_c", "tool_d", "tool_e"]]
def get_tools_for_worker_c(all_tools: List[Callable], context: str = None) -> List[Callable]:
tools = [tool for tool in all_tools if tool.__name__ == "tool_f"]
if context is not None:
context_specific_tool = next((tool for tool in all_tools if tool.__name__ == "context_specific_tool"), None)
if context_specific_tool:
tools.append(context_specific_tool)
return tools
# Tool registration
def register_tools_for_agent(agent: AssistantAgent, tools: List[Callable]) -> AssistantAgent:
for tool in tools:
agent.register_for_llm(name=tool.__name__, description=tool.description)(tool)
return agent
def register_tools_for_user_proxy(user_proxy: UserProxyAgent, tools: List[Callable]) -> UserProxyAgent:
for tool in tools:
user_proxy.register_for_execution(name=tool.__name__)(tool)
return user_proxy
# Worker definitions
def get_worker_definitions(all_tools: List[Callable], context: str = None) -> dict[str, AssistantAgent | UserProxyAgent]:
llm_config = get_llm_config()
workers = {
worker_models.WorkerRole.worker_a.value: AssistantAgent(
name=worker_models.WorkerRole.worker_a.value,
system_message="You are Worker A. Your role is to perform tasks related to A.",
llm_config=llm_config,
),
worker_models.WorkerRole.worker_b.value: AssistantAgent(
name=worker_models.WorkerRole.worker_b.value,
system_message="You are Worker B. Your role is to perform tasks related to B.",
llm_config=llm_config,
),
# Add other worker definitions here
}
# Register tools for each worker
register_tools_for_agent(workers[worker_models.WorkerRole.worker_a.value], get_tools_for_worker_a(all_tools))
register_tools_for_agent(workers[worker_models.WorkerRole.worker_b.value], get_tools_for_worker_b(all_tools))
# Register tools for other workers
return workers
# Create overseer worker
def create_overseer_worker(all_tools: List[Callable]):
    settings = get_settings()  # 'settings' was not previously defined in this scope
    llm_config = {
"config_list": [
{"model": LLMModel.GPT4_MINI.value, "api_key": settings.openai_api_key},
{"model": LLMModel.GPT4.value, "api_key": settings.openai_api_key},
],
"timeout": 120,
"seed": random.randint(1, 1000),
"temperature": 0.1,
"max_tokens": 10000,
}
user_proxy = UserProxyAgent(
name=worker_models.WorkerRole.overseer.value,
system_message="You are the overseer. Guide and manage the team efficiently.",
code_execution_config=False,
human_input_mode="NEVER",
llm_config=llm_config,
)
register_tools_for_user_proxy(user_proxy, all_tools)
return user_proxy
# Worker suite creation
def create_worker_suite(
tenant_id: str,
context,
service_a: ServiceA,
service_b: ServiceB,
service_c: ServiceC,
specific_context: str = None,
):
all_tools = initialize_all_tools(tenant_id, context, service_a, service_b, service_c)
workers = get_worker_definitions(all_tools, specific_context)
workers[worker_models.WorkerRole.overseer.value] = create_overseer_worker(all_tools)
context_handlers = setup_context_handlers()
for name, worker in workers.items():
handler = context_handlers["medium"]
if name == worker_models.WorkerRole.organiser.value:
handler = context_handlers["large"]
handler.add_to_agent(worker)
return workers
# Context handlers setup
def setup_context_handlers():
return {
"medium": transform_messages.TransformMessages(
transforms=[
transforms.MessageTokenLimiter(
max_tokens=16000,
max_tokens_per_message=6000,
model=LLMModel.GPT4_MINI.value,
),
]
),
"large": transform_messages.TransformMessages(
transforms=[
transforms.MessageTokenLimiter(
max_tokens=16000,
max_tokens_per_message=6000,
model=LLMModel.GPT4_MINI.value,
),
]
),
}
# Group chat initialization
def initialise_group_chat_manager(context, workers, messages=None, max_round=400, introductions=False) -> GroupChatManager:
    group_chat = GroupChat(
        agents=list(workers.values()),
        messages=messages if messages is not None else [],  # avoid a shared mutable default
max_round=max_round,
speaker_selection_method="auto",
send_introductions=introductions,
speaker_transitions_type="allowed",
select_speaker_transform_messages=setup_context_handlers()["medium"],
)
context.set_group_chat(group_chat)
manager = GroupChatManager(
groupchat=group_chat,
llm_config=get_llm_config(),
is_termination_msg=lambda x: "GROUPCHAT_TERMINATE" in x.get("content", ""),
)
return manager
# Main GroupChatService
class GroupChatService:
def __init__(
self,
settings: Annotated[Settings, Depends(get_settings)],
tenant_id: str,
service_a: Annotated[ServiceA, Depends(ServiceA)],
service_b: Annotated[ServiceB, Depends(ServiceB)],
service_c: Annotated[ServiceC, Depends(ServiceC)],
):
self.settings = settings
self.tenant_id = tenant_id
self.service_a = service_a
self.service_b = service_b
self.service_c = service_c
async def process_task(self, context: str, input_data: List[str], background_tasks: BackgroundTasks):
context_manager = ContextManager()
logging_service = LoggingService(self.settings)
log_file_path = f"logs/{uuid.uuid4()}.db"
session_id = str(uuid.uuid4())
context_manager.set_session_id(session_id)
logger.info(f"Initialising session ID: {session_id}")
logger.info(f"Processing input data: {input_data}")
context_manager.set_context(context)
context_manager.set_input_data(input_data)
workers = create_worker_suite(
self.tenant_id,
context_manager,
self.service_a,
self.service_b,
self.service_c,
specific_context=context,
)
overseer = workers[worker_models.WorkerRole.overseer.value]
        manager = initialise_group_chat_manager(  # module-level function, not a method
context=context_manager,
workers=workers,
introductions=True,
)
# Example of starting the chat
initial_message = f"Let's start processing the task for {context} with input data: {', '.join(input_data)}"
await overseer.a_initiate_chat(manager, message=initial_message)
# Process the chat results and return
# This is where you'd implement the logic to process the chat results
results = "Processed results would go here"
return results
# Example usage
async def main():
settings = get_settings()
service_a = ServiceA()
service_b = ServiceB()
service_c = ServiceC()
group_chat_service = GroupChatService(
settings,
"example_tenant",
service_a,
service_b,
service_c
)
background_tasks = BackgroundTasks()
context = "example_context"
input_data = ["data1", "data2"]
results = await group_chat_service.process_task(context, input_data, background_tasks)
print(f"Task results: {results}")
if __name__ == "__main__":
    asyncio.run(main())
Can you put a breakpoint on:
https://github.com/microsoft/autogen/blob/11ef58b98e1bcb6567a8f8b87e70540123782c5e/autogen/agentchat/groupchat.py#L1255
To see:
- the current message history in the group chat (groupchat.messages)
- the selected speaker (it should be the user proxy if the latest message is a tool call)
Here is the printout of groupchat.messages at the point of failure: 400-groupchat-error-autogen.json
I can also confirm that the user proxy was selected as the speaker before it errored out.
Here is the error so you can see the call IDs it refers to:
opspilot.src.group_chat.service: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_l83lHmo8VPd67TaA2a0Ixcls, call_KVx5hNEWNfzaiaxUTyDMx6Gj, call_2x7gDGWU2jWNR3WIZmu7XnsL, call_wfhMdNQDYF6cIlAVnxS39MNV, call_VwzpfXa47RIzEWdJNHALZjTO, call_owezg1Oi8LRmiv8ajMslKf41, call_a6kxV4wVo11FKB0iJdWwEKvp, call_Nqw4aKWK3B0thWlDyLTvYx7x, call_tMrlPmtxX7mwWXQ89eVNYtAm, call_okSj3b9AscvAtHtPAhuZqLuU, call_MtVAIDxAKLMJPrs4hm8PPsAu, call_eSV32SFtUuwy7cTyZ4IFxvdJ, call_tW7RBhYQmVLGPfPMXAmXcETz, call_IMUyDSz7w0eaeneo8xTAa36H, call_Iyyre8yBmLi8Wv1D7wn47l3i, call_AH41oIZF9yNsSs7VpIFJPS2M", 'type': 'invalid_request_error', 'param': 'messages.[26].role', 'code': None}}
Any update on this? It's going to start becoming a blocker for me soon.
@Nathan-Intergral I took a look at your log output -- it is a massive file.
Just a quick check on the call ID call_2x7gDGWU2jWNR3WIZmu7XnsL in your log output: it exists in both the call and the result.
Tool call (third-to-last message):
{'id': 'call_2x7gDGWU2jWNR3WIZmu7XnsL',
 'function': {'arguments': '{"query": "histogram_quantile(0.99, sum(rate(ml_anomaly_probability[5m])) by (service))", "is_overview": false}',
              'name': 'create_query_tool'},
 'type': 'function'}
Tool call result (last message):
{'tool_call_id': 'call_2x7gDGWU2jWNR3WIZmu7XnsL', 'role': 'tool', 'content': "Unable to create query histogram_quantile(0.99, sum(rate(ml_anomaly_probability[5m])) by (service)): The query does not filter by the service 'opspilot-agent' and is not marked as an overview query. If you intended this to be an overview query, please mark it as such and try again. Otherwise, please adjust the query to filter specifically for the 'opspilot-agent' service and resubmit.\n\nThe query has not been created, please attempt to create the query once you have made adjustments instead of attempting to update it."}
So the third-to-last message should be followed by the last message (or its content), but for some reason the second-to-last message is also a tool call message.
Can you debug through this code block and see exactly what caused two consecutive tool call messages? The correct behavior is that a tool call message is followed by a tool call result message with matching call IDs:
https://github.com/microsoft/autogen/blob/11ef58b98e1bcb6567a8f8b87e70540123782c5e/autogen/agentchat/groupchat.py#L493-L507
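As a debugging aid, something along these lines can scan a transcript for the two patterns that trigger the 400 (a plain-Python sketch; note that AutoGen's groupchat.messages also carries a nested tool_responses field on result messages, so you may need to adapt the field access):

```python
def find_ordering_errors(messages):
    """Scan an OpenAI-style transcript for the two 400-triggering patterns:
    a role='tool' message with no matching pending tool call, and an
    assistant tool_calls batch that is not fully answered before the next
    non-tool message arrives."""
    errors = []
    pending = set()  # tool_call ids still awaiting a role='tool' response
    for i, msg in enumerate(messages):
        if msg.get("role") == "tool":
            if msg.get("tool_call_id") not in pending:
                errors.append((i, "tool response without a matching tool call"))
            pending.discard(msg.get("tool_call_id"))
        else:
            if pending:
                errors.append((i, f"unanswered tool_calls: {sorted(pending)}"))
            pending = {c["id"] for c in (msg.get("tool_calls") or [])}
    if pending:
        errors.append((len(messages), f"unanswered tool_calls: {sorted(pending)}"))
    return errors
```

Calling this on groupchat.messages right before the failing client.create would pinpoint the index where the history first becomes invalid.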
Hey, sorry for the delay. Here is the debugging as requested.
I added prints to the code like this:
if (
self.func_call_filter
and self.messages
and ("function_call" in self.messages[-1] or "tool_calls" in self.messages[-1])
):
print(f"\n[DEBUG] Function call detected in message {len(self.messages)}")
# Debug tool call IDs and matching
if "tool_calls" in self.messages[-1]:
current_tool_ids = [tool["id"] for tool in self.messages[-1]["tool_calls"] if tool["type"] == "function"]
print(f"[DEBUG] Current tool call IDs: {current_tool_ids}")
# Check previous message for tool_responses
if len(self.messages) > 1 and "tool_responses" in self.messages[-2]:
prev_response_ids = [resp.get("id") for resp in self.messages[-2]["tool_responses"]]
print(f"[DEBUG] Previous response IDs: {prev_response_ids}")
print(f"[DEBUG] ID match check: {set(current_tool_ids) & set(prev_response_ids)}")
funcs = []
if "function_call" in self.messages[-1]:
funcs += [self.messages[-1]["function_call"]["name"]]
if "tool_calls" in self.messages[-1]:
funcs += [
tool["function"]["name"] for tool in self.messages[-1]["tool_calls"] if tool["type"] == "function"
]
print(f"[DEBUG] Functions to execute: {funcs}")
agents = [agent for agent in self.agents if agent.can_execute_function(funcs)]
print(f"[DEBUG] Agents that can execute the function: {[agent.name for agent in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Single agent found: {agents[0].name}")
return agents[0], agents, None
elif not agents:
print("[DEBUG] No agents found with direct function match, checking for any agents with function_map")
agents = [agent for agent in self.agents if agent.function_map]
print(f"[DEBUG] Agents with function_map: {[agent.name for agent in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Single agent with function_map found: {agents[0].name}")
return agents[0], agents, None
elif not agents:
raise ValueError(
f"No agent can execute the function {', '.join(funcs)}. "
"Please check the function_map of the agents."
and it returned this when it errored:
[DEBUG] Function call detected in message 56
[DEBUG] Current tool call IDs: ['call_Hvxm8YKxM8yqaVM7PJOEtzyx', 'call_0zXyJfG2K2js2cpoKs44qapH', 'call_OoE3bCX2Rl6Z4fyf8rqud8FP', 'call_Ur9XOWLRXdINhiNHusuQXsD0', 'call_ArrGcyxV6edV3bX5txtjx2AV', 'call_6isBdgpXFt6Usj1JkldFCwyK', 'call_egU1Ra46pa4y6nvm6L0jLcuN', 'call_51S4XUOa4K1z1llZnhJ6ooAa', 'call_ZvOKsAyW4SOBoGw1SC8XYEjR', 'call_xhq3aO0QQRxAgUWCDtwtjZ58', 'call_KOQkvMDUwG0Rh4ZSSJtnzSGA', 'call_zcZ3h3LXNAni5KKS5GgVesZf', 'call_36UPY6VmcqmoX78csMQafU5C', 'call_56dfccR1VWBXUDDbpoUVhHQ3', 'call_IQHMZ0JLxSoJvZJIfXM7c2Wy', 'call_fwTfvRCddV62KAv7PP9NIOBQ', 'call_Apybff5jwzjyHNqpvLGZQrUx', 'call_AmPQgX73uRS4GUn735QwqchJ']
[DEBUG] Previous response IDs: [None, None, None]
[DEBUG] ID match check: set()
[DEBUG] Functions to execute: ['create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool']
[DEBUG] Agents that can execute the function: ['overseer']
[DEBUG] Single agent found: overseer
So it looks like new tool calls are being processed before previous ones complete somehow?
I also then added these print statements to see the unresolved tool calls:
if (
self.func_call_filter
and self.messages
and ("function_call" in self.messages[-1] or "tool_calls" in self.messages[-1])
):
print(f"\n[DEBUG] Function call detected in message {len(self.messages)}")
# Check for unresolved tool calls in previous message
if len(self.messages) > 1 and "tool_calls" in self.messages[-2]:
prev_tool_ids = set(tool["id"] for tool in self.messages[-2]["tool_calls"] if tool["type"] == "function")
prev_responses = set()
if "tool_responses" in self.messages[-1]:
prev_responses = set(resp.get("id") for resp in self.messages[-1]["tool_responses"])
unresolved_calls = prev_tool_ids - prev_responses
if unresolved_calls:
print(f"[DEBUG] Found unresolved tool calls: {unresolved_calls}")
raise ValueError(
f"Previous tool calls are still unresolved. Must handle responses for: {unresolved_calls}"
)
# Process current tool calls
if "tool_calls" in self.messages[-1]:
current_tool_ids = [tool["id"] for tool in self.messages[-1]["tool_calls"] if tool["type"] == "function"]
print(f"[DEBUG] Current tool call IDs: {current_tool_ids}")
funcs = []
if "function_call" in self.messages[-1]:
funcs += [self.messages[-1]["function_call"]["name"]]
if "tool_calls" in self.messages[-1]:
funcs += [
tool["function"]["name"] for tool in self.messages[-1]["tool_calls"] if tool["type"] == "function"
]
print(f"[DEBUG] Functions to execute: {funcs}")
agents = [agent for agent in self.agents if agent.can_execute_function(funcs)]
print(f"[DEBUG] Agents that can execute the function: {[agent.name for agent in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Single agent found: {agents[0].name}")
return agents[0], agents, None
elif not agents:
print("[DEBUG] No agents found with direct function match, checking for any agents with function_map")
agents = [agent for agent in self.agents if agent.function_map]
print(f"[DEBUG] Agents with function_map: {[agent.name for agent in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Single agent with function_map found: {agents[0].name}")
return agents[0], agents, None
elif not agents:
raise ValueError(
f"No agent can execute the function {', '.join(funcs)}. "
"Please check the function_map of the agents."
and got:
[DEBUG] Function call detected in message 3
[DEBUG] Found unresolved tool calls: {'call_dLpVzExklG5s6PjdTsALEnoC', 'call_92m58pewL50eQvQJ6RwxCIvp', 'call_pd9TjwyGVT0PoAGgyd89l24h', 'call_xhhJki2GjJJUmqMC5b6RQG9z', 'call_RCNHnnALNIFSmwmdFap9sr1b', 'call_IpDyJ4AhpvI8IznhE8JDIj1I', 'call_I0EN0wI46Wc9CNlrQjvnlvoj', 'call_cvpuq534xXebqkLz3Szturnv'}
ERROR: opspilot.src.group_chat.service: Previous tool calls are still unresolved. Must handle responses for: {'call_dLpVzExklG5s6PjdTsALEnoC', 'call_92m58pewL50eQvQJ6RwxCIvp', 'call_pd9TjwyGVT0PoAGgyd89l24h', 'call_xhhJki2GjJJUmqMC5b6RQG9z', 'call_RCNHnnALNIFSmwmdFap9sr1b', 'call_IpDyJ4AhpvI8IznhE8JDIj1I', 'call_I0EN0wI46Wc9CNlrQjvnlvoj', 'call_cvpuq534xXebqkLz3Szturnv'}
Traceback (most recent call last):
File "/home/nathan/Documents/repo/opspilot-query-auto-agent/src/group_chat/service.py", line 120, in process_metrics_callback
await overseer.a_initiate_chat(manager, message=plan)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1180, in a_initiate_chat
await self.a_send(msg2send, recipient, silent=silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 800, in a_send
await recipient.a_receive(message, self, request_reply, silent)
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 951, in a_receive
reply = await self.a_generate_reply(sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 2138, in a_generate_reply
final, reply = await reply_func(
^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 1243, in a_run_chat
speaker = await groupchat.a_select_speaker(speaker, self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 592, in a_select_speaker
selected_agent, agents, messages = self._prepare_and_select_agents(last_speaker)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/opspilot-query-auto-agent/lib/python3.11/site-packages/autogen/agentchat/groupchat.py", line 493, in _prepare_and_select_agents
raise ValueError(
ValueError: Previous tool calls are still unresolved. Must handle responses for: {'call_dLpVzExklG5s6PjdTsALEnoC', 'call_92m58pewL50eQvQJ6RwxCIvp', 'call_pd9TjwyGVT0PoAGgyd89l24h', 'call_xhhJki2GjJJUmqMC5b6RQG9z', 'call_RCNHnnALNIFSmwmdFap9sr1b', 'call_IpDyJ4AhpvI8IznhE8JDIj1I', 'call_I0EN0wI46Wc9CNlrQjvnlvoj', 'call_cvpuq534xXebqkLz3Szturnv'}
Let me know if you need any more. Thanks for your help.
I got it now. In your code:
def create_overseer_worker(all_tools: List[Callable]):
llm_config = {
"config_list": [
{"model": LLMModel.GPT4_MINI.value, "api_key": settings.openai_api_key},
{"model": LLMModel.GPT4.value, "api_key": settings.openai_api_key},
],
"timeout": 120,
"seed": random.randint(1, 1000),
"temperature": 0.1,
"max_tokens": 10000,
}
user_proxy = UserProxyAgent(
name=worker_models.WorkerRole.overseer.value,
system_message="You are the overseer. Guide and manage the team efficiently.",
code_execution_config=False,
human_input_mode="NEVER",
llm_config=llm_config,
)
register_tools_for_user_proxy(user_proxy, all_tools)
return user_proxy
You added the llm_config to the user proxy, which makes it responsible for both generating LLM replies and executing functions. When the group chat selects the user proxy to execute a function, it actually triggers an LLM response instead, so you see another LLM-generated tool call. This is when the 400 response happens.
The fix here is to separate the user proxy into two agents: one for LLM responses, one for tool execution.
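The failure mode can be sketched without AutoGen at all (everything below is a hypothetical simplification, not AutoGen's actual code): an agent that has both an llm_config and registered tools takes the LLM reply path when selected, emitting a fresh batch of tool calls instead of answering the pending ones, whereas an executor with llm_config=False can only answer them:

```python
# Hypothetical simplification of the failure mode (not AutoGen's real code).

def combined_proxy_reply(pending_calls):
    # LLM reply path fires: a brand-new tool_calls batch, so the pending
    # calls from the previous message are never answered -> API 400.
    return {"role": "assistant", "tool_calls": [{"id": "call_new"}]}

def executor_only_reply(pending_calls):
    # llm_config=False leaves only the execution path: one role='tool'
    # response per pending tool_call_id, which is what the API requires.
    return [{"role": "tool", "tool_call_id": c["id"], "content": "ok"}
            for c in pending_calls]

history = [{"role": "assistant", "tool_calls": [{"id": "call_old"}]}]
broken = history + [combined_proxy_reply(history[-1]["tool_calls"])]
fixed = history + executor_only_reply(history[-1]["tool_calls"])
```

In the broken transcript the last two messages are both tool_calls batches (the pattern seen in the debug output above); in the fixed one, each pending call gets its matching tool response.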
@ekzhu ah, nice work! I'll implement the change ASAP and let you know the results.
@ekzhu Okay, I've given that a go and I seem to be getting the same error. I want to check with you that I've done it correctly:
def create_overseer_worker():
llm_config = {
"config_list": [
{"model": LLMModel.GPT4_MINI.value, "api_key": settings.openai_api_key},
{"model": LLMModel.GPT4.value, "api_key": settings.openai_api_key},
],
"timeout": 120,
"seed": random.randint(1, 1000),
"temperature": 0.1,
"max_tokens": 10000,
}
user_proxy = UserProxyAgent(
name=worker_models.WorkerRole.overseer.value,
system_message=(
""
),
code_execution_config=False,
human_input_mode="NEVER",
llm_config=llm_config,
)
return user_proxy
def create_tool_executor(all_tools: List[Callable]):
user_proxy = UserProxyAgent(
name=worker_models.WorkerRole.tool_executor.value,
code_execution_config=False,
human_input_mode="NEVER",
llm_config=False,
)
register_tools_for_user_proxy(user_proxy, all_tools)
return user_proxy
Then I still initiate the chat with the overseer:
overseer = workers[worker_models.WorkerRole.overseer.value]
await overseer.a_initiate_chat(manager, message=request)
Can you find out which agent generated the tool call response message that did not address all the tool calls?
Is the order of messages still:
- tool calls
- tool call response
?
If you can come up with a minimal, executable code snippet that reproduces the error, that would be great.
Okay, I added these debug logs:
if (
self.func_call_filter
and self.messages
and ("function_call" in self.messages[-1] or "tool_calls" in self.messages[-1])
):
funcs = []
last_msg = self.messages[-1]
prev_msg = self.messages[-2] if len(self.messages) > 1 else None
print(f"\n[DEBUG] Message #{len(self.messages)}")
print(f"[DEBUG] Last message from: {last_msg.get('name', 'unknown')}")
print(f"[DEBUG] Message content: {last_msg.get('content', '')}")
# Track tool calls and their IDs
if "tool_calls" in last_msg:
current_tool_calls = [
tool for tool in last_msg["tool_calls"] if tool["type"] == "function"
]
current_ids = [tool["id"] for tool in current_tool_calls]
print(f"[DEBUG] Current tool call IDs: {current_ids}")
# Check previous message for tool responses
if prev_msg:
print(f"[DEBUG] Previous message from: {prev_msg.get('name', 'unknown')}")
print(f"[DEBUG] Previous message role: {prev_msg.get('role', 'unknown')}")
if prev_msg.get("role") == "tool" and "tool_call_id" in prev_msg:
print(f"[DEBUG] Previous message tool_call_id: {prev_msg['tool_call_id']}")
elif prev_msg.get("role") == "assistant" and "tool_calls" in prev_msg:
print("[DEBUG] WARNING: Previous message also had tool calls!")
prev_ids = [tool["id"] for tool in prev_msg["tool_calls"] if tool["type"] == "function"]
print(f"[DEBUG] Previous tool call IDs: {prev_ids}")
# Add function names to funcs list
tool_calls = [
tool["function"]["name"] for tool in current_tool_calls
]
funcs += tool_calls
print(f"[DEBUG] Found tool_calls: {tool_calls}")
if "function_call" in last_msg:
funcs += [last_msg["function_call"]["name"]]
print(f"[DEBUG] Found function_call: {last_msg['function_call']['name']}")
# find agents with the right function_map
agents = [agent for agent in self.agents if agent.can_execute_function(funcs)]
print(f"[DEBUG] Agents that can execute functions {funcs}: {[a.name for a in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Selected single capable agent: {agents[0].name}")
return agents[0], agents, None
elif not agents:
agents = [agent for agent in self.agents if agent.function_map]
print(f"[DEBUG] No direct function matches. Agents with any function_map: {[a.name for a in agents]}")
if len(agents) == 1:
print(f"[DEBUG] Selected only agent with function_map: {agents[0].name}")
return agents[0], agents, None
elif not agents:
raise ValueError(
f"No agent can execute the function {', '.join(funcs)}. "
"Please check the function_map of the agents."
)
and got this:
[DEBUG] Message #10
[DEBUG] Last message from: query_generator
[DEBUG] Message content:
[DEBUG] Current tool call IDs: ['call_zni63Dv6p6bxFn96IE2NbGTv', 'call_4IUdHcPD1Ek5ly2SbJqCDnyg', 'call_1Q71inspDmDBbStpBefBZP9W', 'call_Ofiaid5Xii31vx5KpoXVDKZj', 'call_APyoGUWiBW2GSk8eYJMYHqIs', 'call_64OA8fmxqf1L6Ys3MyQJzbPv', 'call_Q7bDePFM16dt2DanV1Z0GXws', 'call_6UKba8n1UQiTyKUPF8RxCVz1', 'call_7zMqOgcJlW5FTyCyUMFlmh57', 'call_vu6gyDlYAse0nyxaypaBTCbr', 'call_y77Q1xi7Dba5Z0T1KYlDSqfI', 'call_gxQtnLtHlltrxfXpvYZdx3Kt', 'call_4jKutjt4Hmy1CtHb7KCs7FHa', 'call_nmYpo05amtnTYLFiXU1olTI4']
[DEBUG] Previous message from: query_generator
[DEBUG] Previous message role: assistant
[DEBUG] WARNING: Previous message also had tool calls!
[DEBUG] Previous tool call IDs: ['call_hDvVz5HQzTNoWpqgzrQIuT6x', 'call_d5LIvWkFcCzyYQ3vkoH7GXEY', 'call_0NeAgcw445gmu0XBNf7yGfqb', 'call_EcX4Tj7FK7CakhW03KLHXBkv', 'call_eiOo8VwGEyTKMIqAZrbuLRoJ', 'call_9pWv47YxpOY1il1kd7E2GDJI', 'call_rs5m7Tw7HcXehWi6MBBi45Wj', 'call_JV8po8cEQybTgVukyjkqkwlu', 'call_0BPTpH9OkyLfhuB6L0OPpTg6', 'call_25iqGMwAHjkkIc3HCLXLEHOQ', 'call_4vPOKp6fy9NFHAyCHBT34Dpl']
[DEBUG] Found tool_calls: ['create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool']
[DEBUG] Agents that can execute functions ['create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool', 'create_query_tool']: ['tool_executor']
[DEBUG] Selected single capable agent: tool_executor
ERROR: opspilot.src.group_chat.service: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_gmJpFJwcIaTWoa0W9dS6NVWV, call_hKfMNTAZgRWl28FLWWaGbhHG, call_ssePEHSvO65y6WFcUcwT1dNP, call_LUnVyP1MjOLpYMwcvmMqCEbs, call_PKzR2TFXztZfdzZi3HrRD7tC, call_MdszLxb09l2Nc3RjwqMBOzpX, call_80p6H4KV0DkZufm4U6p4Q0iX, call_oFETIHb10i4IF7Fr0pCSHrKw, call_EYL9cFiZT6YRVdJQ55IYNq4g, call_UTJqHOwFwbLkXTFjaD8SRslN", 'type': 'invalid_request_error', 'param': 'messages.[25].role', 'code': None}}
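For context on what that 400 means: the Chat Completions API requires every assistant message carrying `tool_calls` to be followed by one `tool` message per `tool_call_id` before the next assistant turn. A small helper (my own sketch, not AutoGen code, with a hypothetical history) can scan the conversation and list the ids that were never answered at all:

```python
def find_orphaned_tool_calls(messages):
    """Return tool_call_ids from assistant messages that never get a matching
    {"role": "tool", "tool_call_id": ...} response anywhere later in the history.
    (The API is stricter still -- responses must come before the next assistant
    message -- but never-answered ids are exactly what the 400 above lists.)"""
    pending = set()
    for msg in messages:
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            pending.update(tc["id"] for tc in msg["tool_calls"])
        elif msg.get("role") == "tool":
            pending.discard(msg.get("tool_call_id"))
    return sorted(pending)

# Hypothetical history mirroring the failure: two tool-call batches in a row,
# only the second one answered, so the first batch's id stays orphaned.
history = [
    {"role": "assistant", "tool_calls": [
        {"id": "call_a", "type": "function",
         "function": {"name": "create_query_tool", "arguments": "{}"}}]},
    {"role": "assistant", "tool_calls": [
        {"id": "call_b", "type": "function",
         "function": {"name": "create_query_tool", "arguments": "{}"}}]},
    {"role": "tool", "tool_call_id": "call_b", "content": "ok"},
]
print(find_orphaned_tool_calls(history))  # -> ['call_a']
```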
I'll work on a minimal version that replicates the issue.
[DEBUG] Last message from: query_generator
[DEBUG] Message content:
[DEBUG] Current tool call IDs: ['call_zni63Dv6p6bxFn96IE2NbGTv', 'call_4IUdHcPD1Ek5ly2SbJqCDnyg', 'call_1Q71inspDmDBbStpBefBZP9W', 'call_Ofiaid5Xii31vx5KpoXVDKZj', 'call_APyoGUWiBW2GSk8eYJMYHqIs', 'call_64OA8fmxqf1L6Ys3MyQJzbPv', 'call_Q7bDePFM16dt2DanV1Z0GXws', 'call_6UKba8n1UQiTyKUPF8RxCVz1', 'call_7zMqOgcJlW5FTyCyUMFlmh57', 'call_vu6gyDlYAse0nyxaypaBTCbr', 'call_y77Q1xi7Dba5Z0T1KYlDSqfI', 'call_gxQtnLtHlltrxfXpvYZdx3Kt', 'call_4jKutjt4Hmy1CtHb7KCs7FHa', 'call_nmYpo05amtnTYLFiXU1olTI4']
[DEBUG] Previous message from: query_generator
[DEBUG] Previous message role: assistant
[DEBUG] WARNING: Previous message also had tool calls!
This is surprising: the same agent is selected twice in a row to generate tool calls. Why wasn't the tool executor selected after the first batch?
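In case it helps others debug, the pattern above (back-to-back assistant messages that each carry tool calls, with no tool responses in between) can be detected mechanically. A rough sketch of mine, not AutoGen code:

```python
def consecutive_tool_call_runs(messages):
    """Find runs of two or more back-to-back assistant messages that each
    carry tool_calls, with no intervening tool responses -- the pattern
    in the log above. Returns a list of index lists into `messages`."""
    runs, current = [], []
    for i, msg in enumerate(messages):
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            current.append(i)
        else:
            if len(current) > 1:
                runs.append(current)
            current = []
    if len(current) > 1:  # run extends to the end of the history
        runs.append(current)
    return runs
```

Running this over the chat history right before each API call would flag the bad state as soon as it appears, instead of waiting for the 400.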
Okay, I have some further notes on this. I've now seen that when the error occurred, the label specialist sent three messages in a row making tool calls before any of them were executed by the tool executor.
[Message #2]
Role: assistant
Name: label_specialist
Tool Calls:
- ID: call_vS9ueFjBKq9NzG9Q6A4nzPEr
Function: label_tool
Arguments: {"metric": "opspilot_gpt_tokens_total"}
- ID: call_JnWW4VJDw5VVJBhyxHiFS3OP
Function: label_tool
Arguments: {"metric": "system_disk_operations_total"}
- ID: call_nhpsMthsLL5kdpJpTvVJ0Xgd
Function: label_tool
Arguments: {"metric": "opspilot_opspilot_tokens_total"}
Content:
--------------------------------------------------
[Message #3]
Role: assistant
Name: label_specialist
Tool Calls:
- ID: call_uoQj8REfFYW9IJmNRKrNlUAJ
Function: label_tool
Arguments: {"metric": "ALERTS_FOR_STATE"}
- ID: call_37CS9yPJjLtLLMcusj9JnJiN
Function: label_tool
Arguments: {"metric": "ALERTS"}
- ID: call_6bUmvf2ICD0rcOJbhPnf3Xbj
Function: label_tool
Arguments: {"metric": "opspilot_requests_total"}
- ID: call_kEsJyUoRvvQwuL2nO3aLhKkB
Function: label_tool
Arguments: {"metric": "target_info"}
Content:
--------------------------------------------------
[Message #4]
Role: assistant
Name: label_specialist
Tool Calls:
- ID: call_vR3nhllXIOz5ZId15MFeD687
Function: label_tool
Arguments: {"metric": "traces_spanmetrics_latency_sum"}
- ID: call_W8txprbLeHbl91RqwOaHKIqv
Function: label_tool
Arguments: {"metric": "traces_spanmetrics_latency_bucket"}
- ID: call_k94ffgrykbHeSs2UUpjZuaPb
Function: label_tool
Arguments: {"metric": "traces_spanmetrics_size_total"}
- ID: call_KBv406gsKdtHOjTVQM357YXl
Function: label_tool
Arguments: {"metric": "traces_spanmetrics_latency_count"}
- ID: call_7IK2zLJTKjOT620P9oVnEDgb
Function: label_tool
Arguments: {"metric": "traces_spanmetrics_calls_total"}
Could this be due to the tools not being registered with the tool agent? The select-speaker function in the group chat should select the agent that has the relevant tools.
@ekzhu just checked, and no, the tool agent has the tools registered correctly.
@Nathan-Intergral did you solve this? Some folks on the Discord server faced the same situation, and it might be an OpenAI issue. This has been shared on the AutoGen Discord channel: https://community.openai.com/t/chatgpt-occasionally-reuses-tool-ids-in-the-same-session/577207
Can you test your setup, temporarily using another model like Claude Sonnet?
Btw, I'm having the exact same issue. It's completely non-deterministic, the tools are registered correctly, and I'm using distinct, unique calling and executor agents. At this point I'm seriously wondering if it's an OpenAI issue.
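Until the root cause is found, one stopgap I've been considering (a defensive sketch of my own, not an AutoGen API) is to patch the history just before it goes to the API, inserting a stub tool response after any unanswered tool call so the request at least passes validation:

```python
def patch_orphaned_tool_calls(messages):
    """Insert a stub "tool" response directly after each assistant tool call
    that was never answered anywhere in the history, so the request passes
    the API's validation. This masks the scheduling bug rather than fixing
    it -- use only to keep long runs alive while debugging."""
    answered = {m.get("tool_call_id") for m in messages if m.get("role") == "tool"}
    patched = []
    for msg in messages:
        patched.append(msg)
        if msg.get("role") == "assistant":
            for tc in msg.get("tool_calls") or []:
                if tc["id"] not in answered:
                    patched.append({"role": "tool", "tool_call_id": tc["id"],
                                    "content": "Tool call was not executed."})
    return patched
```

Obviously the model then sees a fake "not executed" result, which may degrade the conversation, so I'd treat this as a diagnostic crutch rather than a fix.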