.Net Bug: `InvokeStreamingAsync` duplicates assistant response when tool calls are used

Open tomdai opened this issue 4 months ago • 0 comments

Describe the bug

When an agent response streamed via `Agent.InvokeStreamingAsync` includes function tool calls, the assistant entry that Semantic Kernel adds to chat history concatenates the pre-tool-call text with the post-tool-call reply. The persisted chat message matches neither of the two responses emitted by the OpenAI API.

To Reproduce

Steps to reproduce the behavior:

1. Configure an `OpenAIAssistantAgent` with a function tool (e.g., `get_profiles`) and call `InvokeStreamingAsync` against OpenAI (tested with gpt-5).
2. Ask a question that causes the assistant to invoke the tool (for example: “what profiles do we have?”).
3. Let the tool function return successfully and allow the streaming response to finish.
4. Examine the assistant message that Semantic Kernel stores in chat history; it now contains the initial “I’m going to pull…” text merged with the final answer instead of keeping them separate.

Expected behavior

Each assistant message recorded in chat history should mirror the model responses. The pre-tool-call streaming chunk should appear once, and the final answer should stand on its own without the earlier text prefixed.

Platform

Language: C#
Source: NuGet package Microsoft.SemanticKernel.Agents.OpenAI 1.65.0-preview
AI model: gpt-5
IDE: VS Code
OS: Linux container (Debian-based)

Additional context

OpenAI streams two distinct assistant messages:

{
  "role": "assistant",
  "content": "I’m going to pull the current list of profiles configured in this workspace so I can give you an accurate, up-to-date answer.",
  "tool_calls": [...]
}

followed by

{
  "role": "assistant",
  "content": "<p>Here’s what we’ve got right now — ...</p>"
}

After InvokeStreamingAsync completes, Semantic Kernel produces this chat history entry for the second assistant message:

{
  "role": "assistant",
  "content": "I’m going to pull the current list of profiles configured in this workspace so I can give you an accurate, up-to-date answer.<p>Here’s what we’ve got right now — ...</p>"
}

The text of the first assistant message (the one that carries the tool call) should not be prepended to the final assistant entry.
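For anyone trying to confirm the symptom in their own logs, here is a minimal sketch of a check for the merged entry. It uses plain strings only and no Semantic Kernel types; `StreamingHistoryCheck`, `HasLeakedPrefix`, and `StripLeakedPrefix` are hypothetical names made up for this illustration, and it assumes you have captured the pre-tool-call text separately (for example from the streamed chunks). It is a diagnostic and possible stopgap, not a fix for the underlying behavior.

```csharp
using System;

// Hypothetical helper, not part of Semantic Kernel or the OpenAI SDK.
public static class StreamingHistoryCheck
{
    // True when the persisted final message starts with the pre-tool-call text
    // and has additional content after it (the merged shape shown in this issue).
    public static bool HasLeakedPrefix(string preToolCallText, string persistedFinal) =>
        !string.IsNullOrEmpty(preToolCallText)
        && persistedFinal.Length > preToolCallText.Length
        && persistedFinal.StartsWith(preToolCallText, StringComparison.Ordinal);

    // Returns the persisted message with the leaked prefix removed,
    // or the message unchanged when no leaked prefix is present.
    public static string StripLeakedPrefix(string preToolCallText, string persistedFinal) =>
        HasLeakedPrefix(preToolCallText, persistedFinal)
            ? persistedFinal.Substring(preToolCallText.Length)
            : persistedFinal;
}
```

With the first assistant message from this issue as `preToolCallText` and the stored entry as `persistedFinal`, `HasLeakedPrefix` returns `true`, and `StripLeakedPrefix` returns only the part starting at `<p>`. The comparison is ordinal, so it only matches when the prefix is byte-for-byte identical to the earlier message.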

Oct 24 '25 01:10 tomdai