
.Net: Bug: Function calling fails when Gemini returns function call as non-first part

Open silmon27 opened this issue 9 months ago • 1 comment

Describe the bug

When using the Google connector in .NET Semantic Kernel with Gemini models (e.g., gemini-2.0-flash), function calling does not work reliably. If the model returns a function call in any part of its response other than the first, Semantic Kernel ignores it, because only the first part is read. As a result, auto function calling breaks with models that frequently return multiple parts (e.g., text + function call).

To Reproduce

Steps to reproduce the behavior:

  1. Use Semantic Kernel with the Google connector and a recent Gemini model (e.g., gemini-2.0-flash).
  2. Prompt the model in a way that triggers both text and function call responses (very common).
  3. Observe that only the first returned part is handled; if it is not a function call, the function is never invoked.
  4. Auto function call behavior does not work if the function call is in any part other than the first.

Expected behavior

Semantic Kernel should handle any function call returned by Gemini, regardless of which part of the response it appears in, not just the first part.

Platform

  • Language: C#
  • Source: 1.47.0-alpha
  • AI model: gemini-2.0-flash

Additional context

https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Connectors/Connectors.Google/Core/Gemini/Clients/GeminiChatCompletionClient.cs#L607

private GeminiChatMessageContent GetChatMessageContentFromCandidate(GeminiResponse geminiResponse, GeminiResponseCandidate candidate)
{
    GeminiPart? part = candidate.Content?.Parts?[0];
    GeminiPart.FunctionCallPart[]? toolCalls = part?.FunctionCall is { } function ? [function] : null;
    return new GeminiChatMessageContent(
        role: candidate.Content?.Role ?? AuthorRole.Assistant,
        content: part?.Text ?? string.Empty,
        modelId: this._modelId,
        functionsToolCalls: toolCalls,
        metadata: GetResponseMetadata(geminiResponse, candidate));
}

This considers only the first part of the response for function calling. However, Gemini models often return multiple parts (e.g., text + function call). If the function call is not the first part, it is ignored completely.
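
To illustrate the expected behavior, here is a minimal, self-contained sketch of scanning every part instead of only index 0. It is not the connector's code and not necessarily what a fix should look like: `Part`, `FunctionCallPart`, `CandidateParts`, `CollectFunctionCalls`, and `CollectText` are hypothetical stand-ins made up for this example, and concatenating the text of all parts is just one possible choice.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration only; NOT the real Semantic Kernel types.
public sealed record FunctionCallPart(string FunctionName);
public sealed record Part(string? Text = null, FunctionCallPart? FunctionCall = null);

public static class CandidateParts
{
    // Gather the function calls from every part, not just parts[0].
    // Returns null when no part carries a function call.
    public static FunctionCallPart[]? CollectFunctionCalls(IReadOnlyList<Part>? parts)
    {
        FunctionCallPart[] calls = (parts ?? Array.Empty<Part>())
            .Select(p => p.FunctionCall)
            .OfType<FunctionCallPart>() // drops parts without a function call (nulls)
            .ToArray();
        return calls.Length > 0 ? calls : null;
    }

    // Assumption: the text of all parts is simply concatenated in order.
    public static string CollectText(IReadOnlyList<Part>? parts) =>
        string.Concat((parts ?? Array.Empty<Part>()).Select(p => p.Text ?? string.Empty));
}
```

With a response shaped like `[ new Part(Text: "Let me check."), new Part(FunctionCall: new FunctionCallPart("get_weather")) ]`, `CollectFunctionCalls` returns the `get_weather` call even though it is the second part, whereas reading only `Parts[0]` would find no function call at all.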

silmon27, Apr 19 '25 20:04

@markwallace-microsoft - I need this feature so I added a proposed fix. Hopefully this is helpful. https://github.com/microsoft/semantic-kernel/pull/11664

blackwire, Apr 21 '25 13:04