Google Realtime API fails with MCP tools due to $schema field in parameters

Open muhammadkhalid-03 opened this issue 1 month ago • 2 comments

Bug Description

The `to_fnc_ctx` function in livekit/plugins/google/utils.py applies `_GeminiJsonSchema.simplify()` only to regular `FunctionTool` definitions, not to `RawFunctionTool` definitions (which MCP tools use).

For context, I have an MCP endpoint in a separate API service. The service defines several MCP tools with Zod schemas, which the MCP SDK automatically converts to JSON Schema when the tools are served. When my self-hosted LiveKit agent connects to the MCP server and forwards these tools to Google's Realtime API, Gemini rejects them because the raw tool definition still contains an unsupported `$schema` field that is not removed before the request is sent. As a result, the agent fails to speak.

Current MCP tool schema:

{
  'type': 'object',
  'properties': {...},
  'additionalProperties': False,
  '$schema': 'http://json-schema.org/draft-07/schema#'
}

Expected Behavior

Both `FunctionTool` and `RawFunctionTool` should have their schemas cleaned up to remove unsupported fields before they are passed on to Gemini. For example, the `$schema` field marked below should be stripped:

{
  'type': 'object',
  'properties': {...},
  'additionalProperties': False,
  '$schema': 'http://json-schema.org/draft-07/schema#'              <- This shouldn't be here
}
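
As a rough illustration of the expected result, the sketch below removes the `$schema` keyword from a parameter schema at every nesting level. `drop_schema_keyword` is a hypothetical helper written for this example (it is not part of livekit-agents or the Google plugin), the `city` property is made up, and the sketch assumes `$schema` is the only key that has to go.

```python
from typing import Any


def drop_schema_keyword(node: Any) -> Any:
    """Return a copy of a JSON Schema fragment without any "$schema" keys.

    Hypothetical helper for illustration only; not a livekit-agents API.
    Note: a tool property literally named "$schema" would be dropped as well.
    """
    if isinstance(node, dict):
        # Rebuild the dict, skipping "$schema" and cleaning every nested value.
        return {k: drop_schema_keyword(v) for k, v in node.items() if k != "$schema"}
    if isinstance(node, list):
        # Lists appear under keywords such as "anyOf" or "required".
        return [drop_schema_keyword(item) for item in node]
    return node


# Example input shaped like the MCP tool schema above (the property is invented).
raw_parameters = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "additionalProperties": False,
    "$schema": "http://json-schema.org/draft-07/schema#",
}

cleaned = drop_schema_keyword(raw_parameters)
```

Because the helper builds new containers instead of mutating its input, the original tool definition stays intact for any other consumer that still expects the `$schema` field.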

Reproduction Steps

1. Start a LiveKit session with a Google Realtime model (e.g. `google.realtime.RealtimeModel(model="gemini-2.5-flash-native-audio-preview-12-2025", voice="Kore")`) that has access to an MCP tool, which is exposed as a `RawFunctionTool`.
2. The agent fails immediately, before you can talk to it.

Operating System

macOS Sequoia 15.6.1

Models Used

gemini-2.5-flash-native-audio-preview-12-2025

Package Versions

livekit-agents=1.3.6

Session/Room/Call IDs

No response

Proposed Solution

Apply `_GeminiJsonSchema.simplify()` to `RawFunctionTool` parameters in the `else` branch of `to_fnc_ctx()`:

else:
    json_schema = _GeminiJsonSchema(info.raw_schema.get("parameters", {})).simplify()
    fnc_kwargs["parameters"] = types.Schema.model_validate(json_schema)

Additional Context

No response

Screenshots and Recordings

[Two images attached to the original issue; not reproduced here]

muhammadkhalid-03 avatar Dec 19 '25 10:12 muhammadkhalid-03

I am using this monkey patch to work around the issue until it's properly fixed:

"""Monkey patch for Google plugin to handle MCP tool schemas with $schema field.

This patch addresses the issue where MCP tools generated with Zod/TypeScript include
a `$schema` field in their parameter schemas, which causes a Pydantic validation error
when used with Gemini Realtime API.

Based on: https://github.com/livekit/agents/issues/2678
          https://github.com/livekit/agents/issues/2462

The error:
    pydantic_core._pydantic_core.ValidationError: 1 validation error for JSONSchema
    $schema
      Extra inputs are not permitted [type=extra_forbidden,
      input_value='http://json-schema.org/draft-07/schema#', input_type=str]

The patch:
1. Intercepts the `to_fnc_ctx` function in `livekit.plugins.google.utils`
2. Strips the `$schema` field from MCP tool parameters before validation
3. Allows MCP tools to work correctly with Gemini Realtime API
"""

import logging
from copy import deepcopy
from typing import Any

import livekit.plugins.google.utils as google_utils
from google.genai import types
from livekit.agents.llm.tool_context import (
    FunctionTool,
    RawFunctionTool,
    get_raw_function_info,
    is_function_tool,
    is_raw_function_tool,
)
from livekit.agents.types import NOT_GIVEN, NotGivenOr
from livekit.agents.utils import is_given

logger = logging.getLogger(__name__)

# Store original function
_original_to_fnc_ctx = google_utils.to_fnc_ctx


def _strip_schema_field(params: dict[str, Any]) -> dict[str, Any]:
    """Recursively strip $schema fields from a JSON schema dict.

    Also strips other common fields that cause validation issues with Gemini's JSONSchema.
    """
    if not isinstance(params, dict):
        return params

    # Create a shallow copy to avoid modifying the original
    result = dict(params)

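    # Remove fields that can fail validation at this level; the recursion below
    # applies the same removal to nested schemas. Dropping "$ref"/"$defs" without
    # inlining them discards whatever the reference pointed to.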
    fields_to_remove = [
        "$schema",
        "$id",
        "$ref",
        "$defs",
        "additionalProperties",
        "title",
        "default",
    ]
    for field in fields_to_remove:
        result.pop(field, None)

    # Recursively process nested structures
    if "properties" in result and isinstance(result["properties"], dict):
        result["properties"] = {
            k: _strip_schema_field(v) for k, v in result["properties"].items()
        }

    if "items" in result and isinstance(result["items"], dict):
        result["items"] = _strip_schema_field(result["items"])

    if "anyOf" in result and isinstance(result["anyOf"], list):
        result["anyOf"] = [_strip_schema_field(item) for item in result["anyOf"]]

    if "allOf" in result and isinstance(result["allOf"], list):
        result["allOf"] = [_strip_schema_field(item) for item in result["allOf"]]

    if "oneOf" in result and isinstance(result["oneOf"], list):
        result["oneOf"] = [_strip_schema_field(item) for item in result["oneOf"]]

    return result


def _patched_to_fnc_ctx(
    fncs: list[FunctionTool | RawFunctionTool],
    *,
    use_parameters_json_schema: bool = True,
    tool_behavior: NotGivenOr[types.Behavior] = NOT_GIVEN,
) -> list[types.FunctionDeclaration]:
    """Patched to_fnc_ctx that strips $schema from MCP tool parameters.

    This fixes the issue where MCP tools generated with Zod/TypeScript include
    a `$schema` field that Gemini's JSONSchema Pydantic model doesn't accept.
    """
    from livekit.plugins.google.utils import _build_gemini_fnc

    tools: list[types.FunctionDeclaration] = []
    for fnc in fncs:
        if is_raw_function_tool(fnc):
            info = get_raw_function_info(fnc)
            fnc_kwargs = {
                "name": info.name,
                "description": info.raw_schema.get("description", ""),
            }

            # Get the parameters schema
            raw_params = info.raw_schema.get("parameters", {})

            if use_parameters_json_schema:
                # For non-realtime: use parameters_json_schema (more permissive)
                fnc_kwargs["parameters_json_schema"] = raw_params
            else:
                # For realtime API: need to use types.Schema.from_json_schema
                # which requires strict JSONSchema validation
                # *** THIS IS THE FIX: Strip $schema before validation ***
                cleaned_params = _strip_schema_field(deepcopy(raw_params))

                if cleaned_params:
                    try:
                        json_schema = types.JSONSchema.model_validate(cleaned_params)
                        fnc_kwargs["parameters"] = types.Schema.from_json_schema(
                            json_schema=json_schema
                        )
                    except Exception as e:
                        logger.warning(
                            f"Failed to parse parameters for tool '{info.name}': {e}. "
                            f"Original params: {raw_params}. Cleaned params: {cleaned_params}"
                        )
                        # Skip this tool if we can't parse its parameters
                        continue

            if is_given(tool_behavior):
                fnc_kwargs["behavior"] = tool_behavior
            tools.append(types.FunctionDeclaration(**fnc_kwargs))

        elif is_function_tool(fnc):
            tools.append(_build_gemini_fnc(fnc, tool_behavior=tool_behavior))

    return tools


def apply_google_mcp_schema_patch():
    """Apply the monkey patch to handle MCP tool schemas with $schema field.

    Call this before creating any Google Realtime sessions with MCP tools.

    Note: We must patch both:
    1. livekit.plugins.google.utils.to_fnc_ctx (the source)
    2. livekit.plugins.google.realtime.realtime_api.to_fnc_ctx (the imported reference)

    Because Python's `from X import Y` creates a new reference in the importing module.
    """
    import livekit.plugins.google.realtime.realtime_api as realtime_api

    # Patch the original module
    google_utils.to_fnc_ctx = _patched_to_fnc_ctx

    # Patch the imported reference in realtime_api
    realtime_api.to_fnc_ctx = _patched_to_fnc_ctx

    logger.info(
        "✅ Applied Google MCP schema patch for $schema field handling (utils + realtime_api)"
    )


def remove_google_mcp_schema_patch():
    """Remove the monkey patch and restore original behavior."""
    import livekit.plugins.google.realtime.realtime_api as realtime_api

    google_utils.to_fnc_ctx = _original_to_fnc_ctx
    realtime_api.to_fnc_ctx = _original_to_fnc_ctx

    logger.info("🔄 Removed Google MCP schema patch, restored original behavior")

Call `apply_google_mcp_schema_patch()` at the top of your agent, before creating any Google Realtime session that uses MCP tools.
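
The note in the patch about rebinding `to_fnc_ctx` in both modules can be illustrated with a standalone sketch. It uses two throwaway in-memory modules, `demo_utils` and `demo_consumer`, which are hypothetical stand-ins for `livekit.plugins.google.utils` and `livekit.plugins.google.realtime.realtime_api`. Nothing here touches the real plugin.

```python
import sys
import types

# Hypothetical stand-in for the module that defines the function.
utils = types.ModuleType("demo_utils")
exec("def to_fnc_ctx():\n    return 'original'\n", utils.__dict__)
sys.modules["demo_utils"] = utils

# Hypothetical stand-in for a module that does `from demo_utils import to_fnc_ctx`.
consumer = types.ModuleType("demo_consumer")
exec(
    "from demo_utils import to_fnc_ctx\n"
    "def call():\n"
    "    return to_fnc_ctx()\n",
    consumer.__dict__,
)
sys.modules["demo_consumer"] = consumer


def patched():
    return "patched"


# Patching only the defining module leaves the consumer's own name untouched,
# because `from X import Y` copied the reference into the consumer's namespace.
utils.to_fnc_ctx = patched
result_after_source_patch = consumer.call()

# Rebinding the name in the consumer module as well makes the patch take effect.
consumer.to_fnc_ctx = patched
result_after_both_patches = consumer.call()

# Clean up the throwaway modules.
del sys.modules["demo_utils"], sys.modules["demo_consumer"]
```

After the first assignment `consumer.call()` still reaches the original function; only the second assignment changes what the consumer calls, which is why the patch above rebinds the name in both places.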

csanz91 avatar Dec 22 '25 15:12 csanz91

We switched to always using `parameters_json_schema` in https://github.com/livekit/agents/pull/4344, which should fix the issue. It will be included in the next release.

longcw avatar Dec 23 '25 01:12 longcw