Supporting structured JSON output for Google AI

Open vkryukov opened this issue 1 year ago • 5 comments

Looks like Google supports structured JSON output by supplying a schema through model configuration:

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
      "parts":[
        {"text": "List 5 popular cookie recipes"}
      ]
    }],
    "generationConfig": {
        "response_mime_type": "application/json",
        "response_schema": {
          "type": "ARRAY",
          "items": {
            "type": "OBJECT",
            "properties": {
              "recipe_name": {"type":"STRING"}
            }
          }
        }
    }
}' 2> /dev/null | head
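
For discussion, here is a rough Python sketch of the two pieces a client wrapper would need: building the request body from the curl call above, and decoding the JSON text that comes back. The helper names (`build_structured_request`, `parse_structured_response`) are made up for illustration, the response shape (`candidates[0].content.parts[0].text`) is an assumption about the usual generateContent reply, and the sample response is hand-written rather than real model output. No network call is made.

```python
import json

# Schema from the curl example above: an array of objects with a recipe_name string.
RECIPE_SCHEMA = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {"recipe_name": {"type": "STRING"}},
    },
}


def build_structured_request(prompt, schema):
    """Build a generateContent request body that asks for JSON matching `schema`.

    Hypothetical helper; the field names mirror the curl example.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "response_mime_type": "application/json",
            "response_schema": schema,
        },
    }


def parse_structured_response(response):
    """Decode the JSON text carried in the first candidate's first part.

    Assumes the reply looks like candidates[0].content.parts[0].text.
    """
    text = response["candidates"][0]["content"]["parts"][0]["text"]
    return json.loads(text)


body = build_structured_request("List 5 popular cookie recipes", RECIPE_SCHEMA)
payload = json.dumps(body)  # strict JSON string, ready to POST to the endpoint

# Hand-written stand-in for an API reply (not real model output).
sample_response = {
    "candidates": [
        {"content": {"parts": [{"text": '[{"recipe_name": "Example Cookie"}]'}]}}
    ]
}
recipes = parse_structured_response(sample_response)
```

The point of splitting it this way is that the schema travels inside `generationConfig`, while the structured result still arrives as a JSON string inside an ordinary text part, so the caller has to decode it a second time.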

Would you be interested in a PR that adds this ability, similar to the ChatOpenAI implementation (which, by the way, should probably be documented)?

vkryukov avatar Dec 09 '24 16:12 vkryukov

Hi @vkryukov! Yes, I'd love help implementing Google Gemini support for structured output with a schema declaration. :smile:

brainlid avatar Dec 12 '24 22:12 brainlid

Sounds good, I will try to put together a PR as a starting point for discussion.

Google's models are improving fast (both gemini-2.0-flash-exp and gemini-exp-1206 are pretty impressive), and they offer massive context windows, so I want to keep testing them. That means we need first-class support for them in LangChain.

vkryukov avatar Dec 12 '24 23:12 vkryukov

That's cool. I've been so underwhelmed and disappointed by Google's past models that I wrote them off for personal use. Glad to hear they are improving!

brainlid avatar Dec 12 '24 23:12 brainlid

Yeah, I can relate to that. I plan to post a comparison (eventually); for my use case (code generation), gemini-exp-1206 is not bad! But of course, Sonnet 3.5 New still seems to be the SOTA.

vkryukov avatar Dec 12 '24 23:12 vkryukov

Seconding that this functionality would be very useful.

It looks like Ash AI (which uses this project) has a StructuredOutput adapter (https://hexdocs.pm/ash_ai/AshAi.Actions.Prompt.Adapter.StructuredOutput.html) that only supports OpenAI for now, but first-class support for Gemini structured output (https://ai.google.dev/gemini-api/docs/structured-output) in LangChain would likely change that.

jjmilburn avatar Aug 30 '25 10:08 jjmilburn