Bug: `extract` method with schema does not include schema properties in the request

Open ViktorTrojan opened this issue 4 months ago • 2 comments

Before submitting an issue, please:

  • [x] Check the documentation for relevant information
  • [x] Search existing issues to avoid duplicates

Environment Information

Please provide the following information to help us reproduce and resolve your issue:

Stagehand:

  • Language/SDK: TypeScript
  • Stagehand version: latest

AI Provider:

  • Provider: Custom OpenAI (via OpenAI-compatible endpoint)
  • Model: custom_openai_model

Issue Description

When using the extract method and specifying a schema using Zod, the outgoing request to the LLM contains a response_format object with a json_schema, but the schema property within it is nearly empty. It does not include the properties defined in the Zod schema that was passed to the method. It only contains {"$schema": "http://json-schema.org/draft-07/schema#"}.

Steps to Reproduce

  1. Configure Stagehand with a custom OpenAI client (like the one in the reproduction code below).
  2. Call page.extract() with an instruction and a Zod schema.
  3. Inspect the request body sent to the custom OpenAI endpoint.
  4. Notice that the json_schema.schema field is missing the properties defined in the Zod object.
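
One way to do step 3 without an external proxy is to wrap the fetch function the client uses and log each outgoing body. The following is a minimal sketch, not part of Stagehand: `withBodyLogging` and `onBody` are hypothetical names, and it assumes the request body arrives as a JSON string.

```typescript
// Hypothetical helper (not part of Stagehand or the OpenAI SDK):
// wraps any fetch-like function and reports each request body before sending.
function withBodyLogging<F extends (input: any, init?: any) => Promise<any>>(
  baseFetch: F,
  onBody: (body: unknown) => void,
): F {
  const wrapped = (input: any, init?: any) => {
    if (typeof init?.body === "string") {
      // Assumption: the body is a JSON string; fall back to the raw string otherwise.
      let parsed: unknown = init.body;
      try {
        parsed = JSON.parse(init.body);
      } catch {
        // Not JSON; keep the raw string.
      }
      onBody(parsed);
    }
    // Forward the call unchanged so the request still goes out.
    return baseFetch(input, init);
  };
  return wrapped as F;
}
```

The OpenAI Node SDK accepts a custom `fetch` in its client options, so something like `new OpenAI({ apiKey, baseURL, fetch: withBodyLogging(fetch, console.log) })` should print the body shown under "Error Messages / Log trace".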

Minimal Reproduction Code

import { Stagehand } from '@browserbase/stagehand';
import { z as z3 } from 'zod/v3';
import OpenAI from 'openai';

// NOTE: CustomOpenAIClient is not exported, as mentioned in #1043
// This is a simplified version for reproduction.
class CustomOpenAIClient { 
    constructor(config) { this.config = config; }
    // ... implementation details
}

const stagehand = new Stagehand({
  llmClient: new CustomOpenAIClient({
    modelName: "custom_openai_model",
    client: new OpenAI({
      apiKey: "not-needed-for-local-endpoint",
      baseURL: "http://localhost:1235/v1",
    }),
  }),
});

// Steps that reproduce the issue
// (`page` is assumed to come from the initialized Stagehand instance; setup omitted)
const result_job_extract = await page.extract({
  instruction: `
    user content here ...
  `,
  schema: z3.object({
    list_of_apartments: z3.array(
      z3.object({
        address: z3.string().describe("the address of the apartment"),
        price: z3.string().describe("the price of the apartment"),
        square_feet: z3.string().describe("the square footage of the apartment"),
      }),
    ),
  }),
});

Error Messages / Log trace

Here is the problematic request body sent to the LLM. The json_schema.schema field is missing the properties from the Zod schema.

{
  "messages": [
    {
      "role": "user",
      "content": "You are extracting content on behalf of a user. If a user asks you to extract a 'list' of information, or 'all' information, YOU MUST EXTRACT ALL OF THE INFORMATION THAT THE USER REQUESTS. You will be given: 1. An instruction 2. A list of DOM elements to extract from. Print the exact text from the DOM elements with all symbols, characters, and endlines as is. Print null or an empty string if no new information is found. If a user is attempting to extract links or URLs, you MUST respond with ONLY the IDs of the link elements. Do not attempt to extract links directly from the text unless absolutely necessary. "
    },
    {
      "role": "user",
      "content": "user content here ..."
    }
  ],
  "temperature": 0.1,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "model": "custom_openai_model",
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "Extraction",
      "strict": true,
      "schema": {
        "$schema": "http://json-schema.org/draft-07/schema#"
      }
    }
  },
  "stream": false
}
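
To catch this condition programmatically rather than by eye, a captured request body can be checked for a schema that carries nothing beyond `$schema`. This is only an illustrative sketch; `schemaLooksEmpty` is a hypothetical helper, not a Stagehand API.

```typescript
type JsonObject = Record<string, unknown>;

function isObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: returns true when a request asks for json_schema output
// but the embedded schema has no keys other than "$schema".
function schemaLooksEmpty(requestBody: unknown): boolean {
  if (!isObject(requestBody)) return false;
  const responseFormat = requestBody["response_format"];
  // Requests that do not use json_schema are out of scope for this check.
  if (!isObject(responseFormat) || responseFormat["type"] !== "json_schema") {
    return false;
  }
  const jsonSchema = responseFormat["json_schema"];
  if (!isObject(jsonSchema)) return true;
  const schema = jsonSchema["schema"];
  if (!isObject(schema)) return true;
  const meaningfulKeys = Object.keys(schema).filter((key) => key !== "$schema");
  return meaningfulKeys.length === 0;
}
```

For the body above this returns `true`. A correctly converted schema would carry at least `type` and `properties` (here, a `list_of_apartments` array) and would return `false`.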

Screenshots / Videos

N/A

Related Issues

Are there any related issues or PRs?

  • Related to: #1043

ViktorTrojan avatar Sep 01 '25 17:09 ViktorTrojan

Hey @ViktorTrojan, a workaround for this issue is to manually convert your Zod schema into JSON Schema before passing it into response_format. Stagehand’s OpenAIClient does this automatically, but custom clients currently don’t.

You can fix it by using zod-to-json-schema:

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const schema = z.object({
  topic: z.string(),
  keywords: z.array(z.string()),
});

const response_format = {
  type: "json_schema",
  json_schema: {
    name: "my_schema",
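    // Manual conversion. No name argument is passed: with a name, zod-to-json-schema
    // emits a $ref plus definitions instead of putting the properties at the root.
    schema: zodToJsonSchema(schema),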
  },
};

This ensures the schema properties are included correctly even when using a custom LLM client.

Also, Stagehand should ensure that schema conversion happens regardless of which LLM client is used, so that custom clients receive the same schema expansion as the default OpenAI client.

themaainuser avatar Sep 16 '25 05:09 themaainuser

Here’s a patched CustomOpenAIClient that mirrors the relevant parts of Stagehand’s OpenAIClient:

import OpenAI, { ClientOptions } from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import zodToJsonSchema from "zod-to-json-schema";

export class CustomOpenAIClient {
  private client: OpenAI;
  private modelName: string;

  constructor({
    modelName,
    clientOptions,
  }: {
    modelName: string;
    clientOptions: ClientOptions;
  }) {
    this.modelName = modelName;
    this.client = new OpenAI(clientOptions);
  }

  async createChatCompletion({
    messages,
    schema,
    name = "Extraction",
    temperature = 0.1,
    top_p = 1,
  }: {
    messages: { role: "system" | "user" | "assistant"; content: string }[];
    schema?: any; // Zod schema
    name?: string;
    temperature?: number;
    top_p?: number;
  }) {
    let responseFormat = undefined;

    if (schema) {
      if (this.modelName.startsWith("o1") || this.modelName.startsWith("o3")) {
        const parsedSchema = JSON.stringify(zodToJsonSchema(schema));
        messages.push({
          role: "user",
          content: `Respond in this zod schema format:\n${parsedSchema}\n
          Do not include any other text, markdown, or formatting. Only the JSON object.`,
        });
      } else {
        responseFormat = zodResponseFormat(schema, name);
      }
    }

    const body = {
      model: this.modelName,
      messages,
      temperature,
      top_p,
      stream: false,
      response_format: responseFormat,
    };

    const response = await this.client.chat.completions.create(body);
    return response;
  }
}

themaainuser avatar Sep 16 '25 05:09 themaainuser