stagehand
Allow multimodal input to commands
Feature request: add an image property to the extract and act commands that accepts either a base64 string or a file path:
const price = await stagehand.extract({
  instruction: "extract the price for the watch that looks most similar to the image",
  // Proposed property. `referenceImage` is a placeholder for a base64 string or a file path.
  image: referenceImage,
  schema: z.object({ ... }),
})
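
Since the proposed image property would take either a base64 string or a file path, the library would need to normalize both forms before sending the image to the model. A minimal sketch of one way to do that is below. The helper name `resolveImageInput` and the data-URL handling are assumptions made for illustration and are not part of stagehand's API.

```typescript
import { readFileSync, statSync } from "node:fs";

// Returns true only when `path` points at a regular file.
function isFile(path: string): boolean {
  try {
    return statSync(path).isFile();
  } catch {
    return false; // missing path, permission error, over-long name, etc.
  }
}

// Hypothetical helper (not an existing stagehand function).
// Normalizes the proposed `image` value into a bare base64 string.
function resolveImageInput(image: string): string {
  // Data URL (assumed extra convenience): keep only the payload after the comma.
  if (image.startsWith("data:")) {
    return image.slice(image.indexOf(",") + 1);
  }
  // File path: read the bytes and base64-encode them.
  if (isFile(image)) {
    return readFileSync(image).toString("base64");
  }
  // Otherwise assume the caller already passed base64.
  return image;
}
```

One design caveat: a single string type makes the two cases ambiguous, because a base64 string could in principle match a file name. A tagged shape such as `{ path: string } | { base64: string }` would avoid the guesswork, at the cost of a slightly more verbose call.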