flink-agents

[Feature] Support making use of the native capabilities of models for structured output.

Open wenjin272 opened this issue 3 months ago • 4 comments

Search before asking

  • [x] I searched in the issues and found nothing similar.

Description

Currently, when output_schema is configured, the ReAct agent prompts the model directly to use a specific format and uses an output parser to extract the structured response from the raw model output. This is the only option for models that don't support tool calling or JSON mode.
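To make the current approach concrete, here is a minimal sketch of prompt-plus-parser extraction. It is not the flink-agents implementation: `SCHEMA_HINT`, `build_prompt`, `parse_output`, and the sample reply are all hypothetical names and data chosen for illustration.

```python
import json
import re

# Hypothetical format hint; a real agent would derive it from output_schema.
SCHEMA_HINT = '{"city": <string>, "temperature_c": <number>}'


def build_prompt(question: str) -> str:
    """Append format instructions to the user's question."""
    return (
        f"{question}\n"
        "Respond with a single JSON object of the form:\n"
        f"{SCHEMA_HINT}"
    )


def parse_output(raw: str) -> dict:
    """Pull the outermost {...} span out of the raw text and decode it."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))


# Stand-in for a model reply that wraps the JSON in extra prose.
raw_reply = 'Sure, here it is: {"city": "Berlin", "temperature_c": 21.5} Anything else?'
prompt = build_prompt("What is the weather in Berlin?")
parsed = parse_output(raw_reply)
print(parsed["city"], parsed["temperature_c"])
```

The parser has to tolerate whatever text the model adds around the JSON, which is the fragility that native structured output or tool calling would avoid.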

For models that support tool calling or JSON mode, we can make use of these capabilities to generate structured output.

See https://python.langchain.com/docs/how_to/structured_output/ for details.

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

wenjin272, Oct 20 '25 09:10

@wenjin272 Hello, I have some questions about this requirement. Question: How should the capability differences between different models be handled?

Model Tool Calling JSON Mode Structured Output API
OpenAI GPT-4 ✅ (response_format)
Anthropic Claude ⚠️ (beta)
Ollama (Qwen/Llama)
Tongyi Qwen

Questions:

  • Is it necessary to implement a separate adaptation for each ChatModel?

  • Should capability detection happen at runtime or at configuration time?

  • What should the degradation strategy be (JSON mode failure → tool calling → prompt)?

Besides that, there is another problem: the semantics of using tool calling for structured output.

Question: Is it reasonable to use tool calling to implement structured output? It is technically feasible: define a virtual tool named "format_output":

{
    "name": "format_output",
    "description": "Format the final response",
    "parameters": OutputData.model_json_schema()
}

However, there are problems:

  • Semantic ambiguity: A tool should "perform an action," not "format output"

  • User experience: This "virtual tool" will appear in the tools list

  • Conflict with real tools: React Agent already has tool invocation capabilities
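One possible way to soften the last two concerns is to keep the virtual tool out of the user-facing tools list and to treat a call to it as the final answer instead of an action. The sketch below assumes this design; it is not an existing flink-agents API. `OUTPUT_SCHEMA` is a hand-written stand-in for `OutputData.model_json_schema()` so that only the standard library is needed, and `build_request_tools`, `handle_tool_call`, and the shape of the tool call dict are hypothetical.

```python
import json

# Hypothetical schema, standing in for OutputData.model_json_schema().
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer", "confidence"],
}

FORMAT_TOOL_NAME = "format_output"


def build_request_tools(user_tools: list[dict]) -> list[dict]:
    """Return the tools sent to the model: user tools plus the internal one.

    The user's own list is left unchanged, so the virtual tool never
    shows up in it.
    """
    format_tool = {
        "name": FORMAT_TOOL_NAME,
        "description": "Format the final response",
        "parameters": OUTPUT_SCHEMA,
    }
    return [*user_tools, format_tool]


def handle_tool_call(call: dict) -> tuple:
    """Route a tool call: the internal tool ends the loop, others are actions."""
    if call["name"] != FORMAT_TOOL_NAME:
        return ("action", call)
    args = call["arguments"]
    # Assumption: arguments may arrive as a JSON string or as a dict.
    data = json.loads(args) if isinstance(args, str) else args
    missing = [key for key in OUTPUT_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return ("final", data)


user_tools = [{"name": "search", "description": "Search the web", "parameters": {}}]
request_tools = build_request_tools(user_tools)
kind, payload = handle_tool_call(
    {"name": "format_output", "arguments": '{"answer": "42", "confidence": 0.9}'}
)
print(kind, payload)
```

This does not resolve the semantic objection that a tool should perform an action, but it keeps the workaround internal to the agent.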

kitalkuyo-gita, Oct 31 '25 09:10

You can refer to LangChain's structured output documentation: https://docs.langchain.com/oss/python/langchain/structured-output

Roughly speaking, the approaches to structured output can be ranked by performance as follows:

  1. Use the native structured output capabilities offered by the model provider; currently only OpenAI and Grok provide them.
  2. Use tool calling to achieve the same result.
  3. Use a prompt and examples to instruct the LLM.

Flink-Agents should provide all of these strategies.
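As a rough illustration of how the ranking above could drive strategy selection and degradation, here is a sketch that assumes capabilities are declared at configuration time as a set of strings. The capability names, `Strategy`, `pick_strategies`, and `run_with_fallback` are hypothetical and do not come from flink-agents.

```python
from enum import Enum


class Strategy(Enum):
    NATIVE = 1        # provider-side structured output
    TOOL_CALLING = 2  # schema passed as a tool definition
    PROMPT = 3        # format instructions plus an output parser


def pick_strategies(capabilities: set[str]) -> list[Strategy]:
    """Order the usable strategies from most to least preferred."""
    ordered = []
    if "native_structured_output" in capabilities:
        ordered.append(Strategy.NATIVE)
    if "tool_calling" in capabilities:
        ordered.append(Strategy.TOOL_CALLING)
    ordered.append(Strategy.PROMPT)  # works with any chat model
    return ordered


def run_with_fallback(strategies: list[Strategy], runners: dict) -> tuple:
    """Try each strategy in order and return the first successful result."""
    errors = []
    for strategy in strategies:
        try:
            return strategy, runners[strategy]()
        except Exception as exc:  # degrade to the next strategy
            errors.append((strategy, exc))
    raise RuntimeError(f"all strategies failed: {errors}")


def failing_tool_call() -> dict:
    """Stand-in for a tool-calling attempt that returns malformed output."""
    raise ValueError("malformed tool call")


# Stand-ins for real model invocations.
runners = {
    Strategy.TOOL_CALLING: failing_tool_call,
    Strategy.PROMPT: lambda: {"answer": "ok"},
}
chosen = pick_strategies({"tool_calling"})
used, result = run_with_fallback(chosen, runners)
print(used.name, result)
```

Declaring capabilities per chat model keeps detection cheap, while the fallback loop covers the case where a declared capability still fails at runtime.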

In addition, these questions read like rhetorical ones posed by an AI programming assistant. I have a positive attitude towards using AI for programming assistance, but developers should make sure they understand the problems themselves and ensure the quality of the generated code.

wenjin272, Nov 04 '25 13:11

Of course, for a new project like yours, an AI programming assistant is a good development partner for understanding the project. For the pull requests I submit, I will make sure they meet the requirements and pass testing.

kitalkuyo-gita, Nov 05 '25 06:11

Here are two documents we can refer to:

  1. https://github.com/langchain-ai/langchain/blob/v0.3/docs/docs/how_to/structured_output.ipynb
  2. https://github.com/langchain-ai/langgraph/blob/main/docs/docs/how-tos/react-agent-structured-output.ipynb

wenjin272, Nov 06 '25 06:11