[FEATURE] Improved pydantic_output task if fields descriptions are used

Open Saicheg opened this issue 11 months ago • 4 comments

Feature Area

Core functionality

Is your feature request related to an existing bug? Please link it here.

Currently, if you use pydantic_output for a task, it will include instructions to the LLM like this:

Schema:

from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int
    max_words: int
    reason: str

Instructions:

Ensure your final answer contains only the content in the following format: {
  "min_words": int,
  "max_words": int,
  "reason": str
}

However, if I try to add some details to pydantic_output, similar to what we do for tools with args_schema, like this:

Schema with descriptions:

from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason this decision is made.")

It will still have the same instructions as listed above.
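For reference, Pydantic already exposes these descriptions through its generated JSON schema, so the information is available at instruction-building time. A minimal sketch using plain Pydantic (v2 API), independent of crewAI:

```python
from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason this decision is made.")

# model_json_schema() includes each field's description alongside its type
schema = ResponseLimit.model_json_schema()
print(schema["properties"]["reason"]["description"])
# -> Reason this decision is made.
```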

Describe the solution you'd like

I would love for the task to detect whether Field descriptions are assigned and include them as a JSON schema. For example, given:

from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason this decision is made")

it would include instructions like this:

Ensure your final answer follows this JSON schema: {
  "min_words": {'description': 'Minimum number of words', 'type': 'int'},
  "max_words": {'description': 'Maximum number of words', 'type': 'int'},
  "reason": {'description': 'Reason this decision is made', 'type': 'str'},
}
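One way to produce that enriched instruction text would be to walk the model's fields and collect each description and annotated type. This is a hypothetical sketch (the helper name `schema_instructions` and the exact output wording are my own, not crewAI's implementation), using Pydantic v2's `model_fields`:

```python
import json
from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason this decision is made")

def schema_instructions(model: type[BaseModel]) -> str:
    """Hypothetical helper: build enriched LLM instructions from each
    field's description and annotated type (Pydantic v2 FieldInfo)."""
    fields = {
        name: {
            "description": info.description or "",
            "type": getattr(info.annotation, "__name__", str(info.annotation)),
        }
        for name, info in model.model_fields.items()
    }
    return "Ensure your final answer follows this JSON schema: " + json.dumps(fields, indent=2)

print(schema_instructions(ResponseLimit))
```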

Describe alternatives you've considered

Of course, you can always describe each field manually in expected_output, but it would be nice not to have to do that and have it taken from the pydantic output as well.

Additional context

No response

Willingness to Contribute

Yes, I'd be happy to submit a pull request

Saicheg avatar Feb 21 '25 11:02 Saicheg

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Mar 23 '25 12:03 github-actions[bot]

Still waiting for someone from core team to take a look!

Saicheg avatar Mar 23 '25 13:03 Saicheg

@Saicheg this is interesting! Do you have any use case where this change would bring a clear and strong benefit?

lucasgomide avatar Apr 15 '25 20:04 lucasgomide

@Saicheg this is interesting! Do you have any use cases where this change would bring a clear and strong benefit?

It's been a while since I first proposed this. But I guess we can expand on an example I listed initially.

One part of the flow I currently have in production is an attempt to limit the response to be comfortable enough for the user. For that, I have some flow and expect the following pydantic_output:

from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int
    max_words: int
    reason: str

You can notice the reason field here. My team uses this field in traces to enhance instructions and algorithms weekly. But sometimes the LLM just returns an enormous reason (3-5 sentences), which affects response time and token cost. Ideally, I want 1-2 sentences for the reason field, like this:

from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason behind decision made. 1-2 sentences.")
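Under the proposed behavior, that "1-2 sentences" hint would reach the LLM automatically, since it is already part of the schema Pydantic exports (a small demonstration with plain Pydantic, not crewAI's actual prompt-building code):

```python
from pydantic import BaseModel, Field

class ResponseLimit(BaseModel):
    min_words: int = Field(..., description="Minimum number of words")
    max_words: int = Field(..., description="Maximum number of words")
    reason: str = Field(..., description="Reason behind decision made. 1-2 sentences.")

# The length hint lives in the exported schema, so an instruction
# builder could surface it in the prompt without touching expected_output.
props = ResponseLimit.model_json_schema()["properties"]
print(props["reason"]["description"])
# -> Reason behind decision made. 1-2 sentences.
```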

Saicheg avatar Apr 17 '25 08:04 Saicheg

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar May 17 '25 12:05 github-actions[bot]

Still waiting for feedback from core team

Saicheg avatar May 19 '25 09:05 Saicheg

Not from the core team, but I feel adding additional context like this might make it more confusing for the LLM.

Here is one issue you can reference: https://github.com/crewAIInc/crewAI/issues/2826#issuecomment-2879735803

TL;DR: The same idea is used in args_schema in the tool class, which in fact sometimes makes the LLM hallucinate and provide an input dictionary with a description key.

I do feel that providing the description would be helpful, but it needs to be executed in a different way.

Vidit-Ostwal avatar May 19 '25 12:05 Vidit-Ostwal

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Jun 19 '25 12:06 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Jun 24 '25 12:06 github-actions[bot]