Add task decomposition feature (Issue #2717)
Task Decomposition Feature
This PR implements task decomposition as requested in Issue #2717. It allows complex tasks to be automatically split into sub-tasks without manual intervention.
Changes
- Added `parent_task` and `sub_tasks` fields to the `Task` class
- Implemented a `decompose()` method to create sub-tasks
- Added a `combine_sub_task_results()` method to aggregate results
- Updated `execute_sync()` to handle sub-task execution
- Added `execute_sub_tasks_async()` for asynchronous execution
- Created tests for the task decomposition functionality
- Added an example script demonstrating usage
Example Usage
```python
# Create a complex task
research_task = Task(
    description="Research the impact of AI on various industries",
    expected_output="A comprehensive report covering multiple industries",
    agent=researcher,
)

# Decompose the task into sub-tasks
sub_tasks = research_task.decompose(
    descriptions=[
        "Research AI impact on healthcare industry",
        "Research AI impact on finance industry",
        "Research AI impact on education industry",
    ],
    expected_outputs=[
        "A report on AI in healthcare",
        "A report on AI in finance",
        "A report on AI in education",
    ],
    names=["Healthcare", "Finance", "Education"],
)
```
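For readers unfamiliar with the pattern, the decompose-then-combine flow can be sketched framework-independently as follows (plain Python with a hypothetical `SimpleTask` stand-in; CrewAI's actual `Task` implementation differs):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SimpleTask:
    """Minimal stand-in for a decomposable task (illustrative only)."""
    description: str
    sub_tasks: list["SimpleTask"] = field(default_factory=list)
    output: Optional[str] = None

    def decompose(self, descriptions: list[str]) -> list["SimpleTask"]:
        # Each description becomes a child task tracked by this parent.
        self.sub_tasks = [SimpleTask(description=d) for d in descriptions]
        return self.sub_tasks

    def combine_sub_task_results(self) -> str:
        if not self.sub_tasks:
            raise ValueError("No sub-tasks to combine.")
        # Naive aggregation: join each child's output in order.
        return "\n".join(t.output or "" for t in self.sub_tasks)

parent = SimpleTask("Research AI impact")
for sub in parent.decompose(["healthcare", "finance"]):
    sub.output = f"report on {sub.description}"
parent.output = parent.combine_sub_task_results()
print(parent.output)
```

In CrewAI the aggregation step is agent-driven rather than a plain string join, but the parent/child bookkeeping follows the same shape.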
Testing
All tests pass, including the new tests for task decomposition.
Fixes #2717
Link to Devin run: https://app.devin.ai/sessions/86d77279c79a4137b7d7b6509cec79ca
Requested by: Joe Moura ([email protected])
🤖 Devin AI Engineer
I'll be helping with this pull request! Here's what you should know:
✅ I will automatically:
- Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
- Look at CI failures and help fix them
Note: I can only respond to comments from users who have write access to this repository.
⚙️ Control Options:
- [ ] Disable automatic comment and CI monitoring
Disclaimer: This review was made by a crew of AI Agents.
Code Review for PR #2718 — Add Task Decomposition Feature
Summary of Key Findings
This PR adds a significant new feature to the CrewAI framework: automatic decomposition of complex tasks into multiple sub-tasks, with synchronous and asynchronous execution and result aggregation capabilities. The implementation is thoughtfully designed, leveraging Pydantic models with clear type hints and Google-style docstrings. The feature integrates naturally into existing task execution workflows, extending `Task` with `parent_task` and `sub_tasks` fields and new methods (`decompose`, `combine_sub_task_results`, and `execute_sub_tasks_async`).
Additionally, a new example demonstrates practical usage, and a comprehensive pytest-based test suite covers the core scenarios including structure, execution, aggregation, validation, and async execution.
Detailed Feedback and Improvement Suggestions
1. examples/task_decomposition_example.py

- Strengths:
  - Clear, concise demonstration of the new feature.
  - Proper separation of concerns with Agent, Task, and Crew usage.
  - Good use of standard Python best practices and PEP8 compliance.
- Improvements:
  - Remove the extraneous commented blank lines at the end of the file (lines 46-48).
  - Enhance result inspection by printing sub-task outputs for clarity, e.g.:

    ```python
    print("Final result:", result)
    for i, sub in enumerate(research_task.sub_tasks):
        print(f"Sub-task {i+1} result: {sub.output if hasattr(sub, 'output') else 'No output'}")
    ```

  - Augment the docstring to note the CrewAI version in which this feature was introduced, to aid user reference.
2. src/crewai/task.py

- Strengths:
  - Modular, readable code supporting decomposition cleanly.
  - Good input validation in `decompose`.
  - Clear docstrings and use of type hints.
  - Proper exclusion of recursive fields from copying to prevent cycles.
  - Thoughtful handling of synchronous and asynchronous sub-task execution.
- Critical and Major Recommendations:
  - Forward Reference Handling:
    - The `parent_task` and `sub_tasks` fields use `'Task'` as a string for self-reference.
    - To avoid runtime errors or schema issues in Pydantic v2, add `Task.model_rebuild()` at the end of the file; this call ensures the forward references are properly resolved.
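To illustrate why this matters, here is a minimal self-referencing Pydantic v2 model (a hypothetical `Node` class, not CrewAI code). Pydantic often resolves module-level self-references automatically, but an explicit `model_rebuild()` guards against cases where resolution is deferred:

```python
from typing import Optional
from pydantic import BaseModel

class Node(BaseModel):
    # Self-referencing fields use the string form "Node" (a forward reference).
    name: str
    parent: Optional["Node"] = None
    children: list["Node"] = []

# Explicitly resolve the forward references; safe to call even when Pydantic
# has already resolved them, and required when resolution was deferred.
Node.model_rebuild()

root = Node(name="root", children=[Node(name="child")])
print(root.children[0].name)
```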
  - Output Setting After Combining Sub-Task Results:
    - Currently, in `execute_sync()`, after executing sub-tasks and combining results, the method calls `_execute_core()` but does not set `self.output`. This may cause re-execution or loss of the combined result.
    - Change the logic to set `self.output` directly with the combined result, e.g.:

      ```python
      if self.sub_tasks and not self.output:
          for sub_task in self.sub_tasks:
              sub_task.execute_sync(
                  agent=sub_task.agent or agent,
                  context=context,
                  tools=sub_task.tools or tools or [],
              )
          result = self.combine_sub_task_results()
          self.output = TaskOutput(raw=result)
          return self.output
      ```
  - Improve Validation Error Messages:
    - Include the actual lengths in `ValueError` messages in `decompose` for easier debugging:

      ```python
      raise ValueError(
          f"If provided, expected_outputs must have the same length as descriptions. "
          f"Got {len(expected_outputs)} and {len(descriptions)}."
      )
      ```
  - Clarify `combine_sub_task_results` Behavior and Dependencies:
    - The method currently assumes the agent supports prompt-based summarization.
    - Add a docstring notice that aggregation requires an agent capable of coherent text summarization and is designed for stateless prompt execution.
    - Consider adding capability checks or allowing an override/custom aggregation strategy.
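One possible shape for such an override, sketched as a standalone helper (a hypothetical `combine_results` function, not CrewAI's API): accept an optional aggregation callable and fall back to a default when none is given.

```python
from typing import Callable, Optional

def combine_results(
    results: list[str],
    strategy: Optional[Callable[[list[str]], str]] = None,
) -> str:
    """Combine sub-task outputs; callers may inject a custom strategy."""
    if not results:
        raise ValueError("No sub-task results to combine.")
    if strategy is not None:
        return strategy(results)
    # Default: simple concatenation. A real implementation might instead
    # prompt an agent to summarize the pieces into a coherent report.
    return "\n\n".join(results)

# Default behavior: concatenate the pieces.
joined = combine_results(["section a", "section b"])
# Custom strategy: e.g. keep only the longest sub-result.
longest = combine_results(["short", "much longer text"],
                          strategy=lambda rs: max(rs, key=len))
print(joined)
print(longest)
```

Injecting the strategy keeps the default prompt-based path intact while letting callers bypass the agent entirely for cheap, deterministic aggregation.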
  - Expand Docstrings with Raises and Side Effects:
    - Document exceptions such as `ValueError` in all relevant methods.
    - Note side effects such as the modification of `self.sub_tasks` in `decompose` and of `self.output` in `execute_sync`.
  - Async Execution Aggregation (Future Work):
    - `execute_sub_tasks_async()` returns futures, but there is currently no method to await and aggregate the async results.
    - Suggest implementing, or documenting as planned, async aggregation for compositional task results.
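One way the missing aggregation step could look, sketched with `asyncio.gather` (hypothetical helpers, not part of this PR):

```python
import asyncio

async def run_sub_task(description: str) -> str:
    """Stand-in for an async sub-task execution (illustrative)."""
    await asyncio.sleep(0)  # yield control, as real agent calls would
    return f"result for {description}"

async def execute_and_combine(descriptions: list[str]) -> str:
    # Await all sub-task coroutines concurrently; gather preserves the
    # input order, so the combined output is deterministic.
    results = await asyncio.gather(*(run_sub_task(d) for d in descriptions))
    return "\n".join(results)

combined = asyncio.run(execute_and_combine(["healthcare", "finance"]))
print(combined)
```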
  - Context Field Management:
    - Linking sub-tasks to the parent via `context=[self]` aids traceability but can form deep object graphs.
    - Clearly document this approach and warn about the recursion risk.
  - Minor Style Improvements:
    - Prefer consistent blank lines between methods, following PEP8.
    - Add docstrings to private (underscored) methods.
3. tests/test_task_decomposition.py

- Strengths:
  - Well-structured tests using `pytest` and mocks.
  - Coverage of the core flows: parent-child linkage, sync execution, combining results, validation failures, and async task execution.
  - Good isolation of agent execution via mocking.
- Improvement Suggestions:
  - Add tests for:
    - Calling `combine_sub_task_results` with no sub-tasks or missing outputs, verifying the expected `ValueError`.
    - Confirming that after execution, `parent_task.output` is properly set with the combined results.
    - Verifying that deep cloning does not lead to shared mutable state or context leaks.
  - Harmonize the use of `pytest.raises` as a context manager consistently.
  - Assert the actual output contents in mocks for stronger validation, e.g.:

    ```python
    for sub_task in parent_task.sub_tasks:
        assert getattr(sub_task.output, "raw", None) == "Mock result"
    ```
Historical Context and Related Practices
- This approach builds on CrewAI's synchronous and asynchronous task execution foundations.
- Similar patterns of task hierarchy and result composition have been considered in earlier PRs but never at this integrated level.
- The new example and tests reflect evolving standards in CrewAI for modularity and agent orchestration.
- The use of prompt-based aggregation leverages language model capabilities consistently with CrewAI's agent design philosophy.
- Attention to copy method exclusion of recursive fields aligns with prior lessons on circular ref issues in Pydantic models.
Final Summary and Recommendations
This PR is a valuable and well-engineered enhancement that enables hierarchical task management in CrewAI, crucial for handling complex workflows. The code quality is high and the documentation mostly solid, with minor gaps in edge-case handling and forward-reference management.
Please incorporate the following before merging:

- Add `Task.model_rebuild()` at the end of `task.py` to fix ForwardRef issues.
- Modify `execute_sync()` to set `self.output` after combining sub-task results instead of re-executing.
- Improve validation error messages to include the actual lengths.
- Enhance docstrings to document exceptions and side effects clearly.
- Clean up the commented-out blank lines in the example script.
- Add tests for edge cases (empty sub-tasks, output setting, async aggregation).
- Document context usage and the assumptions around agent-based result combining.
- Consider a roadmap for async sub-task result aggregation.
With these changes, this PR will provide a robust, maintainable, and user-friendly feature that fits naturally into the CrewAI ecosystem.
Great work and thank you for this comprehensive and meaningful contribution!
If you want, I can also provide code snippets for these recommended changes or help with additional testing or documentation enhancements.
Closing due to inactivity for more than 7 days.