[FEATURE] Support client-initiated real-time human-input event stream (WebSocket/SSE/long-polling) for pending human input
Feature Area
Other (please specify in additional context)
Is your feature request related to an existing bug? Please link it here.
Not a bug, builds on prior feature discussions that were closed as “not planned” (#654, #2051) but reframes the problem around offering alternative integration methods for human input delivery.
Describe the solution you'd like
Background / Problem
CrewAI currently signals that it needs human input (i.e., enters “Pending Human Input”) only via externally delivered webhooks. That creates integration friction in scenarios where hosting a publicly reachable webhook endpoint is hard or undesirable (local dev behind NAT, locked-down security environments, real-time stacks already using persistent connections, etc.). The goal here is not a “fallback” to something else; it is to offer another integration method: client-initiated, real-time delivery of the human-input pause event, so consumers can subscribe without exposing endpoints externally.
Proposal
Introduce an optional, client-initiated event stream (e.g., WebSocket, Server-Sent Events, or long-polling) for receiving “pending_human_input” notifications for a given crew/execution. This would sit alongside (not necessarily replace) the webhook mechanism and give integrators flexibility in how they receive clarification requests from the agent.
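To make the SSE option concrete, here is a minimal client-side sketch of parsing such a stream. The field names `id`, `event`, and `data` come from the Server-Sent Events format itself. Carrying the proposed event_id in the SSE `id` field and sending the payload as JSON are my assumptions, and `parse_sse` is a hypothetical helper, not an existing CrewAI API.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE frame; a blank line terminates a frame."""
    event_id = None          # the last seen id persists across frames
    event_type = "message"   # default event type in the SSE format
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # End of frame: dispatch only if some data was received.
            if data:
                yield {
                    "id": event_id,
                    "event": event_type,
                    # Assumption: the payload is a JSON object.
                    "data": json.loads("\n".join(data)),
                }
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)
```

Any iterable of text lines works as input, so the same function can be fed from an HTTP streaming response or from a recorded fixture in tests.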
Key capabilities
- Authenticated subscription per crew/execution using existing API credentials.
- Real-time delivery of human-input pause events over a persistent channel (WebSocket/SSE) or efficient poll-style fallback (long-polling) when a persistent connection isn’t feasible.
- Structured event payload including:
  - event: e.g., "pending_human_input"
  - execution_id
  - crew_id
  - task_id
  - prompt / clarification question
  - context / relevant metadata
  - reason_flags (why input was requested)
  - event_id (deduplication)
  - timestamp
- Reconnect & resume semantics: clients can recover missed events using the last-seen event_id.
- Ordering/dedupe support so integrations can safely handle retries or duplicate deliveries.
- Lightweight handshake example (WebSocket):
  // Client opens:
  wss://api.crewai.com/v1/crews/{crew_id}/human-input-stream
  Authorization: Bearer <token>
  // Server sends:
  {
    "event": "pending_human_input",
    "execution_id": "...",
    "task_id": "...",
    "prompt": "Need more details about current traffic volume",
    "context": { ... },
    "reason_flags": ["ambiguity", "missing_field"],
    "event_id": "uuid",
    "timestamp": "2025-08-02T16:00:00Z"
  }
- Coexistence: If an integrator prefers or falls back to webhooks, that flow continues unchanged.
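The dedupe and resume capabilities above could look roughly like this on the client side. This is only a sketch: `HumanInputTracker` and the `last_event_id` resume parameter are hypothetical names I am using for illustration, and the real resume mechanism would be whatever the maintainers choose.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HumanInputTracker:
    """Tracks delivered events so duplicates are dropped and reconnects can resume."""

    last_event_id: Optional[str] = None
    _seen: set = field(default_factory=set)

    def accept(self, event: dict) -> bool:
        """Return True the first time an event_id is seen, False for duplicates."""
        event_id = event["event_id"]
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        self.last_event_id = event_id
        return True

    def resume_params(self) -> dict:
        """Parameters to send on reconnect (hypothetical parameter name)."""
        if self.last_event_id is None:
            return {}
        return {"last_event_id": self.last_event_id}
```

A long-running client would want to bound `_seen` (for example, keep only recent ids); it is left unbounded here to keep the sketch short.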
Describe alternatives you've considered
- Webhook-only delivery (current default): requires public endpoint exposure and often tunneling (ngrok) in constrained environments.
- Polling the execution status to detect when human input is needed: adds latency, inefficiency, and complexity around rate-limiting.
- Proxying webhook callbacks internally and then pushing over internal WebSocket/SSE: works but still depends on the publicly routable webhook and adds extra correlation layers.
- Hybrid long-polling/short-polling for “needs input” signals: viable in some constrained cases, but has scalability/latency trade-offs compared to persistent streams.
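For comparison, the long-polling alternative can be sketched as a cursor-driven loop. The `fetch` callable is a hypothetical stand-in for an HTTP request that blocks until events arrive or a server-side timeout elapses (returning an empty list on timeout); no such endpoint exists today.

```python
from typing import Callable, Iterator, List, Optional


def long_poll(
    fetch: Callable[[Optional[str]], List[dict]],
    max_rounds: int,
) -> Iterator[dict]:
    """Repeatedly call fetch(cursor) and yield events, advancing the cursor."""
    cursor: Optional[str] = None
    for _ in range(max_rounds):
        # Each call is one long-poll request; an empty list means it timed out.
        for event in fetch(cursor):
            cursor = event["event_id"]  # resume point for the next request
            yield event
```

Each round costs a full request even when nothing is pending, which is the latency and scalability trade-off noted above relative to a persistent stream.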
Additional context
No response
Willingness to Contribute
Yes, I'd be happy to submit a pull request
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Commenting to keep this active.
This issue was closed because it has been stalled for 5 days with no activity.