[FEATURE] Dynamic/Lazy Agent Loading to Reduce Context Token Usage
Preflight Checklist
- [x] I have searched existing requests and this feature hasn't been requested yet
- [x] This is a single feature request (not multiple features)
Problem Statement
Support dynamic agent loading where:
- A registry agent queries available agents
- System loads only the needed agents into context on-demand
- Agents are unloaded when no longer needed
Problem
Currently, Claude Code loads all agent descriptions at startup and includes them in every prompt. With multiple specialized agents, this causes significant token bloat that impacts performance.
Example: I have ~16.2k tokens of agent descriptions loaded, exceeding the 15k recommendation, even though only 2-3 agents are typically needed for any given task.
Current Limitations
- All agents load upfront regardless of whether they're needed
- Large agent collections consume context window unnecessarily
- Users must manually enable/disable agents via /agents between sessions
- No way to dynamically load agents based on task requirements
Proposed Solution
Implement dynamic/lazy agent loading where:
- Registry Agent Pattern: A lightweight "agent registry" agent that:
  - Has minimal token footprint (~200-500 tokens)
  - Knows available agents and their capabilities
  - Analyzes the user's request to determine which agents are actually needed
  - Loads only relevant agents into context on-demand
- Runtime Agent Management:
  - Agents loaded mid-conversation when needed
  - Agents unloaded when task is complete
  - Only active agents consume context tokens
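To make the idea concrete, here is a rough sketch of the registry pattern in Python. It is only an illustration of the behavior I'm asking for, not how Claude Code works internally: the names AgentEntry, AgentRegistry, select, and load_for are hypothetical, and the keyword match is a stand-in for whatever analysis a real registry agent would do.

```python
from dataclasses import dataclass, field


@dataclass
class AgentEntry:
    """Hypothetical registry record: cheap matching keywords plus the full prompt."""
    name: str
    keywords: set[str]
    full_prompt: str  # only placed in context once the agent is loaded


@dataclass
class AgentRegistry:
    """Hypothetical registry: knows every agent, keeps only the needed ones loaded."""
    entries: dict[str, AgentEntry] = field(default_factory=dict)
    loaded: dict[str, str] = field(default_factory=dict)  # name -> full prompt

    def register(self, entry: AgentEntry) -> None:
        self.entries[entry.name] = entry

    def select(self, request: str) -> list[str]:
        # Naive keyword overlap, standing in for the registry agent's analysis.
        words = set(request.lower().split())
        return sorted(n for n, e in self.entries.items() if e.keywords & words)

    def load_for(self, request: str) -> list[str]:
        # Unload agents the request no longer needs, then load the relevant ones.
        wanted = self.select(request)
        for name in list(self.loaded):
            if name not in wanted:
                del self.loaded[name]
        for name in wanted:
            self.loaded.setdefault(name, self.entries[name].full_prompt)
        return wanted


registry = AgentRegistry()
registry.register(AgentEntry("sql-tuner", {"sql", "query"}, "You optimize SQL queries."))
registry.register(AgentEntry("css-fixer", {"css", "layout"}, "You fix CSS layout bugs."))

first = registry.load_for("speed up this sql query")
loaded_after_first = list(registry.loaded)
second = registry.load_for("fix the css layout")
loaded_after_second = list(registry.loaded)
```

In this sketch only the agents matching the current request have their full prompt held in `loaded`; everything else stays in the registry at no context cost until it is needed.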
Benefits
- Dramatically reduced baseline token usage
- Better performance for users with large agent collections
- Ability to maintain extensive agent libraries without performance penalty
- More scalable agent ecosystem
Alternative Approaches
- Agent groups/profiles that can be switched dynamically
- Token budget limits per agent suite
- Automatic agent pruning based on relevance scoring
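The token budget and relevance pruning alternatives could be combined roughly like this. Again a sketch under assumptions: the relevance scores, token counts, and the prune_to_budget helper are all made up for illustration, and a greedy pick is just one simple way to do it.

```python
def prune_to_budget(agents, budget):
    """Keep the most relevant agents whose combined token cost fits the budget.

    agents: list of (name, relevance, tokens) tuples (hypothetical values).
    Returns (kept_names, tokens_used). Greedy by descending relevance.
    """
    kept, used = [], 0
    for name, relevance, tokens in sorted(agents, key=lambda a: a[1], reverse=True):
        if used + tokens <= budget:
            kept.append(name)
            used += tokens
    return kept, used


candidates = [
    ("sql-tuner", 0.9, 1200),
    ("css-fixer", 0.1, 900),
    ("test-writer", 0.7, 1500),
    ("doc-writer", 0.4, 1000),
]
kept, used = prune_to_budget(candidates, budget=3000)
```

With a 3000-token budget this keeps sql-tuner and test-writer (2700 tokens) and drops the two lower-scoring agents, which do not fit in the remaining 300 tokens.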
Use Case
I want 50+ specialized agents available but only load the 2-3 relevant ones per task, keeping my context usage minimal.
Proposed Solution
I'd like to have hundreds of very specialized agents and commands available to Claude Code without having to load them all into context. I could move them to another folder and have a reference agent, but I'd prefer some kind of default functionality, such that Claude looks to the registry agent first, similar to how it looks to CLAUDE.md first.
Alternative Solutions
No response
Priority
Critical - Blocking my work
Feature Category
CLI commands and flags
Use Case Example
No response
Additional Context
No response
Found 3 possible duplicate issues:
- https://github.com/anthropics/claude-code/issues/4973
- https://github.com/anthropics/claude-code/issues/6272
- https://github.com/anthropics/claude-code/issues/7336
This issue will be automatically closed as a duplicate in 3 days.
- If your issue is a duplicate, please close it and 👍 the existing issue instead
- To prevent auto-closure, add a comment or 👎 this comment
🤖 Generated with Claude Code
I believe mine is the only one that doesn't suggest using an MCP server to address this, and my approach is different. My context and reasoning are also quite different.
Seems to me that #10896 would solve many of the related issues/suggestions that people have raised, like this one.
It also seems more in line with the engineering practice of componentization, and easier to implement than all of the "more dynamic" solutions being proposed.
Not that those aren't valuable, but I think making plug-in resources private (tools, servers, commands, agents, etc.) as proposed in #10896 would be a simpler, more immediate solution for this exact token bloat problem.
This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.