claude-code [Feature Request] Lazy loading for @ file references in CLAUDE.md

Problem

Currently, @ file references in CLAUDE.md load all referenced content into context immediately at conversation start, regardless of relevance to the current task. This prevents users from maintaining comprehensive, modular documentation without significant token overhead.

Current Behavior

When using @ references to create a modular CLAUDE.md structure:

# ~/.claude/CLAUDE.md
Core development principles...

For detailed testing guidelines see @~/.claude/docs/testing.md
For TypeScript guidelines see @~/.claude/docs/typescript.md
For workflow details see @~/.claude/docs/workflow.md

All referenced files are loaded into every conversation's context, consuming tokens upfront even when they're never relevant to the task at hand.

Evidence

Research into documentation and existing issues confirms:

- @ References Load All Content: In Issue #1041, a user reported that "the system provided both files - the CLAUDE.md file that references general.md, and then the actual contents of general.md itself" as separate context entries.
- No Lazy Loading Mechanism: The official best practices mention subdirectory CLAUDE.md files loading "on-demand when working with files in those locations," but this:
  - Only applies to subdirectory files (not @ references)
  - Is currently broken per Issue #2571
- Verified Token Impact: A modular CLAUDE.md structure with 6 referenced files (~2,100 lines total) consumes the same tokens as a monolithic file, so the split provides organizational benefits only.

Use Case

Many users organize development guidelines into modular documentation for maintainability:

~/.claude/
├── CLAUDE.md (core principles - 150 lines)
└── docs/
    ├── testing.md (comprehensive testing guidelines - 270 lines)
    ├── typescript.md (TypeScript rules and patterns - 305 lines)
    ├── code-style.md (style guide - 370 lines)
    ├── workflow.md (TDD workflow - 671 lines)
    └── examples.md (pattern examples - 278 lines)

Current cost: ~2,100 lines loaded into every conversation (line counts serve here as a rough proxy for tokens)
Desired cost: ~150 lines (core) + relevant sections only when needed

Real-world impact:

- Testing guidelines loaded when editing bash scripts
- TypeScript rules loaded when writing markdown
- Full workflow docs loaded for simple questions
- 85-90% of loaded content is irrelevant to most conversations

Impact

- Token waste: Most conversations don't need comprehensive guidelines for every topic
- Higher costs: Conversations that need only the core file consume 10-15x more tokens than necessary (~2,100 lines loaded vs. ~150)
- Response latency: Larger contexts take longer to process
- Scalability limits: Users must choose between comprehensive docs and efficient token usage
- Workaround burden: Users are forced to manage manually what should be automatic

Proposed Solutions

Option 1: Explicit Lazy Loading Syntax

Introduce syntax to distinguish between always-loaded and lazy-loaded references:

# Always loaded (current behavior)
@~/.claude/docs/core-principles.md

# Lazy loaded - Claude loads when relevant
@lazy:~/.claude/docs/testing.md
@lazy:~/.claude/docs/typescript.md
@lazy:~/.claude/docs/workflow.md

Claude would auto-load lazy references based on:

- User message keywords ("write a test" → load testing.md)
- File context (editing .ts files → load typescript.md)
- Explicit mention ("load testing guidelines")
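
To make the proposed syntax concrete, here is a minimal Python sketch of how a loader might separate always-loaded references from lazy ones. The `@lazy:` prefix is the hypothetical syntax proposed in this option, and `split_references` is an illustrative name, not an existing Claude Code API.

```python
import re

# "@<path>" or "@lazy:<path>", only when the "@" starts a whitespace-delimited token
# (so e-mail addresses are ignored). "@lazy:" is the syntax proposed in this issue.
REF_PATTERN = re.compile(r"(?<!\S)@(lazy:)?(\S+)")


def split_references(claude_md: str) -> tuple[list[str], list[str]]:
    """Return (eager_paths, lazy_paths) in the order they appear in the text."""
    eager: list[str] = []
    lazy: list[str] = []
    for match in REF_PATTERN.finditer(claude_md):
        target = lazy if match.group(1) else eager
        target.append(match.group(2))
    return eager, lazy
```

Only the eager list would be read at conversation start; the lazy list would be kept as paths until some trigger asks for the content.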

Option 2: Smart Context-Aware Loading

Claude automatically analyzes conversation context and loads @ references only when relevant:

# CLAUDE.md
## Additional Resources
- Testing: @~/.claude/docs/testing.md
- TypeScript: @~/.claude/docs/typescript.md

Auto-loading triggers:

- User explicitly requests content ("show me testing guidelines")
- Keywords detected in conversation ("how do I test this?")
- File types match (editing .ts → load typescript.md)
- Tool usage patterns (running jest → load testing.md)
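
The keyword and file-type triggers could be sketched roughly as follows. This is an assumption about how matching might work, not a description of anything Claude Code does today; `Trigger`, `TRIGGERS`, and `docs_to_load` are hypothetical names, the keyword sets are made up for illustration, and tool-usage triggers are left out for brevity.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Trigger:
    doc: str                                   # path of the referenced document
    keywords: frozenset[str] = frozenset()     # words in the user message
    extensions: frozenset[str] = frozenset()   # suffixes of files being edited


# Illustrative rules only; a real system might derive these from metadata.
TRIGGERS = [
    Trigger("~/.claude/docs/testing.md", frozenset({"test", "tests", "jest"})),
    Trigger("~/.claude/docs/typescript.md", frozenset({"typescript"}), frozenset({".ts"})),
]


def docs_to_load(message: str, open_files: list[str]) -> list[str]:
    """Return the documents whose keyword or file-type trigger fires."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    suffixes = {PurePosixPath(path).suffix for path in open_files}
    return [
        trigger.doc
        for trigger in TRIGGERS
        if (words & trigger.keywords) or (suffixes & trigger.extensions)
    ]
```

Plain word matching like this is cheap but brittle; a real implementation would presumably need something closer to semantic matching to avoid both misses and false positives.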

Option 3: Metadata-Only Loading

Load only metadata (title, description, keywords) at startup, full content on-demand:

# testing.md
---
keywords: [test, tdd, jest, vitest, coverage]
description: Comprehensive testing guidelines including TDD workflow
---

[Full content loaded only when keywords match or explicitly requested]
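
A small sketch of the metadata-only idea: read the front matter at startup, keep the body out of context, and check the keywords against the user's message. The parser below handles only the two value shapes shown in the example (plain strings and `[a, b]` lists) and is not a YAML implementation; `read_metadata` and `should_load` are hypothetical names.

```python
import re


def read_metadata(text: str) -> tuple[dict, str]:
    """Split a document into (metadata, body); metadata is {} without front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Locate the closing "---" of the front matter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta, "\n".join(lines[end + 1:])


def should_load(meta: dict, message: str) -> bool:
    """True when any declared keyword appears as a whole word in the message."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    return any(keyword.lower() in words for keyword in meta.get("keywords", []))
```

Whole-word matching means "test" does not fire on "latest", but it also misses "tests" unless that form is listed as a keyword.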

Option 4: Namespace Integration

Auto-create slash commands from @ references:

Testing: @~/.claude/docs/testing.md
→ Automatically creates `/testing` command for on-demand loading

Users can explicitly invoke with /testing or Claude can suggest when relevant.
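
One way the command names might be derived from the references, as a sketch. The naming rule (file stem, lowercased, non-alphanumerics collapsed to hyphens) is an assumption chosen for illustration, and `command_name` and `build_command_table` are hypothetical helpers.

```python
import re
from pathlib import PurePosixPath


def command_name(reference: str) -> str:
    """Map '@~/.claude/docs/testing.md' to '/testing'."""
    stem = PurePosixPath(reference.lstrip("@")).stem
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return f"/{slug}"


def build_command_table(references: list[str]) -> dict[str, str]:
    """Map each generated command to its document path, rejecting name clashes."""
    table: dict[str, str] = {}
    for reference in references:
        name = command_name(reference)
        if name in table:
            raise ValueError(f"{name} would be generated by more than one reference")
        table[name] = reference.lstrip("@")
    return table
```

Clash handling matters here because two files in different directories can share a stem; raising an error is just the simplest choice for a sketch.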

Current Workaround

The only workaround is manually creating slash commands in ~/.claude/commands/, which provides true lazy loading but:

❌ Requires manual frontmatter for each file
❌ Loses automatic context intelligence (user must remember to invoke)
❌ Duplicates content management (docs vs commands)
❌ No smart auto-loading based on conversation context

This forces users to choose: comprehensive guidelines with high token cost OR minimal guidelines with missing context.

Expected Behavior

With lazy loading implemented:

# Scenario 1: Writing tests
User: "Help me write a test for the payment processor"
→ Claude detects "test" keyword
→ Auto-loads @lazy:~/.claude/docs/testing.md
→ Applies comprehensive testing principles
→ Token cost: 150 (core) + 270 (testing) = 420 lines

# Scenario 2: Fixing TypeScript errors  
User: "Fix this TypeScript error in payment.ts"
→ Claude detects .ts file context + "TypeScript" keyword
→ Auto-loads @lazy:~/.claude/docs/typescript.md
→ Applies TypeScript guidelines
→ Token cost: 150 (core) + 305 (typescript) = 455 lines

# Scenario 3: General git question
User: "How do I configure git aliases?"
→ No triggers for specialized guidelines
→ Uses core principles only
→ Token cost: 150 lines

Estimated savings: roughly 78-93% fewer lines loaded per conversation across these three scenarios
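
As a sanity check on these figures, the sketch below recomputes the per-scenario reduction from the line counts listed under Use Case. It treats lines as a stand-in for tokens, which is an assumption; real token counts depend on the content of each file.

```python
# Line counts copied from the directory listing in the Use Case section.
CORE_LINES = 150
DOC_LINES = {
    "testing.md": 270,
    "typescript.md": 305,
    "code-style.md": 370,
    "workflow.md": 671,
    "examples.md": 278,
}
TOTAL_LINES = CORE_LINES + sum(DOC_LINES.values())  # 2044, the "~2,100" quoted above


def reduction_percent(loaded_docs: list[str]) -> float:
    """Percent fewer lines loaded than when every file is loaded eagerly."""
    loaded = CORE_LINES + sum(DOC_LINES[name] for name in loaded_docs)
    return round(100 * (1 - loaded / TOTAL_LINES), 1)


print(reduction_percent(["testing.md"]))     # Scenario 1: 79.5
print(reduction_percent(["typescript.md"]))  # Scenario 2: 77.7
print(reduction_percent([]))                 # Scenario 3: 92.7
```

On these numbers the three scenarios land between roughly 78% and 93%, so the real-world average depends on how often each kind of conversation occurs.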

Benefits

✅ Token efficiency: Load only relevant documentation
✅ Better performance: Smaller initial context
✅ Scalability: Support comprehensive docs without hitting limits
✅ Maintains modularity: Keep organizational benefits, add token savings
✅ Improved UX: Faster responses with reduced latency
✅ Cost optimization: Significant reduction in API token consumption
✅ Backward compatible: Existing @ references continue working (immediate loading)

Related Issues

#2571 - Subdirectory CLAUDE.md files not loading (documented lazy loading is broken)
#1041 - @file imports confirmation and approval flow issues
#722 - CLAUDE.md discovery documentation inconsistencies

Priority Justification

High - This feature gap forces users to choose between:

✅ Comprehensive guidelines + ❌ High token costs + ❌ Slower responses
✅ Low token costs + ❌ Missing critical context + ❌ Incomplete guidance

Lazy loading enables both comprehensive AND efficient documentation, unlocking:

- Better adherence to complex development standards
- Reduced costs for teams with detailed guidelines
- Improved response times from smaller contexts
- Scalable knowledge management

Implementation Notes

For Claude Code team consideration:

Backward compatibility: Default to current behavior (immediate loading) for existing @ references
Opt-in syntax: Use new syntax (@lazy: or metadata) for lazy loading
Smart detection: Leverage existing keyword matching/semantic analysis
User control: Allow manual loading via mention or command
Memory integration: Coordinate with existing CLAUDE.md memory system
Token tracking: Add /context visibility for lazy-loaded files
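
A rough sketch of how the deferred loading, caching, and token-tracking notes above could fit together. `LazyDocs`, `load`, and `context_report` are hypothetical names for illustration only; nothing here reflects Claude Code's actual internals.

```python
from pathlib import Path


class LazyDocs:
    """Registers document paths up front but reads each file only on first use."""

    def __init__(self, paths: list[str]) -> None:
        self._registered = {Path(p).expanduser() for p in paths}
        self._loaded: dict[Path, str] = {}

    def load(self, path: str) -> str:
        """Return the file's text, reading from disk only the first time."""
        key = Path(path).expanduser()
        if key not in self._registered:
            raise KeyError(f"{path} is not a registered lazy reference")
        if key not in self._loaded:
            self._loaded[key] = key.read_text(encoding="utf-8")
        return self._loaded[key]

    def context_report(self) -> dict[str, int]:
        """Line count per loaded file, the kind of data /context could surface."""
        return {str(p): len(text.splitlines()) for p, text in self._loaded.items()}
```

Files that are never requested never appear in the report, which is the visible difference from eager loading.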

Environment

Claude Code Version: [Verified behavior in latest version as of 2025]
OS: macOS (Darwin 25.1.0)
Setup: Global ~/.claude/CLAUDE.md with @ references to ~/.claude/docs/*.md
Verified: hasClaudeMdExternalIncludesApproved flag status doesn't affect loading (all files load regardless)

Nov 16 '25 23:11 citypaul

See also my feature request, which aligns with the concerns and suggestions in this issue and adds some different context and further suggestions.

My issue goes further into the benefits of standardization and of official support and documentation.

Dec 03 '25 12:12 JeffreyUrban

Isn't this what skills are for? You'll get the "lazy" progressive loading you're looking for, and you need the skills structure in order to give the agent context. Without that, it would have no idea when it needs to load a particular document.

Jan 05 '26 03:01 memetican