[BUG] Parallel subagent execution freezes due to SQLite lock contention in __store.db

Open KCW89 opened this issue 1 month ago • 5 comments

Description

When spawning multiple parallel subagents (e.g., 3 Explore agents simultaneously), Claude Code freezes indefinitely. The root cause is SQLite lock contention on ~/.claude/__store.db due to suboptimal default PRAGMA settings.

Environment

Claude Code Version: Latest (Dec 2025)
OS: macOS 14.x (Darwin 24.5.0)
Node.js: v20+
SQLite: System 3.43.2

Root Cause Analysis

The internal database ~/.claude/__store.db uses:

PRAGMA journal_mode;  -- Returns: delete
PRAGMA busy_timeout;  -- Returns: 0

Why This Causes Freezes

Setting	Current	Problem
`journal_mode=DELETE`	❌	Writers acquire EXCLUSIVE lock, blocking all other connections
`busy_timeout=0`	❌	Any lock contention returns SQLITE_BUSY immediately (no retry)

When 3+ subagents run in parallel:

All agents try to read/write to __store.db simultaneously
DELETE journal mode requires EXCLUSIVE lock for writes
busy_timeout=0 causes instant SQLITE_BUSY errors
Claude Code likely retries in a tight loop → apparent freeze

Steps to Reproduce

Start Claude Code
Enter plan mode or request a task that spawns multiple parallel agents
Ask: "Search for authentication patterns AND MDX processing AND database integration" (spawns 3 Explore agents)
Observe: Claude Code freezes, no progress

Workaround (User-Applied Fix)

Enable WAL mode on the database:

sqlite3 ~/.claude/__store.db "PRAGMA journal_mode=WAL;"

After applying this fix, parallel agents complete successfully.

Evidence

Before fix:

$ sqlite3 ~/.claude/__store.db "PRAGMA journal_mode;"
delete

After fix:

$ sqlite3 ~/.claude/__store.db "PRAGMA journal_mode;"
wal

$ ls ~/.claude/__store.db*
__store.db
__store.db-shm   # WAL shared memory
__store.db-wal   # Write-ahead log

Result: 3 parallel Explore agents completed without freezing.

Suggested Fix

In the better-sqlite3 database initialization code, add:

// When opening the database
const db = new Database(path.join(claudeDir, '__store.db'));

// Enable WAL mode for concurrent access
db.pragma('journal_mode = WAL');

// Set reasonable busy timeout (5 seconds)
db.pragma('busy_timeout = 5000');

Why WAL Mode?

Mode	Readers	Writers	Concurrency
DELETE	Block writers	Block everything	Poor
WAL	Don't block	Append to log	Excellent

WAL (Write-Ahead Logging) allows:

Multiple readers simultaneously
One writer without blocking readers
Better crash recovery
No filesystem lock contention

Additional Context

Files Involved

~/.claude/__store.db - Main state database
~/.claude/__store.db-shm - WAL shared memory (created after fix)
~/.claude/__store.db-wal - Write-ahead log (created after fix)

Diagnostic Commands

# Check current settings
sqlite3 ~/.claude/__store.db "PRAGMA journal_mode; PRAGMA busy_timeout;"

# Apply fix
sqlite3 ~/.claude/__store.db "PRAGMA journal_mode=WAL;"

# Verify WAL files created
ls -la ~/.claude/__store.db*

Impact

Severity: High - blocks core parallel agent functionality
Workaround: Available (user can apply WAL mode manually)
Affected Users: Anyone using parallel subagents (plan mode, multi-agent tasks)

Dec 16 '25 06:12 KCW89