claude-code icon indicating copy to clipboard operation
claude-code copied to clipboard

[BUG] Parallel subagent execution freezes due to SQLite lock contention in __store.db

Open KCW89 opened this issue 1 month ago • 5 comments

Description

When spawning multiple parallel subagents (e.g., 3 Explore agents simultaneously), Claude Code freezes indefinitely. The root cause is SQLite lock contention on ~/.claude/__store.db due to suboptimal default PRAGMA settings.

Environment

  • Claude Code Version: Latest (Dec 2025)
  • OS: macOS 14.x (Darwin 24.5.0)
  • Node.js: v20+
  • SQLite: System 3.43.2

Root Cause Analysis

The internal database ~/.claude/__store.db uses:

PRAGMA journal_mode;  -- Returns: delete
PRAGMA busy_timeout;  -- Returns: 0

Why This Causes Freezes

Setting Current Problem
journal_mode=DELETE Writers acquire EXCLUSIVE lock, blocking all other connections
busy_timeout=0 Any lock contention returns SQLITE_BUSY immediately (no retry)

When 3+ subagents run in parallel:

  1. All agents try to read/write to __store.db simultaneously
  2. DELETE journal mode requires EXCLUSIVE lock for writes
  3. busy_timeout=0 causes instant SQLITE_BUSY errors
  4. Claude Code likely retries in a tight loop → apparent freeze

Steps to Reproduce

  1. Start Claude Code
  2. Enter plan mode or request a task that spawns multiple parallel agents
  3. Ask: "Search for authentication patterns AND MDX processing AND database integration" (spawns 3 Explore agents)
  4. Observe: Claude Code freezes, no progress

Workaround (User-Applied Fix)

Enable WAL mode on the database:

sqlite3 ~/.claude/__store.db "PRAGMA journal_mode=WAL;"

After applying this fix, parallel agents complete successfully.

Evidence

Before fix:

$ sqlite3 ~/.claude/__store.db "PRAGMA journal_mode;"
delete

After fix:

$ sqlite3 ~/.claude/__store.db "PRAGMA journal_mode;"
wal

$ ls ~/.claude/__store.db*
__store.db
__store.db-shm   # WAL shared memory
__store.db-wal   # Write-ahead log

Result: 3 parallel Explore agents completed without freezing.

Suggested Fix

In the better-sqlite3 database initialization code, add:

// When opening the database
const db = new Database(path.join(claudeDir, '__store.db'));

// Enable WAL mode for concurrent access
db.pragma('journal_mode = WAL');

// Set reasonable busy timeout (5 seconds)
db.pragma('busy_timeout = 5000');

Why WAL Mode?

Mode Readers Writers Concurrency
DELETE Block writers Block everything Poor
WAL Don't block Append to log Excellent

WAL (Write-Ahead Logging) allows:

  • Multiple readers simultaneously
  • One writer without blocking readers
  • Better crash recovery
  • No filesystem lock contention

Additional Context

Files Involved

  • ~/.claude/__store.db - Main state database
  • ~/.claude/__store.db-shm - WAL shared memory (created after fix)
  • ~/.claude/__store.db-wal - Write-ahead log (created after fix)

Diagnostic Commands

# Check current settings
sqlite3 ~/.claude/__store.db "PRAGMA journal_mode; PRAGMA busy_timeout;"

# Apply fix
sqlite3 ~/.claude/__store.db "PRAGMA journal_mode=WAL;"

# Verify WAL files created
ls -la ~/.claude/__store.db*

Impact

  • Severity: High - blocks core parallel agent functionality
  • Workaround: Available (user can apply WAL mode manually)
  • Affected Users: Anyone using parallel subagents (plan mode, multi-agent tasks)

KCW89 avatar Dec 16 '25 06:12 KCW89