
agents system

Open deathbyknowledge opened this issue 2 months ago • 3 comments

WIP

deathbyknowledge avatar Nov 24 '25 10:11 deathbyknowledge

⚠️ No Changeset found

Latest commit: c3a0a43c9c0d15bf830dff663339f52bf35c1c1d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

changeset-bot[bot] avatar Nov 24 '25 10:11 changeset-bot[bot]

Claude Code Review

Status: WIP - Large agent system implementation

This PR introduces a comprehensive agent orchestration system built on Cloudflare Workers and Durable Objects. Given the WIP status, here are substantive issues to address:

Critical Issues

1. Race condition in subagent cancellation (packages/agents/src/sys/agent/index.ts:263-291)

The cancel operation iterates over waiting subagents without proper atomicity. Between fetching waitingSubagents and completing cancellation, new subagents could spawn or complete, leading to inconsistent state.

  • Fix: Wrap the entire cancel logic in blockConcurrencyWhile (like childResult does at line 543)
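
A minimal sketch of that fix, assuming the cancel path can reach the Durable Object state's `blockConcurrencyWhile`. `SerialGate` is a hypothetical in-memory stand-in for that method so the example runs outside the Workers runtime, and `AgentSketch` with its fields is illustrative, not the PR's actual code.

```typescript
// Hypothetical stand-in for DurableObjectState.blockConcurrencyWhile,
// so the sketch runs outside the Workers runtime.
class SerialGate {
  private tail: Promise<void> = Promise.resolve();

  blockConcurrencyWhile<T>(callback: () => Promise<T>): Promise<T> {
    // Queue the callback behind everything already scheduled.
    const run = this.tail.then(() => callback());
    // Keep the queue usable even if this callback rejects.
    this.tail = run.then(() => undefined, () => undefined);
    return run;
  }
}

// Illustrative agent; all names here are assumptions.
class AgentSketch {
  readonly waitingSubagents = new Set<string>();
  readonly cancelled: string[] = [];
  private gate: SerialGate;

  constructor(gate: SerialGate) {
    this.gate = gate;
  }

  spawn(id: string): Promise<void> {
    return this.gate.blockConcurrencyWhile(async () => {
      this.waitingSubagents.add(id);
    });
  }

  cancel(): Promise<void> {
    // Snapshot, notify, and clear all happen inside one critical section,
    // so a spawn or completion cannot interleave between the steps.
    return this.gate.blockConcurrencyWhile(async () => {
      for (const id of Array.from(this.waitingSubagents)) {
        await Promise.resolve(); // placeholder for an async cancel request
        this.cancelled.push(id);
      }
      this.waitingSubagents.clear();
    });
  }
}
```

Without the gate, a `spawn` that lands during the `await` inside `cancel` would be added to the set and then silently cleared without ever being cancelled.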

2. Stale cache after mutations in Store (packages/agents/src/sys/agent/store.ts:167-205)

appendMessages invalidates the cache, but other mutations such as editFile do not. As a result, some write paths leave stale cached data behind while others clear it.

  • Impact: File edits may not be visible to subsequent reads
  • Fix: Consistently invalidate or use a unified cache strategy
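
One possible shape for a unified strategy, as a sketch only: route every write through a single helper that clears the cache. `StoreSketch`, `mutate`, and the in-memory maps are hypothetical and stand in for the real Store and its SQL tables.

```typescript
// Illustrative store: every mutation goes through `mutate`,
// so no write path can forget to invalidate the cache.
class StoreSketch {
  private files = new Map<string, string>(); // stands in for the files table
  private messages: string[] = [];
  private cache = new Map<string, unknown>();

  private mutate<T>(write: () => T): T {
    try {
      return write();
    } finally {
      this.cache.clear(); // single invalidation point for all writes
    }
  }

  appendMessages(batch: string[]): void {
    this.mutate(() => {
      for (const message of batch) this.messages.push(message);
    });
  }

  editFile(path: string, content: string): void {
    this.mutate(() => {
      this.files.set(path, content);
    });
  }

  readFile(path: string): string | undefined {
    const key = "file:" + path;
    if (!this.cache.has(key)) this.cache.set(key, this.files.get(path));
    return this.cache.get(key) as string | undefined;
  }

  listMessages(): string[] {
    const key = "messages";
    if (!this.cache.has(key)) this.cache.set(key, this.messages.slice());
    return this.cache.get(key) as string[];
  }
}
```

Clearing the whole cache on every write is the bluntest option; per-key invalidation would be cheaper but reintroduces the risk of a forgotten key.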

3. Missing error handling in async middleware hooks (packages/agents/src/sys/agent/index.ts:466-477)

step() catches errors but individual middleware hooks in executePendingTools (lines 373, 415-419) don't have try-catch. A failing onToolStart could crash the entire agent.

  • Fix: Wrap middleware hook calls in try-catch
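
A sketch of what such a wrapper could look like. `ToolHook`, `runHooksSafely`, and the `onError` callback are hypothetical names; the real hook signatures in the PR may differ.

```typescript
// Assumed hook shape: may be sync or async, receives the tool name.
type ToolHook = (toolName: string) => void | Promise<void>;

// Runs every hook in order; a throwing or rejecting hook is reported
// through onError instead of aborting the remaining hooks or the agent.
async function runHooksSafely(
  hooks: ToolHook[],
  toolName: string,
  onError: (error: unknown) => void,
): Promise<void> {
  for (const hook of hooks) {
    try {
      await hook(toolName);
    } catch (error) {
      onError(error);
    }
  }
}
```

If some hooks are meant to veto a tool call, swallowing their errors is the wrong default; in that case the wrapper would need to distinguish deliberate vetoes from unexpected failures.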

4. SQL injection via string concatenation (packages/agents/src/sys/agent/store.ts:316)

readFile builds its SQL query with a template literal. The path currently comes only from internal sources, but this is still a security antipattern.

  • Fix: Use parameterized queries consistently: exec('SELECT content FROM files WHERE path = ?', [path])
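
To make the difference concrete, here is a sketch with a recording stub in place of a database. The `Exec` type assumes bindings are passed as rest arguments, which is how Durable Objects' `sql.exec(query, ...bindings)` accepts them; the array form shown above would apply only if Store's own `exec` wrapper takes an array. `readFileUnsafe`, `readFileSafe`, and `recorder` are hypothetical names.

```typescript
// Minimal shape of an exec function that accepts positional bindings.
type Exec = (query: string, ...bindings: unknown[]) => void;

type RecordedCall = { query: string; bindings: unknown[] };

// Unsafe: the path becomes part of the SQL text.
function readFileUnsafe(exec: Exec, path: string): void {
  exec(`SELECT content FROM files WHERE path = '${path}'`);
}

// Safe: the SQL text is constant; the path travels as a bound value.
function readFileSafe(exec: Exec, path: string): void {
  exec("SELECT content FROM files WHERE path = ?", path);
}

// Recording stub so the difference is visible without a database.
function recorder(): { exec: Exec; calls: RecordedCall[] } {
  const calls: RecordedCall[] = [];
  const exec: Exec = (query, ...bindings) => {
    calls.push({ query, bindings });
  };
  return { exec, calls };
}
```

With a hostile path such as `x' OR '1'='1`, the unsafe variant changes the query's structure, while the safe variant sends the same constant SQL text every time.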

Architecture Issues

5. Unbounded step count in agent loops

No explicit limit on run steps. A misbehaving agent could exhaust resources with infinite tool calls.

  • Recommendation: Add maxSteps config per blueprint with circuit breaker
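
A sketch of a step budget, kept synchronous for brevity (the real step() is presumably async). `runLoop`, `StepResult`, and `MaxStepsExceeded` are hypothetical; only the `maxSteps` name comes from the recommendation above.

```typescript
type StepResult = { done: boolean };

class MaxStepsExceeded extends Error {
  constructor(maxSteps: number) {
    super("run exceeded maxSteps=" + maxSteps);
    this.name = "MaxStepsExceeded";
  }
}

// Runs `step` until it reports done, or fails once the budget is spent.
// Returns the number of steps taken.
function runLoop(step: () => StepResult, maxSteps: number): number {
  for (let steps = 1; steps <= maxSteps; steps++) {
    if (step().done) return steps;
  }
  throw new MaxStepsExceeded(maxSteps);
}
```

Throwing a distinct error type lets the caller record a dedicated event and mark the run as failed rather than leaving it spinning.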

6. No retry logic for inter-DO communication

Subagent spawning (/child_result calls) can fail due to network issues. Failures leave parent in paused state indefinitely.

  • Recommendation: Add retry with exponential backoff or timeout-based recovery
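
A sketch of retry with exponential backoff. `retryWithBackoff` and its options are hypothetical, and `sleep` is injected so the delay mechanism (a timer here, possibly a Durable Object alarm for long waits) stays the caller's choice.

```typescript
type Sleep = (ms: number) => Promise<void>;

type RetryOptions = {
  maxAttempts: number; // must be at least 1
  baseDelayMs: number; // delay before the second attempt
  sleep: Sleep;
};

// Calls `attempt` until it succeeds or maxAttempts is reached,
// doubling the delay after each failure.
async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  if (options.maxAttempts < 1) throw new Error("maxAttempts must be >= 1");
  let lastError: unknown;
  for (let i = 0; i < options.maxAttempts; i++) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
      if (i < options.maxAttempts - 1) {
        await options.sleep(options.baseDelayMs * Math.pow(2, i)); // 1x, 2x, 4x, ...
      }
    }
  }
  throw lastError;
}
```

Retries only help if the receiving endpoint tolerates duplicate deliveries, so the parent's handler would need to be idempotent for this to be safe.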

Testing Gaps

7. Missing tests for concurrent operations

Tests don't verify concurrent invoke calls, race conditions between cancel and childResult, or multiple subagents completing simultaneously.

8. No test coverage for Store cache invalidation

The cache invalidation bugs aren't caught because tests don't verify read-after-write consistency across different Store methods.

Minor Issues

  • Type safety: Info type uses optional threadId but code assumes it's always set after registration (lines 117, 303)
  • Documentation: Complex subagent flow in architecture.md doesn't mention cancellation propagation behavior
  • Performance: listMessages() and listEvents() rebuild full arrays on every call even when cached (lines 208, 454)

Positive Notes

  • Strong separation of concerns (Store, SystemAgent, middleware)
  • Comprehensive event emission for observability
  • Good WebSocket integration for real-time updates
  • Extensive test coverage for happy paths

Next Steps: Focus on concurrency correctness and error handling before expanding features.

claude[bot] avatar Nov 24 '25 10:11 claude[bot]

Open in StackBlitz

npm i https://pkg.pr.new/cloudflare/agents@675

commit: 85db67a

pkg-pr-new[bot] avatar Nov 24 '25 10:11 pkg-pr-new[bot]