improvement(mcp): restructure mcp tools caching/fetching info to improve UX
Summary
- Show "unavailable" badge when an MCP tool's server is disconnected, in an error state, or missing, or when the tool has been removed from the server
- Show "stale" badge when the tool schema or the server URL has changed
- Unavailable tools filtered from agent execution and dropdown
- Issues shown in both agent tool input AND MCP settings modal
- Filter unavailable/stale tools from reaching Agent
- Fix caching bug that was causing discovery on every execution
Event driven discovery of tools:
- Page load - useMcpToolsQuery on component mount (30s staleTime)
- Server creation - useCreateMcpServer onSuccess (forceRefresh=true)
- Server refresh - useRefreshMcpServer onSuccess (forceRefresh=true)
- Server deletion - Query invalidation
- Execution fallback - Agent handler when tool has no cached schema
- Copilot - When fetching workspace data
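The triggers above all reduce to one rule: serve cached tools unless the entry is missing, expired, or the caller forces a refresh. The following is a minimal sketch of that rule, not the PR's implementation. `ToolDiscoveryCache`, `McpTool`, and the `discover` callback are hypothetical names, and discovery is synchronous here only to keep the example short (the real call goes over the network and would be async).

```typescript
// Hypothetical tool shape for illustration; not the project's actual type.
interface McpTool {
  serverId: string
  name: string
}

interface CacheEntry {
  tools: McpTool[]
  expiresAt: number // epoch milliseconds
}

class ToolDiscoveryCache {
  private entries = new Map<string, CacheEntry>()
  private ttlMs: number
  private now: () => number

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Returns cached tools unless the entry is missing, expired, or
  // forceRefresh is set; only in those cases is `discover` called.
  getTools(workspaceId: string, discover: () => McpTool[], forceRefresh = false): McpTool[] {
    const cached = this.entries.get(workspaceId)
    if (!forceRefresh && cached !== undefined && cached.expiresAt > this.now()) {
      return cached.tools
    }
    const tools = discover()
    this.entries.set(workspaceId, { tools, expiresAt: this.now() + this.ttlMs })
    return tools
  }

  // Models invalidation (for example on server deletion): drop the entry
  // so the next read rediscovers.
  invalidate(workspaceId: string): void {
    this.entries.delete(workspaceId)
  }
}
```

Under these assumptions, server creation and refresh map to `getTools(id, discover, true)`, deletion maps to `invalidate(id)`, and an ordinary page load or execution maps to a plain `getTools(id, discover)` that only rediscovers on a miss.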
Type of Change
- [x] Bug fix
- [x] Other: performance improvement / UX
Testing
Tested manually
Checklist
- [x] Code follows project style guidelines
- [x] Self-reviewed my changes
- [x] Tests added/updated and passing
- [x] No new warnings introduced
- [x] I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)
Greptile Summary
Restructured MCP tools caching and validation to significantly improve UX and performance. The changes introduce a robust two-tier caching system (Redis with in-memory fallback), event-driven tool discovery, and comprehensive tool validation with visual status indicators.
Key improvements:
- Caching Infrastructure: Added Redis cache with graceful fallback to in-memory cache (`apps/sim/lib/mcp/storage/*`). Cache properly handles TTL, corruption, and cleanup. Fixed bug where tools were being discovered on every execution.
- Tool Validation: Created centralized validation logic (`lib/mcp/tool-validation.ts`) that detects server disconnection, tool removal, schema changes, and URL changes. Shared across UI components for consistency.
- UX Enhancements: Tools from unavailable servers now show "unavailable" or "stale" badges in the workflow editor with clickable links to settings. Unavailable tools are filtered from dropdowns and agent execution.
- Status Tracking: Added `statusConfig` field to track consecutive failures per server. Servers transition to 'error' state after 3 consecutive failures. Status updates happen in background without blocking discovery.
- Event-Driven Discovery: Tool discovery now triggered on specific events (page load with 30s stale time, server creation with force refresh, server refresh, execution fallback). Prevents unnecessary re-discovery.
- API Improvements: New `/api/mcp/tools/stored` endpoint scans workflows to find stored MCP tool configurations for validation. Refresh endpoint only clears cache on successful connection.
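The validation rules described above (server state first, then tool presence, then URL and schema drift) can be sketched as a single pure function. This is an illustration under assumptions, not the contents of `tool-validation.ts`: every type, field name, and issue label below is hypothetical, and the schema comparison uses `JSON.stringify`, which is sensitive to key order.

```typescript
type ConnectionStatus = "connected" | "disconnected" | "error"

// Hypothetical shapes; the project's real types may differ.
interface StoredTool {
  serverId: string
  toolName: string
  serverUrl?: string
  schema?: unknown
}

interface ServerState {
  id: string
  url?: string
  connectionStatus: ConnectionStatus
}

interface DiscoveredTool {
  serverId: string
  name: string
  inputSchema?: unknown
}

type ToolIssue =
  | "server_not_found"
  | "server_error"
  | "tool_not_found"
  | "url_changed"
  | "schema_changed"

// Compares what a workflow stored against the current server and tool state.
function findToolIssue(
  stored: StoredTool,
  servers: ServerState[],
  discovered: DiscoveredTool[]
): ToolIssue | null {
  const server = servers.find((s) => s.id === stored.serverId)
  if (server === undefined) return "server_not_found"
  if (server.connectionStatus !== "connected") return "server_error"

  const tool = discovered.find(
    (d) => d.serverId === stored.serverId && d.name === stored.toolName
  )
  if (tool === undefined) return "tool_not_found"

  if (stored.serverUrl !== undefined && server.url !== undefined && stored.serverUrl !== server.url) {
    return "url_changed"
  }
  // Assumption: a key-order-sensitive string comparison is good enough here.
  if (
    stored.schema !== undefined &&
    tool.inputSchema !== undefined &&
    JSON.stringify(stored.schema) !== JSON.stringify(tool.inputSchema)
  ) {
    return "schema_changed"
  }
  return null
}

// Maps an issue to the badge text used in the description.
function badgeFor(issue: ToolIssue | null): "unavailable" | "stale" | null {
  if (issue === null) return null
  return issue === "url_changed" || issue === "schema_changed" ? "stale" : "unavailable"
}
```

Keeping the check pure makes it easy to share between the tool input and the settings modal, which is the consistency benefit the summary points to.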
Architecture: The PR follows clean separation of concerns with the cache adapter pattern, centralized validation functions, and proper React Query integration for optimistic updates.
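As a rough illustration of the cache adapter pattern with fallback, the sketch below defines a small adapter interface, an in-memory implementation with TTL and lazy cleanup, and a factory that falls back to memory when the primary adapter cannot be created. All names (`CacheAdapter`, `MemoryCacheAdapter`, `createCacheAdapter`, `readCachedTools`) are hypothetical, and the interface is synchronous for brevity; a Redis-backed adapter would need async methods.

```typescript
// Hypothetical adapter interface; real adapters are likely async.
interface CacheAdapter {
  get(key: string): string | null
  set(key: string, value: string, ttlMs: number): void
  delete(key: string): void
}

class MemoryCacheAdapter implements CacheAdapter {
  private store = new Map<string, { value: string; expiresAt: number }>()
  private now: () => number

  constructor(now: () => number = () => Date.now()) {
    this.now = now
  }

  get(key: string): string | null {
    const entry = this.store.get(key)
    if (entry === undefined) return null
    if (entry.expiresAt <= this.now()) {
      this.store.delete(key) // lazy cleanup of expired entries
      return null
    }
    return entry.value
  }

  set(key: string, value: string, ttlMs: number): void {
    this.store.set(key, { value, expiresAt: this.now() + ttlMs })
  }

  delete(key: string): void {
    this.store.delete(key)
  }
}

// Tries the primary (for example Redis-backed) adapter first and falls back
// to memory if none is configured or construction throws.
function createCacheAdapter(createPrimary: (() => CacheAdapter) | null): CacheAdapter {
  if (createPrimary !== null) {
    try {
      return createPrimary()
    } catch {
      // Fall through to the in-memory adapter.
    }
  }
  return new MemoryCacheAdapter()
}

// Treats an unparseable or non-array entry as a miss and removes it.
function readCachedTools(cache: CacheAdapter, key: string): unknown[] | null {
  const raw = cache.get(key)
  if (raw === null) return null
  try {
    const parsed: unknown = JSON.parse(raw)
    if (Array.isArray(parsed)) return parsed
  } catch {
    // Corrupted JSON: handled below like any other invalid entry.
  }
  cache.delete(key)
  return null
}
```

Callers depend only on the interface, so the choice between Redis and memory stays inside the factory.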
Confidence Score: 4/5
- Safe to merge with minor caching optimization consideration
- Well-architected changes with comprehensive error handling, graceful fallbacks, and clean separation of concerns. One logical issue identified in the partial cache handling that affects performance but not correctness. The Redis fallback pattern is properly implemented, tool validation logic is sound, and UI integration is thorough. Database migration is minimal and safe.
- Pay attention to `apps/sim/lib/mcp/service.ts` for the caching behavior when some servers fail
Important Files Changed
| Filename | Overview |
|---|---|
| apps/sim/lib/mcp/storage/factory.ts | Factory pattern for cache adapter selection with graceful fallback from Redis to memory |
| apps/sim/lib/mcp/tool-validation.ts | Centralized validation logic for detecting tool issues (server status, schema changes, URL changes) |
| apps/sim/lib/mcp/service.ts | Integrated Redis/memory caching, added statusConfig tracking for consecutive failures, improved error handling |
| apps/sim/executor/handlers/agent/agent-handler.ts | Added filtering to exclude MCP tools from disconnected/error servers before agent execution |
| apps/sim/hooks/queries/mcp.ts | Added useStoredMcpTools query, improved cache invalidation on server mutations with forceRefresh |
| apps/sim/app/workspace/[workspaceId]/w/[workflowId]/components/panel/components/editor/components/sub-block/components/tool-input/tool-input.tsx | Integrated tool validation to show unavailable/stale badges, filters unavailable tools from dropdown, added settings modal links |
Sequence Diagram
sequenceDiagram
participant User
participant UI as Tool Input UI
participant Modal as MCP Settings Modal
participant Query as React Query
participant API as API Routes
participant Service as MCP Service
participant Cache as Redis/Memory Cache
participant DB as Database
participant MCP as MCP Server
Note over User,MCP: Page Load - Tool Discovery
User->>UI: Open workflow editor
UI->>Query: useMcpToolsQuery(workspaceId)
Query->>API: GET /api/mcp/tools/discover
API->>Service: discoverTools(userId, workspaceId)
Service->>Cache: get(workspace:id)
alt Cache Hit
Cache-->>Service: Return cached tools
Service-->>API: Return tools
else Cache Miss
Service->>DB: Get enabled servers
DB-->>Service: Server configs
Service->>MCP: List tools (parallel)
MCP-->>Service: Tool schemas
Service->>DB: Update server status
Service->>Cache: set(workspace:id, tools, 5min)
Service-->>API: Return discovered tools
end
API-->>Query: Tools data
Query->>UI: Render available tools
Note over User,MCP: Server Creation - Force Refresh
User->>Modal: Add new MCP server
Modal->>API: POST /api/mcp/servers
API->>DB: Create server record
API->>Service: clearCache(workspaceId)
Service->>Cache: delete(workspace:id)
API-->>Modal: Server created
Modal->>Query: onSuccess - fetchMcpTools(forceRefresh=true)
Query->>API: GET /api/mcp/tools/discover?refresh=true
Note right of API: forceRefresh bypasses cache
API->>Service: discoverTools(forceRefresh=true)
Service->>MCP: List tools
MCP-->>Service: Fresh tool data
Service->>Cache: Update cache
Service-->>API: Fresh tools
API-->>Query: setQueryData(tools)
Query-->>Modal: Tools updated
Note over User,MCP: Tool Validation - UI Badges
UI->>Query: useMcpServers(workspaceId)
Query-->>UI: Server states
UI->>UI: getMcpToolIssue(stored tool)
Note right of UI: Compare stored vs current state
alt Server Disconnected/Error
UI->>UI: Show "unavailable" badge
UI->>UI: Filter from dropdown
else Schema/URL Changed
UI->>UI: Show "stale" badge
UI->>UI: Keep in dropdown
end
User->>UI: Click badge
UI->>Modal: openSettingsModal(serverId)
Modal->>Modal: Auto-select server
Note over User,MCP: Agent Execution - Filter Tools
User->>UI: Run workflow
UI->>API: Execute agent block
API->>Service: AgentHandler.execute()
Service->>DB: Check server connectionStatus
DB-->>Service: Server states
Service->>Service: filterUnavailableMcpTools()
Note right of Service: Remove tools from disconnected servers
Service->>MCP: Execute with filtered tools
MCP-->>Service: Results
Service-->>API: Agent output
API-->>UI: Display results
Note over User,MCP: Server Refresh - Status Tracking
User->>Modal: Click refresh button
Modal->>API: POST /api/mcp/servers/:id/refresh
API->>Service: discoverServerTools(serverId)
Service->>MCP: List tools
alt Connection Success
MCP-->>Service: Tools
Service->>DB: Update statusConfig {consecutiveFailures: 0}
Service->>Cache: clearCache(workspaceId)
else Connection Failed
MCP-->>Service: Error
Service->>DB: Increment statusConfig.consecutiveFailures
Note right of DB: Set to 'error' after 3 failures
end
Service-->>API: Status + tool count
API-->>Modal: Refresh complete
Modal->>Query: Invalidate & refetch
Query-->>Modal: Updated data
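The last two flows in the diagram, status tracking and pre-execution filtering, can be sketched as two small pure functions. This is a hedged illustration rather than the PR's code: the names `recordDiscoveryResult` and `filterUnavailableTools` are hypothetical, and two behaviors are assumptions not stated in the description, namely that the third consecutive failure is the one that flips the status to 'error', and that a server keeps its previous status until that threshold is reached.

```typescript
type ConnectionStatus = "connected" | "disconnected" | "error"

// From the description: servers move to 'error' after 3 consecutive failures.
const MAX_CONSECUTIVE_FAILURES = 3

interface ServerStatus {
  connectionStatus: ConnectionStatus
  statusConfig: { consecutiveFailures: number }
}

// Returns the next status after one discovery attempt.
function recordDiscoveryResult(prev: ServerStatus, succeeded: boolean): ServerStatus {
  if (succeeded) {
    // Assumption: success marks the server connected and resets the counter.
    return { connectionStatus: "connected", statusConfig: { consecutiveFailures: 0 } }
  }
  const consecutiveFailures = prev.statusConfig.consecutiveFailures + 1
  return {
    // Assumption: status is unchanged until the threshold is reached.
    connectionStatus:
      consecutiveFailures >= MAX_CONSECUTIVE_FAILURES ? "error" : prev.connectionStatus,
    statusConfig: { consecutiveFailures },
  }
}

// Hypothetical tool shape for illustration.
interface AgentTool {
  serverId: string
  name: string
}

// Keeps only tools whose server is known and currently connected.
function filterUnavailableTools(
  tools: AgentTool[],
  statusByServer: Map<string, ConnectionStatus>
): AgentTool[] {
  return tools.filter((tool) => statusByServer.get(tool.serverId) === "connected")
}
```

A tool whose server is absent from the map is dropped as well, which matches the "missing" case for the "unavailable" badge.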