
Race condition in streamable http

Open AydarAkhmetzyanov opened this issue 2 months ago • 0 comments

Fix race condition in Streamable HTTP transport

Motivation and Context

When using the Streamable HTTP transport, session.list_tools() intermittently returns empty results immediately after session.initialize() completes.

Root Cause: In mcp/client/streamable_http.py, memory streams are created with buffer size 0 (unbuffered), and post_writer is started with tg.start_soon(), which schedules the task and returns without waiting for it to be ready:

tg.start_soon(transport.post_writer, ...)  # Doesn't wait for task to start

This creates a timing window in which a request issued immediately afterwards can fail because post_writer is not yet ready to receive messages.

Solution: Use tg.start() instead of tg.start_soon() to ensure post_writer is fully ready before yielding from the context manager. This follows the existing pattern used in the SSE transport.
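
The ordering difference between the two calls can be sketched with the standard library alone. This is only an analogy, not the SDK's code: asyncio.create_task stands in for tg.start_soon(), an awaited future stands in for the readiness handshake that tg.start() performs, and post_writer here is a hypothetical stub rather than the real transport method.

```python
import asyncio


async def post_writer(log, started=None):
    # Hypothetical stand-in for the transport's writer task.
    log.append("writer ready")
    if started is not None:
        # Plays the role of the readiness signal a tg.start()-style call waits for.
        started.set_result(None)
    await asyncio.sleep(0)  # the real task would loop over a stream here


async def start_soon_style():
    log = []
    # Scheduled only: the writer has not run a single line yet.
    task = asyncio.create_task(post_writer(log))
    log.append("caller continues")
    await task
    return log


async def start_style():
    log = []
    started = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(post_writer(log, started))
    await started  # block until the writer reports that it is ready
    log.append("caller continues")
    await task
    return log


soon_log = asyncio.run(start_soon_style())
start_log = asyncio.run(start_style())
print(soon_log)
print(start_log)
```

In the first variant the caller proceeds before the writer has run at all; in the second the caller resumes only after the writer has signalled readiness, which is the ordering the fix relies on.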

How Has This Been Tested?

Added two new tests:

  • test_streamablehttp_no_race_condition_on_consecutive_requests - 10 iterations of init → list_tools
  • test_streamablehttp_rapid_request_sequence - 20 rapid consecutive requests

All 36 streamable HTTP tests pass.

Breaking Changes

None. This is an internal implementation fix that doesn't change any public API.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation update

Checklist

  • [x] I have read the MCP Documentation
  • [x] My code follows the repository's style guidelines
  • [x] New and existing tests pass locally
  • [x] I have added appropriate error handling
  • [ ] I have added or updated documentation as needed

Additional context

Three options were considered for fixing this issue:

  • Option A: Use a small buffer size instead of 0
  • Option B: Use tg.start() instead of tg.start_soon() ✅ (implemented)
  • Option C: Add an internal ready signal/event

Option B was chosen because it's minimal, follows existing patterns (SSE transport), and preserves buffer/backpressure semantics.
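
As a rough illustration of why Option A would change backpressure semantics, here is a stdlib-only sketch. It assumes asyncio.Queue as a stand-in for the anyio memory stream; asyncio.Queue has no zero-size rendezvous mode, so it only models the buffered case, and the message names are made up for the example.

```python
import asyncio


async def main():
    # Buffer of one item, with no consumer running at all.
    buffered = asyncio.Queue(maxsize=1)

    # The first send completes even though nothing is receiving yet,
    # so the sender no longer learns whether a reader exists.
    await buffered.put("request-1")
    accepted_without_reader = buffered.qsize() == 1

    # Only the second send exerts backpressure on the sender.
    try:
        await asyncio.wait_for(buffered.put("request-2"), timeout=0.05)
        second_blocked = False
    except asyncio.TimeoutError:
        second_blocked = True

    return accepted_without_reader, second_blocked


result = asyncio.run(main())
print(result)
```

A small buffer would hide the startup race by letting the first message sit in the queue, but the sender would then get ahead of the receiver by up to the buffer size, which is the behaviour Option B avoids changing.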

AydarAkhmetzyanov, Nov 26 '25 21:11