
Fix: Add Timeout Parameter to HTTP Requests in MarkItDown Class

Open sammydeprez opened this issue 4 months ago • 2 comments

This pull request addresses Issue #1167, where HTTP/HTTPS requests could hang indefinitely if the server never closes the stream. To prevent this, a timeout parameter is now passed to the self._requests_session.get() call within the convert_uri method.

🔧 Change Summary

Before:

response = self._requests_session.get(uri, stream=True)

After:

response = self._requests_session.get(uri, stream=True, timeout=kwargs.get("timeout", None))

This change allows users to optionally specify a timeout via kwargs, ensuring better control over request behavior and avoiding indefinite hangs.

✅ Benefits

  • Prevents hanging on unresponsive servers
  • Enables customizable timeout handling
  • Backward-compatible: timeout defaults to None (no timeout) if not provided, so existing calls behave exactly as before
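
To illustrate how the optional kwarg flows through to the session call, here is a minimal sketch. It does not reproduce MarkItDown's actual method body: `RecordingSession` and `convert_uri_sketch` are hypothetical names introduced only for this example, and the stub stands in for a `requests.Session` so the snippet runs without network access. The tuple form `(connect, read)` is one of the timeout values that `requests` accepts.

```python
from typing import Any, Optional


class RecordingSession:
    """Hypothetical stand-in for requests.Session; records get() arguments."""

    def __init__(self) -> None:
        self.calls = []

    def get(self, uri: str, stream: bool = False, timeout: Optional[Any] = None) -> str:
        self.calls.append({"uri": uri, "stream": stream, "timeout": timeout})
        return "response"


def convert_uri_sketch(session: Any, uri: str, **kwargs: Any) -> Any:
    # Mirrors the changed line: forward an optional "timeout" kwarg.
    # When the caller omits it, None is passed, which means no timeout.
    return session.get(uri, stream=True, timeout=kwargs.get("timeout"))


session = RecordingSession()
convert_uri_sketch(session, "https://example.com/a")                       # no timeout
convert_uri_sketch(session, "https://example.com/b", timeout=10)           # single value, seconds
convert_uri_sketch(session, "https://example.com/c", timeout=(3.05, 30))   # (connect, read)
```

Note that because the default stays None, callers who do not pass a timeout can still hit the hang described in the issue. Also, in `requests` the read timeout bounds the wait between bytes received from the server, not the total download time, so a server that trickles data slowly will not trigger it.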

sammydeprez • Sep 09 '25 11:09

any updates on merging this?

jerpint • Oct 16 '25 18:10

There are so many needed PRs open, and no one is merging them 😞

sammydeprez • Oct 26 '25 04:10