Random SSL connection errors when downloading from replicate.delivery on Cloud Run

Open AusafG5 opened this issue 8 months ago • 3 comments

I’m using the Replicate Python client to create predictions with webhook callbacks. The webhook sends URLs pointing to images on replicate.delivery CDN. My backend downloads these images asynchronously using aiohttp in a Cloud Run environment.

The problem is, about half the time the downloads fail with this error:

Cannot connect to host replicate.delivery:443 ssl:default [None]

This never happens when I run the same code locally or through ngrok tunnels. I tried messing with SSL settings in aiohttp but no luck.

Here’s the snippet I use to download images:

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}

async with aiohttp.ClientSession() as session:
    async with session.get(image_url, headers=headers) as response:
        response.raise_for_status()
        image_data = await response.read()

Some extra info:

I can’t get file objects directly from predictions.get(), only URLs afaik.

I need async downloads because I get multiple webhook calls at once.

Cloud Run’s networking and DNS seem fine otherwise.
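For readers in the same situation (several webhook calls arriving at once), one cheap thing to rule out is a burst of simultaneous TLS handshakes. The sketch below caps how many downloads run concurrently. It is only an illustration: `download_all`, `fetch`, and `max_concurrent` are hypothetical names, and `fetch` stands in for whatever coroutine actually performs the HTTP request (for example, one wrapping `session.get` on a single shared `aiohttp.ClientSession`).

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[bytes]],  # hypothetical: your own download coroutine
    max_concurrent: int = 4,                   # assumed limit; tune for your service
) -> List[bytes]:
    """Run fetch(url) for every URL, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url: str) -> bytes:
        # Each download waits here until one of the slots is free.
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))
```

Whether this helps depends on the actual cause of the failures; it only removes connection bursts as a variable.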

Looks like this might be some temporary SSL or network issue with replicate.delivery CDN?

Can you please let me know:

Are there any known connectivity issues with replicate.delivery from cloud platforms like Cloud Run?

Do you recommend any specific SSL or network settings to fix this?

Is there any way to get file-like objects directly from the Replicate API instead of URLs, so I can avoid downloading manually?

AusafG5, May 15 '25 09:05

Hi @AusafG5 👋

Thanks for the detailed report!

This seems to be related to intermittent SSL handshake issues between Cloud Run and the replicate.delivery CDN. These types of errors are often caused by the environment’s outbound networking layer rather than the CDN itself.

Here are a few things to try:

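  1. Force aiohttp to use TLSv1.2 explicitly, which can be more compatible with certain CDNs on Cloud Run:

import ssl
import aiohttp

ssl_context = ssl.create_default_context()
# Pin the connection to TLS 1.2; the default context would otherwise
# negotiate TLS 1.3 when the server offers it.
ssl_context.minimum_version = ssl.TLSVersion.TLSv1_2
ssl_context.maximum_version = ssl.TLSVersion.TLSv1_2
# Optional: lower the OpenSSL security level to allow older cipher suites.
ssl_context.set_ciphers('DEFAULT:@SECLEVEL=1')

async with aiohttp.ClientSession() as session:
    async with session.get(image_url, ssl=ssl_context, headers=headers) as response:
        response.raise_for_status()
        image_data = await response.read()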
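  2. Consider retries with exponential backoff to handle transient SSL failures:

import asyncio

# The wrapper name is illustrative; ssl_context and headers are defined above.
async def download_with_retries(session, image_url, attempts=3):
    for attempt in range(attempts):
        try:
            async with session.get(image_url, ssl=ssl_context, headers=headers) as response:
                response.raise_for_status()
                return await response.read()
        except aiohttp.ClientConnectorError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            await asyncio.sleep(2 ** attempt)  # wait 1 s, then 2 s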

  3. Unfortunately, the Replicate API does not currently offer direct file-like access from .get(); only URLs are returned. So manual download is the only current method.
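The retry suggestion in the list above can also be written as a small client-agnostic helper. This is a minimal sketch, not part of the Replicate or aiohttp APIs: `retry_with_backoff`, `operation`, `retry_on`, and the injectable `sleep` are all hypothetical names. It retries on `OSError` by default, on the assumption that aiohttp's connection errors derive from it; that is worth verifying against your aiohttp version.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type


async def retry_with_backoff(
    operation: Callable[[], Awaitable[bytes]],  # hypothetical: one download attempt
    attempts: int = 3,
    base_delay: float = 1.0,
    # Assumption: the client's connection errors are OSError subclasses.
    retry_on: Tuple[Type[BaseException], ...] = (OSError, asyncio.TimeoutError),
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> bytes:
    """Await operation(), retrying with delays of base_delay * 2**attempt."""
    for attempt in range(attempts):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the last error
            await sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

It would be called as `await retry_with_backoff(lambda: fetch_image(session, image_url))`, where `fetch_image` is your own coroutine. Passing `sleep` as a parameter keeps the helper easy to test without real delays.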

Let me know if this helps stabilize your downloads on Cloud Run! Thanks again 🙌

Ivan-developer0, May 15 '25 15:05

Hey @Ivan-developer0, thanks for the reply.

I've spent around 2 full days trying to get this to work, to no avail :(


Just adding this for more context.

Things I've tried, in various combinations, all of which failed:

  • A retry mechanism with exponential backoff (using tenacity).
  • Multiple client libs, including aiohttp and asyncio. I tried different connector configurations (timeouts, SSL, keepalives, etc.), going from the defaults to the most permissive (including TLSv1.2) and the most restrictive ones.
  • Plain old synchronous requests as well, in the hope of trading performance for function.
  • Providing runtime SSL contexts using certifi, etc.
  • The one thing I didn't do was disable SSL.

None of them have worked so far. To be fair, though, I don't even remember half of the stuff I did, so these might well work for others.
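For anyone debugging the same thing: the trailing `[None]` in the error message suggests the wrapped low-level error carried no message of its own, so logging its type may say more than the top-level text. Below is a rough sketch of a helper that walks an exception chain. The name `describe_error_chain` is hypothetical, and it assumes the client exception exposes the underlying error as an `os_error` attribute (which aiohttp's `ClientConnectorError` does, as far as I know), falling back to the standard `__cause__` and `__context__` links.

```python
from typing import List, Optional


def describe_error_chain(exc: BaseException) -> List[str]:
    """Return one 'module.Type: repr' line per exception in the chain."""
    lines: List[str] = []
    seen = set()  # guards against cycles in the chain
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        kind = type(current)
        lines.append(f"{kind.__module__}.{kind.__qualname__}: {current!r}")
        # Assumption: the wrapped low-level error, if any, lives in `os_error`.
        nested = getattr(current, "os_error", None)
        if isinstance(nested, BaseException):
            current = nested
        else:
            current = current.__cause__ or current.__context__
    return lines
```

Calling it inside an `except` block and logging the returned lines should show whether the underlying failure is a timeout, a connection reset, or an SSL error.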

Also, I emailed the Replicate team and got a reply, which might help others:

[Image: screenshot of the email reply from Replicate]

Luckily for me I was able to replicate (xD) the entire flow with a local model instead. Future me will deal with this.

I would still ask the Replicate team, though: is there a specific reason why predictions do not return file outputs, or is that something we should expect in the future?

Anyways thanx for the reply again 😄 .

AusafG5, May 15 '25 16:05

Anyways thanx for the reply again 😄 .

You're welcome :)

Ivan-developer0, May 16 '25 09:05