
Producer thread crashes if client hits fd limit

Open · dpkp opened this issue 9 years ago · 1 comment

We should handle this a bit more gracefully.

Uncaught error in kafka producer I/O thread
Traceback (most recent call last):
  File "venv/lib/python3.5/site-packages/kafka/producer/sender.py", line 55, in run
    self.run_once()
  File "venv/lib/python3.5/site-packages/kafka/producer/sender.py", line 145, in run_once
    self._client.poll(poll_timeout_ms, sleep=True)
  File "venv/lib/python3.5/site-packages/kafka/client_async.py", line 407, in poll
    metadata_timeout_ms = self._maybe_refresh_metadata()
  File "venv/lib/python3.5/site-packages/kafka/client_async.py", line 631, in _maybe_refresh_metadata
    self._maybe_connect(node_id)
  File "venv/lib/python3.5/site-packages/kafka/client_async.py", line 247, in _maybe_connect
    conn.connect()
  File "venv/lib/python3.5/site-packages/kafka/conn.py", line 140, in connect
    self._sock = socket.socket(self.afi, socket.SOCK_STREAM)
  File "/usr/local/lib/python3.5/socket.py", line 134, in __init__
    _socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 24] Too many open files
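
One way to handle this more gracefully is to treat descriptor exhaustion as a recoverable connection failure instead of letting it escape the I/O thread. The following is only a minimal sketch of that idea, not kafka-python's actual fix: `open_socket_or_none` and its `factory` parameter are hypothetical names introduced for illustration, and the caller is assumed to back off and retry when it gets `None`.

```python
import errno
import logging
import socket

log = logging.getLogger("fd_limit_sketch")  # hypothetical logger name


def open_socket_or_none(family=socket.AF_INET, factory=socket.socket):
    """Return a new stream socket, or None if the fd limit has been reached.

    `factory` is injectable only so the failure path can be exercised
    without actually exhausting file descriptors.
    """
    try:
        return factory(family, socket.SOCK_STREAM)
    except OSError as exc:
        # EMFILE: per-process limit hit; ENFILE: system-wide limit hit.
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            log.warning("Cannot open socket, fd limit reached: %s", exc)
            return None
        # Any other OSError is not ours to swallow.
        raise
```

Returning `None` keeps the decision about retrying or backing off with the caller; unrelated `OSError`s still propagate unchanged.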

dpkp · Jun 16 '16 18:06

I ran into this (or something similar) with "OSError: [Errno 0] Error". A few changes would make this error more useful:

  • "Uncaught error" isn't very descriptive. At the least, including the exception description in the log message would help. The existing exception handling looks like this: except Exception: log.exception("Uncaught error in kafka producer I/O thread"). This would be an improvement: except Exception as error: log.exception("Uncaught error in kafka producer I/O thread: %s", error)
  • This error reporting may need to be throttled, though I don't have a production-ready idea of how to approach that (this appears to be similar to, or the same as, #2100). run_once() in sender.py can be called very often (10 calls per second isn't unheard of for me). If the error isn't a one-time thing, it floods the logs with the traceback from the description above, repeated thousands of times or more, which makes them useless.
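
Both points above can be sketched together: put the exception text in the message through a `%s` placeholder (passing an argument to the logger without a placeholder makes the logging module report a formatting error), and emit at most one traceback per interval while counting the rest. This is an illustrative sketch only, not code from kafka-python; `ThrottledExceptionLogger`, its `interval` and `clock` parameters, and the 60-second default are all assumptions.

```python
import logging
import time

log = logging.getLogger("throttle_sketch")  # hypothetical logger name


class ThrottledExceptionLogger:
    """Log at most one traceback per `interval` seconds and count the rest."""

    def __init__(self, logger, interval=60.0, clock=time.monotonic):
        self._logger = logger
        self._interval = interval
        self._clock = clock          # injectable so tests can fake time
        self._last_logged = None
        self._suppressed = 0

    def exception(self, msg, *args):
        """Call from inside an `except` block; True if a record was emitted."""
        now = self._clock()
        recently = (self._last_logged is not None
                    and now - self._last_logged < self._interval)
        if recently:
            self._suppressed += 1
            return False
        if self._suppressed:
            # Report how many errors were dropped since the last record.
            msg += " (%d similar errors suppressed)"
            args += (self._suppressed,)
        self._logger.exception(msg, *args)
        self._last_logged = now
        self._suppressed = 0
        return True


if __name__ == "__main__":
    throttled = ThrottledExceptionLogger(log, interval=60.0)
    for _ in range(3):
        try:
            raise OSError(24, "Too many open files")
        except Exception as error:
            # Only the first iteration emits a record within the interval.
            throttled.exception(
                "Uncaught error in kafka producer I/O thread: %s", error)
```

A time-based window is the simplest policy; it throttles all errors alike, so a different exception arriving inside the window is also suppressed. Keying the window on the exception type would avoid that at the cost of a little more state.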

ErikBrewster · Jul 06 '21 18:07