
Idle Producer Socket Disconnected Errors

Open mubeta06 opened this issue 4 years ago • 2 comments

We are running python Kafka producer clients in a serverless environment (AWS Lambda python3.8 runtime) whereby (as per recommended best practices) we are sharing long-lived producer connections across invocations. We use a producer configuration similar to the following:

import json
import os
import socket

import kafka

# Serialize keys and values as UTF-8 encoded JSON.
serializer = lambda m: json.dumps(m).encode('utf-8')
self._producer = kafka.KafkaProducer(
    bootstrap_servers=os.environ['BOOTSTRAP_SERVERS'],
    client_id='kafka-python-producer-optimus-prime',
    security_protocol='SASL_SSL',
    sasl_mechanism='PLAIN',
    sasl_plain_username=os.environ['SASL_USERNAME'],
    sasl_plain_password=os.environ['BROKER_ACCESS_KEY'],
    # Disable Nagle's algorithm and enable TCP keepalive on broker sockets.
    socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1),
                    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)],
    acks='all',
    compression_type='gzip',
    retries=0,
    max_request_size=33554432,  # 32 MiB
    value_serializer=serializer,
    key_serializer=serializer)

Everything is working as expected with the exception that every so often we see a number of errors similar to the following:

[ERROR] 2021-11-08T02:25:29.644Z dbd0e301.....-2f86be94e839 <BrokerConnection node_id=bootstrap-0 host=....cloud:9092 <connected> [IPv4 ('52.....31', 9092)]>: socket disconnected

The errors appear to be associated with producer connections that we have not yet used for sending messages (we conditionally send messages upon invocation). In other words, we have instantiated the producer with producer = kafka.KafkaProducer(...) but have not yet called producer.send(...). We see this error message approximately 10 minutes after instantiating the producer, which aligns with the Kafka broker cluster's connections.max.idle.ms configuration of 600000 (i.e. 10 minutes). Increasing the logging verbosity does not provide any further insight beyond the error message above.
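For reference, the timing relationship described above can be written down as a small calculation. This is only a sketch: it assumes kafka-python's `connections_max_idle_ms` producer option (documented default 540000 ms, i.e. 9 minutes), the helper name `client_idle_ms` is hypothetical, and the 60-second margin is an arbitrary illustration rather than a recommended value.

```python
# Broker-side connections.max.idle.ms reported above (10 minutes).
BROKER_IDLE_MS = 600_000


def client_idle_ms(broker_idle_ms, margin_ms=60_000):
    """Return a client idle limit that expires before the broker's does.

    Hypothetical helper for illustration; not part of kafka-python.
    """
    if margin_ms <= 0 or margin_ms >= broker_idle_ms:
        raise ValueError("margin_ms must be greater than 0 and less than broker_idle_ms")
    return broker_idle_ms - margin_ms


# Extra keyword arguments that could be merged into the producer configuration,
# e.g. kafka.KafkaProducer(..., **producer_overrides).
producer_overrides = {"connections_max_idle_ms": client_idle_ms(BROKER_IDLE_MS)}
```

With these numbers the result is 540000 ms, the same as the library default, so the client-side limit is already below the broker's limit in the configuration shown and the setting alone may not explain the errors.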

My understanding was that the IdleConnectionManager (https://github.com/dpkp/kafka-python/blob/f0a57a6a20a3049dc43fbf7ad9eab9635bd2c0b0/kafka/client_async.py#L974) is responsible for handling idle broker connections on the client side, closing them before the broker does so that such server-side disconnections do not happen. Is my understanding correct, or is this an issue with the Kafka Python client implementation?
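To make the question concrete, here is a toy model of client-side idle tracking. It is a simplified sketch written for this discussion, not the kafka-python implementation: the class name `IdleTracker`, its methods, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class IdleTracker:
    """Toy model of client-side idle tracking (not kafka-python's code)."""

    def __init__(self, max_idle_ms, clock=time.monotonic):
        self.max_idle_s = max_idle_ms / 1000.0
        self._clock = clock
        self._last_active = {}  # connection id -> time of last activity

    def update(self, conn_id):
        """Record activity on a connection, resetting its idle timer."""
        self._last_active[conn_id] = self._clock()

    def remove(self, conn_id):
        """Forget a connection, e.g. after it has been closed."""
        self._last_active.pop(conn_id, None)

    def expired(self):
        """Return ids idle for at least the limit, least recently active first."""
        now = self._clock()
        stale = [(seen, conn_id)
                 for conn_id, seen in self._last_active.items()
                 if now - seen >= self.max_idle_s]
        return [conn_id for _, conn_id in sorted(stale, key=lambda pair: pair[0])]
```

In this sketch nothing is closed unless some loop calls `expired()` periodically and acts on the result. Whether the equivalent check actually runs for a producer that has never sent a message in a given runtime is exactly the open question here.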

mubeta06, Nov 08 '21 04:11

🤦 we are using version 2.0.2 of kafka-python

mubeta06, Nov 08 '21 04:11

Hi, did you fix the issue? We are facing the same problem.

GRAWS, Apr 25 '24 13:04