Questions about consume speed control and rebalancing
Versions
Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
| Sarama | Kafka | Go |
|---|---|---|
| 1.29.1 | 1.1.1 | 1.17.2 |
Configuration
What configuration values are you using for Sarama and Kafka?
cfg.Consumer.Offsets.Initial = sarama.OffsetNewest
cfg.Consumer.Return.Errors = false
cfg.Consumer.Offsets.AutoCommit.Interval = 3 * time.Second
cfg.Consumer.MaxWaitTime = time.Second
cfg.Consumer.Fetch.Default = 524288
cfg.Consumer.Fetch.Max = 1048576
cfg.Consumer.Group.Rebalance.Timeout = 10 * time.Second
cfg.Net.MaxOpenRequests = 1
cfg.Net.KeepAlive = 1 * time.Minute
cfg.Metadata.Retry.Max = 1
cfg.Metadata.Retry.Backoff = 1000 * time.Millisecond
cfg.Metadata.RefreshFrequency = 5 * time.Minute
cfg.Metadata.Full = false
Logs
When filing an issue please provide logs from Sarama and Kafka if at all
possible. You can set sarama.Logger to a log.Logger to capture Sarama debug
output.
If consuming a single message takes too long (for example, 3 seconds), errors such as "consumer is not correct in current generation" appear.
When this happens, messages are consumed more than once and the consumer group stays in a rebalancing state.
A simple way to reproduce this is to write a minimal consumer and sleep for 3 seconds inside the loop that reads from the message channel.
Problem Description
After debugging Sarama, adding some logging, and reading up on Kafka, I deduced that when consuming takes too long, the client and broker disconnect from each other and the generation times out. What I cannot understand is this: there is a goroutine that sends heartbeats to the broker in a loop to keep the session alive, and it is independent of the consumer goroutine, so why is the client still disconnected? Is it because I only call sess.MarkMessage in the consume loop while the commit itself happens in another goroutine, so that no commit is issued for too long? Besides reducing processing latency, is there any other way to handle this?
The second problem is that we want to control the consume speed, for example with a rate limiter for each partition, but I am worried that the first problem will occur if the rate limiter slows consumption down.