← back to stream

Kafka consumers

#backend#kafka

A consumer reads from Kafka, usually as part of a consumer group. The loop: poll the broker for a batch of messages from the consumer's assigned partitions, process them, commit offsets, repeat.

Things to get right:

- Idempotent processing — at-least-once delivery is the default, so assume every message can arrive more than once and design handlers so replaying is safe.
- Commit discipline — commit offsets after successful processing, not before, or a crash between commit and processing loses messages.
- Rebalance behaviour — when a consumer joins or leaves the group, partitions reshuffle, and in-flight work on revoked partitions can be wasted. Modern cooperative (incremental) rebalancing reduces this pain but doesn't eliminate it.
- Poison messages — a message that always fails processing will stall its partition forever unless you have a dead-letter-queue strategy.

Batch processing (eachBatch in KafkaJS) is usually 2-5x faster than message-at-a-time, because commits and network round-trips are amortised over the whole batch.
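The loop and the first, second, and fourth points above can be sketched together. This is a minimal illustration, not a real client: `FakePartition`, `run_once`, the retry cap, and keying idempotency off the message key are all assumptions made for the sketch.

```python
from collections import defaultdict

class FakePartition:
    """In-memory stand-in (assumption) for one assigned partition."""
    def __init__(self, messages):
        self.messages = messages   # list of (key, value) pairs
        self.committed = 0         # committed offset

    def poll(self, max_records=10):
        # Return (offset, message) pairs starting from the committed offset.
        return list(enumerate(self.messages))[self.committed:self.committed + max_records]

    def commit(self, offset):
        self.committed = offset

MAX_ATTEMPTS = 3  # retries before a message is parked (assumption)

def run_once(partition, handler, seen, dead_letters, attempts):
    """One poll -> process -> commit iteration, at-least-once style."""
    batch = partition.poll()
    for offset, (key, value) in batch:
        if key in seen:
            continue               # idempotency: replayed message, skip safely
        try:
            handler(key, value)
            seen.add(key)
        except Exception:
            attempts[offset] += 1
            if attempts[offset] >= MAX_ATTEMPTS:
                dead_letters.append((key, value))  # park the poison message
                seen.add(key)                      # treat it as handled
            else:
                return             # stop BEFORE committing; retry next poll
    # Commit only after the whole batch was processed (or dead-lettered).
    if batch:
        partition.commit(batch[-1][0] + 1)
```

Note how the early `return` on failure is exactly the poison-message stall: the partition makes no progress until the message succeeds or is dead-lettered, because committing past it would silently drop it.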
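The commit-amortisation argument for batching can be made concrete with a toy counter. `CommitCounter` is a stand-in (assumption) for the real commit round-trip to the broker; the point is only that per-batch committing issues N/batch_size commits instead of N.

```python
class CommitCounter:
    """Counts commit calls; stands in (assumption) for broker round-trips."""
    def __init__(self):
        self.commits = 0

    def commit(self):
        self.commits += 1

def consume_per_message(messages, counter):
    for m in messages:
        # ... process m ...
        counter.commit()           # one commit round-trip per message

def consume_per_batch(messages, counter, batch_size=100):
    for i in range(0, len(messages), batch_size):
        for m in messages[i:i + batch_size]:
            pass                   # ... process the whole batch ...
        counter.commit()           # then one commit for the batch
```

For 1000 messages with a batch size of 100 this is 1000 commits versus 10, which is where much of the eachBatch speedup comes from (the rest is fetching and decoding in bulk).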