Kafka Message Key Hashing
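In Kafka, each event (message) contains an optional key and a value.
- key == null : messages are spread across the partitions of a topic. Older clients use a round-robin strategy; since Kafka 2.4 the default partitioner is "sticky", filling a batch for one partition before moving on to the next.
- key != null : all messages that share the same key are sent to and stored in the same partition, as long as the number of partitions in the topic does not change.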
** A key can be anything that identifies a message (a string, a numeric value, a binary value, etc.).
Key hashing is the process of mapping a key to a partition. A Kafka partitioner is the piece of code that takes a record and determines which partition to send it to.
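To make the two cases concrete, here is a minimal sketch of what a partitioner does. It is not Kafka's implementation: the class name ToyPartitioner is hypothetical, and Arrays.hashCode is only a stand-in for the real hash function.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy partitioner for illustration only; not Kafka's implementation. */
final class ToyPartitioner {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger(0);

    ToyPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** Returns the partition index for a record with the given (possibly null) key. */
    int partition(byte[] keyBytes) {
        if (keyBytes == null) {
            // No key: rotate through the partitions (simple round-robin).
            return Math.floorMod(counter.getAndIncrement(), numPartitions);
        }
        // Key present: a deterministic hash of the key bytes picks the partition.
        // Arrays.hashCode is a stand-in (assumption); it is not the hash Kafka uses.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        ToyPartitioner p = new ToyPartitioner(3);
        byte[] key = "account-12345".getBytes(StandardCharsets.UTF_8);
        System.out.println("keyed:  " + p.partition(key) + ", " + p.partition(key));
        System.out.println("no key: " + p.partition(null) + ", " + p.partition(null) + ", " + p.partition(null));
    }
}
```

The keyed calls always print the same partition twice, while the null-key calls walk through the partitions one after another.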

** In the default Kafka partitioner, keys are hashed using the murmur2 algorithm (MurmurHash2).
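targetPartition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions
Here toPositive clears the sign bit so the index is never negative, and the modulo by numPartitions yields a value in the range 0 to numPartitions - 1.
Example Flow: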
- Message 1: account_id = 12345; hash(12345) % 2 might result in 0 -> Partition 0
- Message 2: account_id = 67890; hash(67890) % 2 might result in 1 -> Partition 1
- Message 3: account_id = 12345 (same as Message 1); hash(12345) % 2 still results in 0 -> Partition 0 again, ensuring that all messages with account_id = 12345 stay in one partition and keep their relative order.
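The flow can be reproduced with a small, self-contained sketch. The murmur2 method below is a port intended to mirror the 32-bit murmur2 variant in Kafka's Java client (org.apache.kafka.common.utils.Utils); treat it as an approximation and check it against your client version before relying on exact partition numbers. The class name KeyHashingSketch and the helper partitionFor are hypothetical, and the key is assumed to be serialized as a UTF-8 string (an integer-serialized key would hash differently).

```java
import java.nio.charset.StandardCharsets;

final class KeyHashingSketch {

    /** Port of a 32-bit murmur2 hash with seed 0x9747b28c (assumed to match Kafka's client). */
    static int murmur2(byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;
        final int m = 0x5bd1e995; // mixing constant
        final int r = 24;         // mixing shift

        int h = seed ^ length;
        int length4 = length / 4;

        // Mix the input four bytes at a time (little-endian).
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff)
                    + ((data[i4 + 1] & 0xff) << 8)
                    + ((data[i4 + 2] & 0xff) << 16)
                    + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Mix in the remaining 1 to 3 bytes; the cases fall through on purpose.
        switch (length % 4) {
            case 3:
                h ^= (data[(length & ~3) + 2] & 0xff) << 16;
                // fall through
            case 2:
                h ^= (data[(length & ~3) + 1] & 0xff) << 8;
                // fall through
            case 1:
                h ^= data[length & ~3] & 0xff;
                h *= m;
                break;
            default:
                break;
        }

        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    /** Clears the sign bit so the result is always non-negative. */
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    /** Hypothetical helper: maps a string key to a partition index. */
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return toPositive(murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 2;
        for (String accountId : new String[] {"12345", "67890", "12345"}) {
            System.out.println("account_id=" + accountId
                    + " -> partition " + partitionFor(accountId, numPartitions));
        }
    }
}
```

Masking the sign bit is used instead of Math.abs because Math.abs(Integer.MIN_VALUE) is still negative in Java. Whatever partitions the hash actually picks, the first and third lines printed are identical, because the same key bytes always produce the same hash.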
Benefits of Keyed Partitioning:
- Message Ordering: Kafka guarantees ordering only within a partition, so sending all messages with the same key to the same partition means consumers read them in the order they were written.
- Data Locality: Because related messages land in the same partition, a single consumer in a consumer group handles all of them, which simplifies per-key processing (for example, aggregation or stateful updates) and can improve performance.
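The ordering benefit can be illustrated with a small simulation in which each partition is modeled as an append-only list. This is only a sketch: the class OrderingSketch and its route method are hypothetical, and String.hashCode stands in for Kafka's murmur2-based hashing.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderingSketch {

    /** Appends each {key, payload} event to the partition chosen by its key. */
    static List<List<String>> route(String[][] events, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] event : events) {
            String key = event[0];
            // String.hashCode is a stand-in (assumption) for the real key hash.
            int p = Math.floorMod(key.hashCode(), numPartitions);
            partitions.get(p).add(key + ":" + event[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] events = {
            {"12345", "deposit"},
            {"67890", "deposit"},
            {"12345", "withdraw"},
            {"12345", "close"}
        };
        List<List<String>> partitions = route(events, 2);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + ": " + partitions.get(i));
        }
    }
}
```

All events for account 12345 end up in one list in the order deposit, withdraw, close, regardless of how events for other keys are interleaved with them.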