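Pushing messages to separate Kafka partitions by key

Time: 2018-07-25 05:56:47

Tags: apache-kafka kafka-producer-api kafka-python

I have a Kafka topic (test-topic) with 3 partitions and a set of messages whose key can take only 3 distinct values. I want to send these messages to separate partitions according to the key value.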

import json

from kafka import KafkaProducer
from kafka.partitioner import DefaultPartitioner

messages = [{"partition_key":"k1", "x":1},
            {"partition_key":"k2", "x":2},
            {"partition_key":"k3", "x":3},
            {"partition_key":"k1", "x":4},
            {"partition_key":"k2", "x":5}]

partitioner = DefaultPartitioner()
all_partitions = list(range(100))
available = all_partitions
dataPartitioner = partitioner(b'partition_key', all_partitions, available)

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    partitioner=dataPartitioner,
)

for m in messages:
  producer.send('test-topic', m)
producer.flush()

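In the code above, I want messages with the same partition_key value to go to the same partition.

1 Answer:

Answer 0: (score: 0)

You need to write a custom implementation of the Partitioner interface and pass that class to the KafkaProducer when you initialize it.

For example,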

private static Properties createProducerConfig(String brokers) {
    Properties props = new Properties();
    props.put("bootstrap.servers", brokers);
    // more properties
    // The named class must implement org.apache.kafka.clients.producer.Partitioner
    props.put("partitioner.class", "com.app.KafkaUserCustomPatitioner");
    return props;
}
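
Since the question uses kafka-python rather than the Java client, here is a rough sketch of the same idea in Python. In kafka-python the `partitioner` argument is a callable that the producer invokes as `partitioner(key_bytes, all_partitions, available_partitions)` after the key has been serialized, so it should be the function itself, not the result of calling it once. The `KEY_TO_PARTITION` mapping and the names `key_partitioner` and `send_all` are my own assumptions for illustration; they are not from the question. Note that the key also has to be passed to `send()`, otherwise the partitioner never sees it.

```python
import json

# Hypothetical mapping from the three known key values to a partition slot.
KEY_TO_PARTITION = {b"k1": 0, b"k2": 1, b"k3": 2}


def key_partitioner(key_bytes, all_partitions, available_partitions):
    """Pick a partition for an already serialized key.

    Assumes all_partitions is a list of partition ids for the topic.
    """
    if key_bytes not in KEY_TO_PARTITION:
        raise ValueError("unexpected partition key: %r" % (key_bytes,))
    slot = KEY_TO_PARTITION[key_bytes]
    # Wrap around in case the topic has fewer partitions than key values.
    return all_partitions[slot % len(all_partitions)]


def send_all(messages, topic="test-topic", servers="localhost:9092"):
    """Send each message keyed by its partition_key field (needs a broker)."""
    # Imported here so key_partitioner can be tried without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=servers,
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        partitioner=key_partitioner,  # pass the callable itself
    )
    for m in messages:
        # The partitioner only receives the key, so it must be set here.
        producer.send(topic, key=m["partition_key"], value=m)
    producer.flush()
```

If one partition per key value is not strictly required, the default partitioner may already be enough: it hashes every non-None key, so messages with the same key always land in the same partition, provided `key=` is passed to `send()`. Different keys can still hash to the same partition, which is why the sketch above uses an explicit mapping.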