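I have a Kafka topic (test-topic) with 3 partitions and a set of messages whose key can take only 3 distinct values. I want to split these messages across the partitions according to the value of that key.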
import json

from kafka import KafkaProducer
from kafka.partitioner import DefaultPartitioner

messages = [{"partition_key": "k1", "x": 1},
            {"partition_key": "k2", "x": 2},
            {"partition_key": "k3", "x": 3},
            {"partition_key": "k1", "x": 4},
            {"partition_key": "k2", "x": 5}]

# Call the default partitioner with the key name and the partition lists
partitioner = DefaultPartitioner()
all_partitions = list(range(100))
available = all_partitions
dataPartitioner = partitioner(b'partition_key', all_partitions, available)

producer = KafkaProducer(bootstrap_servers="localhost:9092",
                         value_serializer=lambda v: json.dumps(v).encode('utf-8'),
                         partitioner=dataPartitioner)

for m in messages:
    producer.send('test-topic', m)
producer.flush()
In the code above, I want messages with the same partition_key value to go to the same partition.
Answer 0 (score: 0)
You need to write a custom implementation of the Partitioner interface and provide that class to the KafkaProducer when you initialize it.
For example:
private static Properties createProducerConfig(String brokers) {
    Properties props = new Properties();
    props.put("bootstrap.servers", brokers);
    // more properties
    props.put("partitioner.class", "com.app.KafkaUserCustomPatitioner");
    return props;
}
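Since the question uses kafka-python rather than the Java client, the same idea can be sketched in Python. As far as I know, kafka-python accepts any callable as `partitioner` and calls it as `partitioner(key_bytes, all_partitions, available_partitions)` after the key has been serialized, so a plain function should be enough. The names `KEY_TO_INDEX`, `key_partitioner`, `build_producer` and `send_all` below are made up for illustration, and the fixed k1/k2/k3 mapping is an assumption based on the keys shown in the question.

```python
import json

# Hypothetical fixed mapping from serialized key to a partition index.
# The key names come from the question; the mapping itself is an assumption.
KEY_TO_INDEX = {b"k1": 0, b"k2": 1, b"k3": 2}


def key_partitioner(key_bytes, all_partitions, available_partitions):
    """Pick a partition for a serialized key.

    Uses the (key_bytes, all_partitions, available_partitions) signature
    that kafka-python passes to a custom partitioner callable.
    """
    if key_bytes not in KEY_TO_INDEX:
        raise ValueError("unexpected partition key: %r" % (key_bytes,))
    # Wrap around in case the topic has fewer partitions than keys.
    return all_partitions[KEY_TO_INDEX[key_bytes] % len(all_partitions)]


def build_producer(bootstrap_servers="localhost:9092"):
    """Create a producer that routes messages with key_partitioner."""
    # Imported here so the partitioner can be used without kafka-python installed.
    from kafka import KafkaProducer

    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        partitioner=key_partitioner,
    )


def send_all(producer, topic, messages):
    """Send each message with its partition_key as the Kafka message key."""
    for m in messages:
        # The partitioner only sees the key, so the key must be passed explicitly.
        producer.send(topic, key=m["partition_key"], value=m)
    producer.flush()
```

With a broker running, this would be used as `send_all(build_producer(), "test-topic", messages)`. If a fixed key-to-partition mapping is not required, passing `key=` to `send()` with the default partitioner should already keep messages with the same key in the same partition, since it hashes the key; the difference is that two different keys may then end up sharing a partition.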