How does Kafka determine the partition if I don't specify one?

Time: 2018-07-25 01:16:13

Tags: apache-kafka

This is how I produce messages:

String json = gson.toJson(msg);

ProducerRecord<String, String> record = new ProducerRecord<>(kafkaProducerConfig.getTopic(), json);
long startTime = System.currentTimeMillis();

try {
    RecordMetadata meta = producer.send(record).get(5, TimeUnit.SECONDS);
} catch (InterruptedException e) {
    e.printStackTrace();
} catch (ExecutionException e) {
    e.printStackTrace();
} catch (TimeoutException e) {
    e.printStackTrace();
}

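I have 15 partitions for this topic, and I do not specify a partition or a key when producing. Which partition does the record go to by default?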

2 answers:

Answer 0: (score: 2)

Since you're sending no key as part of the record, it is null.

Kafka has a DefaultPartitioner that will round-robin any null keys over each partition.

For non-null keys, a Murmur2 hash of the key bytes is computed and then taken modulo the number of partitions for the topic.
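
To illustrate the keyed case, here is a minimal sketch with no Kafka dependency. It assumes a stand-in hash (`Arrays.hashCode`) in place of Murmur2, so the partition numbers it prints will not match what a real producer picks. The class and method names are made up for the example. It only shows the shape of the computation: hash the key bytes, clear the sign bit, take the remainder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class KeyedPartitionSketch {
    // Clears the sign bit so the value is never negative
    // (the same idea as Kafka's Utils.toPositive).
    static int toPositive(int n) {
        return n & 0x7fffffff;
    }

    // ASSUMPTION: Arrays.hashCode is only a stand-in for Murmur2.
    static int partitionForKey(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 15; // as in the question
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(key + " -> partition " + partitionForKey(key, numPartitions));
        }
    }
}
```

The point to take away is that the same key always lands on the same partition as long as the partition count does not change. The bit mask is used rather than `Math.abs` because `Math.abs(Integer.MIN_VALUE)` is still negative.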

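Answer 1: (score: 1)

If no custom partitioner is defined, the default partitioner is used, which follows these rules:

  1. If a partition is specified in the record, use that partition to publish.
  2. If no partition is specified but a key is present, choose a partition based on a hash of the key.
  3. If neither a partition nor a key is present, choose a partition in a round-robin fashion.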
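
The order of precedence in the three rules above can be sketched as a single method. This is an illustration only, not Kafka's code: `PartitionRulesSketch` and `choose` are hypothetical names, and `Arrays.hashCode` is again a stand-in for the real key hash.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class PartitionRulesSketch {
    // Counter used only for records with no partition and no key.
    private final AtomicInteger counter = new AtomicInteger(0);

    int choose(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (explicitPartition != null) {
            // Rule 1: an explicit partition always wins.
            return explicitPartition;
        }
        if (keyBytes != null) {
            // Rule 2: hash of the key (ASSUMPTION: stand-in hash, not Murmur2).
            return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
        }
        // Rule 3: no partition and no key, so rotate through the partitions.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionRulesSketch sketch = new PartitionRulesSketch();
        byte[] key = "user-42".getBytes();
        System.out.println("explicit: " + sketch.choose(7, key, 15));
        System.out.println("keyed:    " + sketch.choose(null, key, 15));
        System.out.println("no key:   " + sketch.choose(null, null, 15));
        System.out.println("no key:   " + sketch.choose(null, null, 15));
    }
}
```

The producer in the question passes only a topic and a value, so it falls through to rule 3.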

The default partitioner implementation below makes this clearer:

public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        int nextValue = nextValue(topic);
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            // no partitions are available, give a non-available partition
            return Utils.toPositive(nextValue) % numPartitions;
        }
    } else {
        // hash the keyBytes to choose a partition
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}
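
For the null-key branch, a small stand-alone simulation may help show what happens with the 15 partitions from the question. It is a sketch under stated assumptions: `RoundRobinSketch` and `nextPartition` are hypothetical names, the list of available partition ids is supplied by hand rather than read from cluster metadata, and the counter starts at 0 for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger(0);

    // availablePartitions: ids of partitions assumed to be reachable right now.
    int nextPartition(List<Integer> availablePartitions, int numPartitions) {
        int next = counter.getAndIncrement() & 0x7fffffff;
        if (!availablePartitions.isEmpty()) {
            // Rotate only over the partitions that are available.
            return availablePartitions.get(next % availablePartitions.size());
        }
        // Nothing is available: still return some partition id.
        return next % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 15;
        List<Integer> available = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            available.add(p);
        }
        RoundRobinSketch rr = new RoundRobinSketch();
        int[] counts = new int[numPartitions];
        for (int i = 0; i < 30; i++) {
            counts[rr.nextPartition(available, numPartitions)]++;
        }
        // 30 keyless records over 15 partitions: each partition receives 2.
        System.out.println(Arrays.toString(counts));
    }
}
```

With every partition available, keyless records spread evenly. If only some partitions are available, the rotation is restricted to those, which mirrors the `availablePartitions` check in the code above.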