Question

I am trying to understand some details about Kafka Streams (the Kafka Streams client for Kafka).

I know that KafkaConsumer (the Java client) fetches data from Kafka, but I cannot work out how often the client polls the Kafka topic for data.

Answer 1

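The polling frequency is defined by your code, because you are responsible for calling poll. A very naive example of user code that uses KafkaConsumer looks like the following: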

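import java.time.Duration;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecords;

public class KafkaConsumerExample {
  ...

    static void runConsumer() throws InterruptedException {
        final Consumer<Long, String> consumer = createConsumer();

        // Stop once more than giveUp polls in total have returned no records.
        final int giveUp = 100;
        int noRecordsCount = 0;

        while (true) {
            // Wait up to 1 second for records; poll(long) is deprecated in favor of poll(Duration).
            final ConsumerRecords<Long, String> consumerRecords =
                    consumer.poll(Duration.ofMillis(1000));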

            if (consumerRecords.count()==0) {
                noRecordsCount++;
                if (noRecordsCount > giveUp) break;
                else continue;
            }

            consumerRecords.forEach(record -> {
                System.out.printf("Consumer Record:(%d, %s, %d, %d)\n",
                        record.key(), record.value(),
                        record.partition(), record.offset());
            });

            consumer.commitAsync();
        }
        consumer.close();
        System.out.println("DONE");
    }
}

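In this case, the frequency is determined by how long it takes to process the messages in consumerRecords.forEach (plus up to 1 second of waiting inside poll when no records are available).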

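However, keep in mind that if you do not call poll "fast enough", the group coordinator on the broker will consider your consumer dead and a rebalance will be triggered. What counts as "fast enough" is determined by the property max.poll.interval.ms in Kafka >= 0.10.1.0. See this answer for more details.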

The default value of max.poll.interval.ms is 5 minutes, so if your consumerRecords.forEach takes longer than that, your consumer will be considered dead.
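
As a rough illustration of the 5-minute default, the sketch below estimates the worst-case time between two poll calls and builds consumer properties with an explicit max.poll.interval.ms. It uses only java.util.Properties with string keys. The broker address, the group id, the class and method names, and the 1-second per-record processing time are assumptions made for the example. max.poll.records (default 500) is a further consumer setting, not mentioned above, that caps how many records a single poll returns.

```java
import java.util.Properties;

class PollBudget {
    // Worst-case processing time for one batch: every record takes perRecordMillis.
    static long worstCaseBatchMillis(int maxPollRecords, long perRecordMillis) {
        return maxPollRecords * perRecordMillis;
    }

    // Builds consumer properties with an explicit poll deadline and batch size.
    static Properties consumerProps(int maxPollIntervalMs, int maxPollRecords) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", "example-group");           // hypothetical group id
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.LongDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Longest allowed gap between two poll() calls before a rebalance is triggered.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        // Upper bound on the number of records returned by one poll().
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        // 500 records x 1000 ms = 500000 ms, which is above the 300000 ms default.
        long worstCase = worstCaseBatchMillis(500, 1000L);
        System.out.println("Exceeds default interval: " + (worstCase > 300_000L));

        // One possible response: smaller batches and a longer deadline.
        Properties props = consumerProps(600_000, 100);
        System.out.println(props.getProperty("max.poll.interval.ms"));
        System.out.println(props.getProperty("max.poll.records"));
    }
}
```

The resulting Properties object could then be passed to the KafkaConsumer constructor. Whether to raise the interval or shrink the batch depends on how quickly a stalled consumer needs to be detected.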

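If you do not want to use the raw KafkaConsumer directly, you can use Alpakka Kafka, a library for consuming from and producing to Kafka topics in a safe and backpressured way (it is based on Akka Streams).
With this library, the polling frequency is determined by the configuration akka.kafka.consumer.poll-interval.
We say it is safe because it keeps polling even when your processing cannot keep up, so the consumer is not considered dead. It can do this because KafkaConsumer allows pausing the consumer: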

    /**
     * Suspend fetching from the requested partitions. Future calls to {@link #poll(Duration)} will not return
     * any records from these partitions until they have been resumed using {@link #resume(Collection)}.
     * Note that this method does not affect partition subscription. In particular, it does not cause a group
     * rebalance when automatic assignment is used.
     * @param partitions The partitions which should be paused
     * @throws IllegalStateException if any of the provided partitions are not currently assigned to this consumer
     */
    @Override
    public void pause(Collection<TopicPartition> partitions) { ... }

To fully understand this, you should read about Akka Streams and backpressure.
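
The pause-and-keep-polling idea can be sketched without Akka or a broker. The example below is only a simulation under stated assumptions: Poller and FakePoller are hypothetical stand-ins defined here for the poll/pause/resume subset of KafkaConsumer, and the batch size of 2 and the one-record-per-tick worker are made up. It is not how Alpakka Kafka is implemented. With a real consumer, the corresponding calls would be consumer.pause(consumer.assignment()) and consumer.resume(consumer.paused()).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class PauseWhileBusySketch {
    /** Hypothetical stand-in for the poll/pause/resume subset of KafkaConsumer. */
    interface Poller {
        List<String> poll();   // returns no records while paused
        void pause();
        void resume();
    }

    /** In-memory fake so the sketch runs without a broker. */
    static class FakePoller implements Poller {
        private final Queue<String> source = new ArrayDeque<>();
        private boolean paused = false;
        int pollCalls = 0;

        FakePoller(List<String> records) {
            source.addAll(records);
        }

        public List<String> poll() {
            pollCalls++;
            List<String> batch = new ArrayList<>();
            // Hand out at most 2 records per poll, and none while paused.
            while (!paused && batch.size() < 2 && !source.isEmpty()) {
                batch.add(source.remove());
            }
            return batch;
        }

        public void pause() { paused = true; }

        public void resume() { paused = false; }
    }

    /** Polls on every tick; fetching is paused while the local buffer is not empty. */
    static List<String> run(Poller poller, int ticks) {
        Queue<String> buffer = new ArrayDeque<>();
        List<String> processed = new ArrayList<>();
        for (int tick = 0; tick < ticks; tick++) {
            // Always poll, even when busy: this is what keeps the consumer in the group.
            buffer.addAll(poller.poll());
            // Slow worker: handles at most one record per tick.
            if (!buffer.isEmpty()) {
                processed.add(buffer.remove());
            }
            // Backpressure: only ask for more records once the buffer is drained.
            if (buffer.isEmpty()) {
                poller.resume();
            } else {
                poller.pause();
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        FakePoller poller = new FakePoller(List.of("a", "b", "c", "d", "e"));
        List<String> processed = run(poller, 6);
        System.out.println(processed + " after " + poller.pollCalls + " polls");
    }
}
```

In this single-threaded simulation the processing happens inside the loop. In a real application the slow work would normally run on another thread, so that the polling loop itself never blocks for longer than max.poll.interval.ms.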
