Question

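I am currently using Spring Integration Kafka to compute real-time statistics. However, because of the group name, Kafka goes back and delivers every earlier value that the listener has not yet read.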

@Value("${kafka.consumer.group.id}")
private String consumerGroupId;

@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<>(getDefaultProperties());
}

public Map<String, Object> getDefaultProperties() {
    Map<String, Object> properties = new HashMap<>();
    properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);

    properties.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroupId);

    properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
    return properties;
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {

    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}

@Bean
public KafkaMessageListener listener() {
    return new KafkaMessageListener();
}

I would like to start from the latest offset and not be bothered by old values. Is it possible to reset the group's offset?

Answer 1

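Since I did not see any example of this, I will explain how I did it here.

The class containing your @KafkaListener must implement the ConsumerSeekAware interface, which lets the listener control offset seeking when partitions are assigned. (Source: https://docs.spring.io/spring-kafka/reference/htmlsingle/#seek)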

public class KafkaMessageListener implements ConsumerSeekAware {
    @KafkaListener(topics = "your.topic")
    public void listen(byte[] payload) {
        // ...
    }

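    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        // Not needed here: seeking happens only when partitions are assigned.
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // Seek every assigned partition to its end so only new records are consumed.
        assignments.forEach((t, o) -> callback.seekToEnd(t.topic(), t.partition()));
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // Nothing to do when the container is idle.
    }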
}

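Here, on a rebalance, we use the given callback to seek to the last offset of every assigned topic partition. Thanks to Artem Bilan (https://stackoverflow.com/users/2756547/artem-bilan) for pointing me toward the answer.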

Answer 2

Well, it sounds like you need to look at the consumer's auto.offset.reset. What confuses me, though, is that it defaults to latest anyway:

auto.offset.reset
Description: What to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted):

earliest: automatically reset the offset to the earliest offset
latest: automatically reset the offset to the latest offset
none: throw exception to the consumer if no previous offset is found for the consumer's group
anything else: throw exception to the consumer.

Type: string
Default: latest
Valid values: [latest, earliest, none]
Importance: medium
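
The description above implies that auto.offset.reset is only consulted when the group has no committed offset for a partition, or when that offset no longer exists on the server. That would explain the behaviour in the question: an existing group resumes from its committed offset, so the latest default never comes into play. Below is a plain-Java model of that rule, not the Kafka client's actual code; the names OffsetResetModel and resolveStartOffset are made up for illustration.

```java
import java.util.OptionalLong;

// Illustrative model only: it mirrors the documented auto.offset.reset rule
// and is not how the Kafka client is implemented.
final class OffsetResetModel {

    /**
     * @param committed offset committed by the group, empty if there is none
     * @param policy    value of auto.offset.reset
     * @param earliest  first offset still available in the partition
     * @param latest    end offset of the partition (where the next record will go)
     */
    static long resolveStartOffset(OptionalLong committed, String policy,
                                   long earliest, long latest) {
        // A committed offset that still exists on the server always wins.
        if (committed.isPresent()
                && committed.getAsLong() >= earliest
                && committed.getAsLong() <= latest) {
            return committed.getAsLong();
        }
        // Only now does the reset policy matter.
        switch (policy) {
            case "earliest":
                return earliest;
            case "latest":
                return latest;
            default:
                // "none" and any other value both surface as an error.
                throw new IllegalStateException(
                        "no usable offset and auto.offset.reset=" + policy);
        }
    }
}
```

For a partition spanning offsets 100 to 500, resolveStartOffset(OptionalLong.of(120), "latest", 100, 500) returns 120, whereas a group with no committed offset (OptionalLong.empty()) gets 500.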

Answer 3

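You can set a ConsumerRebalanceListener on the Kafka consumer when you subscribe to topics. In it, you can get the latest offset of each partition with KafkaConsumer.endOffsets() and position the consumer there with KafkaConsumer.seek(), like this: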

kafkaConsumer.subscribe(Collections.singletonList(topics),
    new ConsumerRebalanceListener() {
        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
            //do nothing
        }

        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            // get the latest offset of each partition and seek to it
            kafkaConsumer.endOffsets(partitions)
                .forEach((partition, offset) -> kafkaConsumer.seek(partition, offset));
        }
    }
);

Answer 4

You can use the partitionOffsets attribute of the annotation to start from an exact offset, for example:

@KafkaListener(id = "bar", topicPartitions = {
    @TopicPartition(topic = "topic1", partitions = { "0", "1" }),
    @TopicPartition(topic = "topic2", partitions = "0",
        partitionOffsets = @PartitionOffset(partition = "1", initialOffset = "100"))
})
public void listen(ConsumerRecord<?, ?> record) {
    // ...
}

Answer 5

For a new consumer group that has no initial offset in Kafka, you can set AUTO_OFFSET_RESET_CONFIG:

properties.put(ConsumerConfig.GROUP_ID_CONFIG, "consumer-group-id");
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");

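For an existing consumer group, you can:

Change the group ID so that it appears as a new group, e.g. consumer-group-id-v2
Implement ConsumerSeekAware so that you can seek to the desired offset during initialization (see the docs)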
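
The first option can be sketched with plain java.util.Properties and the string config keys (group.id, auto.offset.reset) instead of the ConsumerConfig constants, so the snippet compiles without the Kafka client on the classpath. The class name FreshGroupConfig, the helper method, and the "-v" suffix scheme are assumptions for illustration. Because the renamed group has no committed offsets, auto.offset.reset=latest takes effect for it.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer properties for a versioned group ID.
final class FreshGroupConfig {

    static Properties consumerProperties(String bootstrapServers,
                                         String baseGroupId, int version) {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", bootstrapServers);
        // A group ID Kafka has not seen before has no committed offsets...
        properties.put("group.id", baseGroupId + "-v" + version);
        // ...so this policy decides where the new group starts reading.
        properties.put("auto.offset.reset", "latest");
        // Same deserializers as in the question, given as class names.
        properties.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return properties;
    }
}
```

Calling consumerProperties("localhost:9092", "consumer-group-id", 2) yields a group.id of consumer-group-id-v2. The old group's committed offsets are left untouched, so switching back to the old ID would resume from them.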

Spring Kafka - How to reset the offset to latest with a group ID?
