Spring Cloud Stream Kafka Streams application shows "Resetting offset for partition event-x to offset 0" on every restart

Date: 2019-07-19 08:46:47

Tags: apache-kafka apache-kafka-streams spring-cloud-stream confluent confluent-cloud

I have a Spring Cloud Stream Kafka Streams application that reads from a topic (event) and performs some simple processing on it:

import org.apache.kafka.streams.kstream.KStream
import org.slf4j.LoggerFactory
import org.springframework.cloud.stream.annotation.Input
import org.springframework.cloud.stream.annotation.StreamListener
import org.springframework.context.annotation.Configuration

@Configuration
class EventKStreamConfiguration {

    private val logger = LoggerFactory.getLogger(javaClass)

    @StreamListener
    fun process(@Input("event") eventStream: KStream<String, EventReceived>) {

        eventStream.foreach { key, value ->
            logger.info("--------> Processing Event {}", value)
            // Save in DB
        }
    }
}

The application works against a Kafka environment in Confluent Cloud, where the event topic has 6 partitions. The full configuration is:

spring:
  application:
    name: events-processor
  cloud:
    stream:
      schema-registry-client:
        endpoint: ${schema-registry-url:http://localhost:8081}
      kafka:
        streams:
          binder:
            brokers: ${kafka-brokers:localhost}
            configuration:
              application:
                id: ${spring.application.name}
              default:
                key:
                  serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              schema:
                registry:
                  url: ${spring.cloud.stream.schema-registry-client.endpoint}
              value:
                subject:
                  name:
                    strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
              processing:
                guarantee: exactly_once
          bindings:
            event:
              consumer:
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
      bindings:
        event:
          destination: event

  data:
    mongodb:
      uri: ${mongodb-uri:mongodb://localhost/test}

server:
  port: 8085

logging:
  level:
    org.springframework.kafka.config: debug

---

spring:
  profiles: confluent-cloud
  cloud:
    stream:
      kafka:
        streams:
          binder:
            autoCreateTopics: false
            configuration:
              retry:
                backoff:
                  ms: 500
              security:
                protocol: SASL_SSL
              sasl:
                mechanism: PLAIN
                jaas:
                  config: xxx
              basic:
                auth:
                  credentials:
                    source: USER_INFO
              schema:
                registry:
                  basic:
                    auth:
                      user:
                        info: yyy

The KStream is processing the messages correctly, and if I restart the application they are not reprocessed. Note: I don't want them to be reprocessed, so this behavior is fine.

However, the startup logs show something strange:

  1. First, it shows the creation of a restore consumer client with auto.offset.reset = none:
2019-07-19 10:20:17.120  INFO 82473 --- [           main] o.a.k.s.p.internals.StreamThread         : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] Creating restore consumer client
2019-07-19 10:20:17.123  INFO 82473 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    auto.commit.interval.ms = 5000
    auto.offset.reset = none
  2. Then it creates a consumer client with auto.offset.reset = earliest:
2019-07-19 10:20:17.235  INFO 82473 --- [           main] o.a.k.s.p.internals.StreamThread         : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] Creating consumer client
2019-07-19 10:20:17.241  INFO 82473 --- [           main] o.a.k.clients.consumer.ConsumerConfig    : ConsumerConfig values: 
    auto.commit.interval.ms = 5000
    auto.offset.reset = earliest
  3. The final traces of the startup log show the offsets being reset to 0, and this happens on every restart of the application:
2019-07-19 10:20:31.577  INFO 82473 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread         : stream-thread [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING
2019-07-19 10:20:31.578  INFO 82473 --- [-StreamThread-1] org.apache.kafka.streams.KafkaStreams    : stream-client [events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f] State transition from REBALANCING to RUNNING
2019-07-19 10:20:31.669  INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher       : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-3 to offset 0.
2019-07-19 10:20:31.669  INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher       : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-0 to offset 0.
2019-07-19 10:20:31.669  INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher       : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-1 to offset 0.
2019-07-19 10:20:31.669  INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher       : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-5 to offset 0.
2019-07-19 10:20:31.670  INFO 82473 --- [events-processor] o.a.k.c.consumer.internals.Fetcher       : [Consumer clientId=events-processor-9a8069c4-3fb6-4d76-a207-efbbadd52b8f-StreamThread-1-consumer, groupId=events-processor] Resetting offset for partition event-4 to offset 0.
  1. Why are there two consumers configured?

  2. Why does the second one have auto.offset.reset = earliest when I haven't configured it explicitly and the Kafka default is latest?

  3. I want the default behavior (auto.offset.reset = latest), and it seems to work fine. But doesn't that contradict what I see in the logs?

Update:

I would rephrase the third question like this: why do the logs show the partitions being reset to offset 0 on every restart, and yet no messages are redelivered to the KStream?

1 Answer:

Answer 0 (score: 1)

  1. Why are there two consumers configured?

The restore consumer client is a dedicated consumer for fault tolerance and state management. It is responsible for restoring state from the changelog topics, and it is shown separately from the application consumer client. You can find more information here: https://docs.confluent.io/current/streams/monitoring.html#kafka-restore-consumer-client-id
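Since the restore consumer is a separate client, Kafka Streams also lets you target each client individually via configuration prefixes (main.consumer., restore.consumer.). A minimal sketch of how that could look in this binder's configuration map, assuming the binder forwards the entries to StreamsConfig unchanged; the receive.buffer.bytes values are purely illustrative placeholders:

spring:
  cloud:
    stream:
      kafka:
        streams:
          binder:
            configuration:
              main:
                consumer:
                  receive:
                    buffer:
                      bytes: 131072 # applies only to the main consumer client
              restore:
                consumer:
                  receive:
                    buffer:
                      bytes: 65536 # applies only to the restore consumer client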

  
      
  2. Why does the second one have auto.offset.reset = earliest when I haven't configured it explicitly and the Kafka default is latest?

You are right that the default value of auto.offset.reset in the Kafka consumer is latest. But in Spring Cloud Stream the default value of the consumer startOffset is earliest, which is why the second consumer shows earliest. It also depends on the spring.cloud.stream.bindings.<channelName>.group binding: if the group is set explicitly, startOffset defaults to earliest; otherwise, for an anonymous consumer group, it is set to latest.

Reference: Spring Cloud Stream Kafka Consumer Properties
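To make that rule concrete, here is a minimal sketch for a message-channel Kafka binder consumer, using the property names from that reference; the group name and the override value are placeholders:

spring:
  cloud:
    stream:
      bindings:
        event:
          destination: event
          group: events-processor # named group, so startOffset defaults to earliest
      kafka:
        bindings:
          event:
            consumer:
              startOffset: latest # explicit override of that default

With the Kafka Streams binder used in the question, the analogous per-binding property would be spring.cloud.stream.kafka.streams.bindings.event.consumer.startOffset, though that is worth verifying against the binder version in use.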

  
      
  3. I want the default behavior (auto.offset.reset = latest), and it seems to work fine. But doesn't that contradict what I see in the logs?

For anonymous consumer groups, the default value of startOffset is latest.
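For contrast with the sketch above, a minimal sketch of the anonymous case: simply omitting the group gives the binder an anonymous consumer group, and the startOffset default flips to latest:

spring:
  cloud:
    stream:
      bindings:
        event:
          destination: event
          # no 'group' set here -> anonymous consumer group, startOffset defaults to latest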