How does an offset expire for an Apache Kafka consumer group?

Date: 2016-08-24 19:21:22

Tags: apache-kafka

I was running some tests on an old topic when I noticed some strange behaviour. Reading Kafka's log, I noticed this "Removed 8 expired offsets" message:

    [GroupCoordinator 1001]: Stabilized group GROUP_NAME generation 37 (kafka.coordinator.GroupCoordinator)
    [GroupCoordinator 1001]: Assignment received from leader for group GROUP_NAME for generation 37 (kafka.coordinator.GroupCoordinator)
    Deleting segment 0 from log __consumer_offsets-31. (kafka.log.Log)
    Deleting segment 0 from log __consumer_offsets-45. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-45/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-31/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-13. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-13/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-11. (kafka.log.Log)
    Deleting segment 4885 from log __consumer_offsets-11. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-11/00000000000000004885.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-11/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-26. (kafka.log.Log)
    Deleting segment 12406 from log __consumer_offsets-26. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-26/00000000000000012406.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-26/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-22. (kafka.log.Log)
    Deleting segment 8643 from log __consumer_offsets-22. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-22/00000000000000008643.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-22/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-6. (kafka.log.Log)
    Deleting segment 9757 from log __consumer_offsets-6. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-6/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-6/00000000000000009757.index.deleted (kafka.log.OffsetIndex)
    Deleting segment 0 from log __consumer_offsets-14. (kafka.log.Log)
    Deleting segment 1 from log __consumer_offsets-14. (kafka.log.Log)
    Deleting index /data/kafka-logs/__consumer_offsets-14/00000000000000000001.index.deleted (kafka.log.OffsetIndex)
    Deleting index /data/kafka-logs/__consumer_offsets-14/00000000000000000000.index.deleted (kafka.log.OffsetIndex)
    [GroupCoordinator 1001]: Preparing to restabilize group GROUP_NAME with old generation 37 (kafka.coordinator.GroupCoordinator)
    [GroupCoordinator 1001]: Stabilized group GROUP_NAME generation 38 (kafka.coordinator.GroupCoordinator)
    [GroupCoordinator 1001]: Assignment received from leader for group GROUP_NAME for generation 38 (kafka.coordinator.GroupCoordinator)
    [Group Metadata Manager on Broker 1001]: Removed 8 expired offsets in 1 milliseconds. (kafka.coordinator.GroupMetadataManager)

In fact, I have two questions:

  1. How does this offset expiration for a consumer group work?

  2. Can these expired offsets explain why my consumer would not poll anything when it had [...], but polled from the last committed offset when it had auto.offset.reset = latest?

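2 answers:

Answer 0: (score: 38)

By default, Kafka deletes committed offsets after a configurable period of time; see the offsets.retention.minutes parameter. That is, if a consumer group is inactive (does not commit any offsets) for this period, its offsets are deleted. Consequently, even if a consumer is running, the offsets of any partitions for which it does not commit are still subject to offsets.retention.minutes.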
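As a rough illustration of the retention rule described above, the following Python sketch marks a partition's committed offset as expired once its last commit is older than the retention window. It is a simplified model, not broker code: the function name expired_partitions, the commit timestamps, and the 1440-minute value are all assumptions made for the example, and the real value should be read from your broker's offsets.retention.minutes setting.

```python
from datetime import datetime, timedelta


def expired_partitions(last_commit, now, retention_minutes):
    """Return partitions whose last offset commit is older than the retention window.

    Hypothetical helper: it models the rule described in the answer, nothing more.
    """
    cutoff = now - timedelta(minutes=retention_minutes)
    return sorted(p for p, committed_at in last_commit.items() if committed_at < cutoff)


# Made-up commit times for three partitions of one consumer group.
now = datetime(2016, 8, 24, 19, 0)
last_commit = {
    0: now - timedelta(minutes=5),  # committed recently
    1: now - timedelta(hours=30),   # no commit for 30 hours
    2: now - timedelta(hours=23),   # still inside a 24-hour window
}

# 1440 minutes (24 hours) is only an example retention value.
print(expired_partitions(last_commit, now, retention_minutes=1440))  # prints [1]
```

In this model, partition 1 loses its committed offset even though the group is still committing for partition 0, which matches the last sentence of the paragraph above.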

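If you start a consumer, the following happens:

  1. Look for a (valid) committed offset (for the consumer group).
    1. If a valid offset is found, resume from there.
    2. If no valid offset is found, reset the offset according to the auto.offset.reset parameter.

Thus, if your offsets were deleted and auto.offset.reset = latest, your consumer will not poll anything until new data is added to the topic. If auto.offset.reset = earliest, it should consume the whole topic.

For a discussion of this, see these JIRA issues: https://issues.apache.org/jira/browse/KAFKA-3806 and https://issues.apache.org/jira/browse/KAFKA-4682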
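The start-up decision described in this answer can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Kafka client's implementation: the function name starting_position is hypothetical, and treating an offset as "valid" when it lies between the log start and log end offsets is an assumption made for the example.

```python
def starting_position(committed, log_start, log_end, auto_offset_reset):
    """Pick the offset a consumer starts reading from for one partition.

    Hypothetical helper modelling the steps in the answer.
    """
    # Assumption: a committed offset is valid if it exists and still points into the log.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # No valid committed offset (for example, it expired): fall back to the reset policy.
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    raise LookupError("no valid committed offset and no usable reset policy")


print(starting_position(120, 0, 500, "latest"))     # prints 120
print(starting_position(None, 0, 500, "latest"))    # prints 500
print(starting_position(None, 0, 500, "earliest"))  # prints 0
```

In the second call the consumer starts at the end of the log, so it has nothing to poll until new records arrive, which is the behaviour the answer describes for deleted offsets combined with auto.offset.reset = latest.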

Answer 1: (score: 0)

Check my answer here. Do not forget about file rolling: it affects when the offset files are deleted.