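Note after the comments
I'm using a dockerized version of Kafka and of the Kafka Streams job. Both are spawned one after the other by Docker Compose.
Problem description:
For my thesis, I'm using Kafka Streams. Everything works, but it takes a while before the streams job actually starts processing. It took me some time to figure out why. Apparently it does not process anything until it has gone from RUNNING to REBALANCING and back to RUNNING. Does anyone know why this is, and what I can do to make it start processing immediately? Maybe I'm missing some configuration.
I'm using Confluent's Kafka REST API to submit test messages to the input topic.
The logs before the actual processing starts are as follows: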
INFO org.apache.kafka.clients.Metadata - Cluster ID: aKDudbTgTSq9gY-M6eHqyw
INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] Revoking previously assigned partitions []
INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1] State transition from RUNNING to PARTITIONS_REVOKED
INFO org.apache.kafka.streams.KafkaStreams - stream-client [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188] State transition from RUNNING to REBALANCING
INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1] partition revocation took 0 ms.
suspended active tasks: []
suspended standby tasks: []
INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] (Re-)joining group
INFO org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer] Assigned tasks to clients as {823ba6d0-5b02-4a92-ac64-592b6d3e4188=[activeTasks: ([1_0]) standbyTasks: ([]) assignedTasks: ([1_0]) prevActiveTasks: ([]) prevStandbyTasks: ([]) prevAssignedTasks: ([]) capacity: 1]}.
WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] The following subscribed topics are not assigned to any members: [product-brand-withkey]
INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] Successfully joined group with generation 2
INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] Setting newly assigned partitions [product-item-withkey-0]
INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1] State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED
INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1] partition assignment took 19 ms.
current active tasks: [1_0]
current standby tasks: []
previous active tasks: []
INFO org.apache.kafka.clients.consumer.internals.Fetcher - [Consumer clientId=aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1-consumer, groupId=aggregation-item-brand-prototype] Resetting offset for partition product-item-withkey-0 to offset 0.
INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING
INFO org.apache.kafka.streams.KafkaStreams - stream-client [aggregation-item-brand-prototype-823ba6d0-5b02-4a92-ac64-592b6d3e4188] State transition from REBALANCING to RUNNING
Answer 0 (score: 0)
I found that this problem has already been posted. Here is the question: Why does Kafka consumer takes long time to start consuming?
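I'm using a dockerized version of Kafka. Kafka and the Kafka Streams job are brought up together. Although the topics are pre-created by Kafka, they are still in the leader-election phase when the Kafka Streams job starts consuming. As a result, the consumer cannot fetch any metadata for the topics and cannot consume messages (this matches the "subscribed topics are not assigned to any members" warning in the log above). The consumer only really starts consuming after a metadata refresh, whose interval is controlled by the metadata.max.age.ms parameter.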
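Since the delay is tied to metadata.max.age.ms (default 300000 ms, i.e. 5 minutes), one alternative to waiting is to lower that interval in the Streams configuration. This is not what I ended up doing; it is only a sketch. The broker address kafka:9092 and the 5-second value are assumptions, and plain string keys are used so the snippet compiles without the Kafka jars on the classpath.

```java
import java.util.Properties;

// Sketch only: builds the Properties that would be handed to
// new KafkaStreams(topology, props). Keys are the documented config names.
class StreamsStartupConfig {

    static Properties build(String applicationId, String bootstrapServers, long metadataMaxAgeMs) {
        Properties props = new Properties();
        // In Kafka Streams the application.id is also used as the consumer group id.
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Force a metadata refresh more often than the 5-minute default,
        // so topics whose leaders were elected late are picked up sooner.
        props.setProperty("metadata.max.age.ms", Long.toString(metadataMaxAgeMs));
        return props;
    }

    public static void main(String[] args) {
        // "kafka:9092" is an assumed Docker Compose service name and port.
        Properties props = build("aggregation-item-brand-prototype", "kafka:9092", 5_000L);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

A shorter refresh interval means more metadata requests to the brokers, so a very small value is probably only reasonable in a test setup like this one.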
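I solved this by putting a 30-second sleep in the start shell script of the Kafka Streams job, so that it waits for the election to finish. Now it starts consuming immediately.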
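A fixed sleep works, but a polling loop with a timeout would avoid waiting longer than necessary. The helper below is a minimal sketch of that idea, not what I actually run. StartupGate and waitUntil are hypothetical names, and the readiness check is left as a placeholder. In a real job it could be a check via Kafka's AdminClient (describeTopics, then verifying that every partition reports a leader), which is an assumption on my side and not shown here.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Sketch only: polls a readiness check until it succeeds or a timeout expires.
class StartupGate {

    // Returns true as soon as ready.getAsBoolean() is true,
    // false if the timeout elapses first or the thread is interrupted.
    static boolean waitUntil(BooleanSupplier ready, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (ready.getAsBoolean()) {
                return true;
            }
            if (deadline - System.nanoTime() <= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder check: pretends the topics become ready on the third poll.
        int[] polls = {0};
        boolean ready = waitUntil(() -> ++polls[0] >= 3,
                Duration.ofSeconds(30), Duration.ofMillis(100));
        System.out.println(ready
                ? "topics ready, starting Kafka Streams"
                : "gave up waiting for leader election");
    }
}
```

The idea would be to call waitUntil before KafkaStreams.start(), keeping the 30 seconds as the upper bound instead of the fixed delay.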