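Kafka Streams frequently recreates state stores

Time: 2019-03-01 09:42:49

Tags: apache-kafka-streams

In my Streams application I use interactive queries and state stores so that I can scale out and consume data from the topic faster. However, I frequently see warnings like this in the log: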

anomaly-timeline-3                    | 2019-03-01 08:43:58,177 INFO anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2 org.apache.kafka.streams.processor.internals.StreamThread stream-thread [anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2] Reinitializing StandbyTask TaskId: 1_0
anomaly-timeline-3                    |         ProcessorTopology:
anomaly-timeline-3                    |                 KSTREAM-SOURCE-0000000012:
anomaly-timeline-3                    |                         topics:         [anomaly-timeline-two-minutes-error-score-repartition]
anomaly-timeline-3                    |                         children:       [KSTREAM-REDUCE-0000000009]
anomaly-timeline-3                    |                 KSTREAM-REDUCE-0000000009:
anomaly-timeline-3                    |                         states:         [two-minutes-error-score]
anomaly-timeline-3                    | Partitions [anomaly-timeline-two-minutes-error-score-repartition-0]
anomaly-timeline-3                    |  from changelogs [anomaly-timeline-two-minutes-error-score-changelog-0]
anomaly-timeline-3                    | 2019-03-01 08:43:58,474 INFO anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2 org.apache.kafka.clients.consumer.internals.Fetcher [Consumer clientId=anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2-restore-consumer, groupId=] Resetting offset for partition anomaly-timeline-two-minutes-error-score-changelog-0 to offset 14787709.
anomaly-timeline-3                    | 2019-03-01 08:48:57,991 WARN anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2 org.apache.kafka.streams.processor.internals.StreamThread stream-thread [anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2] Updating StandbyTasks failed. Deleting StandbyTasks stores to recreate from scratch. org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {anomaly-timeline-one-hour-error-score-changelog-0=14818811}
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:1002)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:508)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1259)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1187)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1154)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.maybeUpdateStandbyTasks(StreamThread.java:1099)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:846)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
anomaly-timeline-3                    |
anomaly-timeline-3                    | org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {anomaly-timeline-one-hour-error-score-changelog-0=14818811}
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:1002)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:508)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1259)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1187)
anomaly-timeline-3                    |         at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1154)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.maybeUpdateStandbyTasks(StreamThread.java:1099)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:846)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
anomaly-timeline-3                    |         at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
anomaly-timeline-3                    | 2019-03-01 08:48:57,995 INFO anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2 org.apache.kafka.streams.processor.internals.StreamThread stream-thread [anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2] Reinitializing StandbyTask TaskId: 3_0
anomaly-timeline-3                    |         ProcessorTopology:
anomaly-timeline-3                    |                 KSTREAM-SOURCE-0000000022:
anomaly-timeline-3                    |                         topics:         [anomaly-timeline-one-hour-error-score-repartition]
anomaly-timeline-3                    |                         children:       [KSTREAM-REDUCE-0000000019]
anomaly-timeline-3                    |                 KSTREAM-REDUCE-0000000019:
anomaly-timeline-3                    |                         states:         [one-hour-error-score]
anomaly-timeline-3                    | Partitions [anomaly-timeline-one-hour-error-score-repartition-0]
anomaly-timeline-3                    |  from changelogs [anomaly-timeline-one-hour-error-score-changelog-0]
anomaly-timeline-3                    | 2019-03-01 08:48:58,303 INFO anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2 org.apache.kafka.clients.consumer.internals.Fetcher [Consumer clientId=anomaly-timeline-a3b6b7d6-3bd8-40a6-b070-874964bed3ee-StreamThread-2-restore-consumer, groupId=] Resetting offset for partition anomaly-timeline-one-hour-error-score-changelog-0 to offset 14818854.

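So, for some reason, Kafka seems to be reinitializing the standby task, and then updating it fails. If I understand the log correctly, this can cause the store to be recreated from scratch.

So my questions are:

  • Even though these are only warnings, Kafka does not seem to be working as it should. Is this assumption correct?
  • Why did this StandbyTask fail?
  • Is my actual changelog state store being deleted?
  • How should I configure a reset policy for this stream thread?
  • Why is the offset being reset for this changelog?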

1 answer:

Answer 0: (score: 0)

  

Even though these are only warnings, Kafka does not seem to be working as it should. Is this assumption correct?

Yes.

  

Why did this StandbyTask fail?

It seems the StandbyTask tried to fetch from an invalid offset. It is not really a failure, though.

  

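Is my actual changelog state store being deleted?

In this case only the local store is deleted; the changelog topic is not affected. The local store is deleted because it is out of sync with the changelog topic, which allows the store to be recreated from scratch.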

  

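How should I configure a reset policy for this stream thread?

You cannot configure a reset policy for the restore consumer. When the situation above occurs, Kafka Streams deletes the local store and calls seekToBeginning() on the changelog topic to recreate the store from scratch.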

  

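Why is the offset being reset for this changelog?

Maybe the StandbyTask is falling behind?

You can try enabling TRACE logging for org.apache.kafka.streams.processor.internals.ProcessorStateManager. The offsets of a StandbyTask are tracked in a local checkpoint file that is written on commit, and the same offsets are logged at that point: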

log.trace("Writing checkpoint: {}", this.checkpointableOffsets);
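As a complement to the TRACE output, the checkpoint file itself can be read directly. The following is only a rough sketch and rests on assumptions that are not stated in this thread: that the file is named .checkpoint inside the task directory under state.dir, and that it consists of a version line, an entry-count line, and then one "topic partition offset" line per changelog partition. Check this against the Kafka Streams version you run. CheckpointReader, parse and lag are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka Streams.
class CheckpointReader {

    // Parses the assumed layout: line 1 = version, line 2 = number of entries,
    // remaining lines = "<topic> <partition> <offset>".
    static Map<String, Long> parse(List<String> lines) {
        if (lines.size() < 2) {
            throw new IllegalArgumentException("Expected a version line and an entry-count line");
        }
        Integer.parseInt(lines.get(0).trim()); // version; only checked to be numeric
        int expected = Integer.parseInt(lines.get(1).trim());

        Map<String, Long> offsets = new LinkedHashMap<>();
        for (String line : lines.subList(2, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // tolerate a trailing blank line
            }
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed entry: " + line);
            }
            String topicPartition = parts[0] + "-" + Integer.parseInt(parts[1]);
            offsets.put(topicPartition, Long.parseLong(parts[2]));
        }
        if (offsets.size() != expected) {
            throw new IllegalArgumentException(
                "Expected " + expected + " entries but found " + offsets.size());
        }
        return offsets;
    }

    // Distance between the checkpointed offset and the changelog end offset.
    // The end offset has to be obtained separately, for example from broker tooling.
    static long lag(long checkpointedOffset, long changelogEndOffset) {
        return Math.max(0L, changelogEndOffset - checkpointedOffset);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CheckpointReader <path to .checkpoint>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        parse(lines).forEach((tp, offset) -> System.out.println(tp + " -> " + offset));
    }
}
```

Running this periodically against the standby's task directory and comparing the printed offsets with the changelog's current offsets gives a rough picture of how far the standby trails.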

The TRACE output should help you figure out whether the StandbyTask is falling behind. If it is, you may need more threads or more instances to avoid this.
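A minimal sketch of the "more threads" suggestion, assuming the application is configured through a plain java.util.Properties object. The keys num.stream.threads and num.standby.replicas are standard Kafka Streams settings. The application id is inferred from the client id prefix in the log, and the bootstrap address and the concrete values are placeholders, not recommendations. StandbyTuning is a hypothetical class name.

```java
import java.util.Properties;

// Hypothetical helper that only builds the configuration; it does not start Kafka Streams.
class StandbyTuning {

    static Properties streamsProperties(int streamThreads, int standbyReplicas) {
        Properties props = new Properties();
        // Inferred from the "anomaly-timeline-..." client id in the log; adjust as needed.
        props.setProperty("application.id", "anomaly-timeline");
        // Placeholder broker address (assumption).
        props.setProperty("bootstrap.servers", "localhost:9092");
        // More stream threads per instance, so tasks (including standby tasks) are spread out.
        props.setProperty("num.stream.threads", Integer.toString(streamThreads));
        // Number of standby copies kept for each state store.
        props.setProperty("num.standby.replicas", Integer.toString(standbyReplicas));
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting settings for inspection.
        streamsProperties(4, 1).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object would be passed to the KafkaStreams constructor in the usual way. Adding instances instead of threads needs no configuration change beyond starting another process with the same application.id.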