How do I prevent my Kafka Streams application from entering the ERROR state?

Date: 2019-06-12 14:07:12

Tags: apache-kafka apache-kafka-streams

I've noticed that after my Kafka Streams application is unable to communicate with Kafka for some period of time, it transitions to the ERROR state. I'd like a way to make Kafka Streams essentially "retry forever" instead of going into ERROR. The only workaround I've found is restarting the Kafka Streams application, which is not ideal.
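The restart workaround amounts to watching for the terminal state and rebuilding the streams instance, since (as of these Kafka versions) an instance that reaches ERROR cannot be resumed, only closed and replaced. A minimal sketch of that supervision pattern is below; `FakeStreams` is a hypothetical stand-in for `org.apache.kafka.streams.KafkaStreams` so the example runs without a broker, and with the real client the listener would be registered via `streams.setStateListener(...)` (whose callback takes both the new and old state) before `streams.start()`:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Supervision sketch: replace the streams instance whenever it reaches ERROR.
public class StreamsSupervisor {
    enum State { CREATED, RUNNING, PENDING_SHUTDOWN, DEAD, ERROR }

    // Hypothetical stand-in for org.apache.kafka.streams.KafkaStreams.
    static class FakeStreams {
        private Consumer<State> listener = s -> {};
        void setStateListener(Consumer<State> l) { this.listener = l; }
        void transitionTo(State s) { listener.accept(s); } // driven internally by the real client
        void close() { /* real client: streams.close() */ }
    }

    private final AtomicReference<FakeStreams> current = new AtomicReference<>();
    int restarts = 0;

    void start() {
        FakeStreams streams = new FakeStreams();
        streams.setStateListener(newState -> {
            if (newState == State.ERROR) { // all stream threads died: instance must be replaced
                streams.close();
                restarts++;
                start();                   // build and start a fresh instance
            }
        });
        current.set(streams);
        // real client: streams.start();
    }

    FakeStreams current() { return current.get(); }
}
```

This keeps the application alive, but it replays the whole rebalance/restore cycle on every restart, which is exactly the cost I'd like to avoid.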

I set request.timeout.ms=2147483647 in my Kafka Streams configuration, and it helped: the application used to enter the ERROR state after about a minute, and now it happens much less often, but it still happens eventually.

Here is my Kafka Streams configuration:

 commit.interval.ms: 10000
 cache.max.bytes.buffering: 0
 retries: 2147483647
 request.timeout.ms: 2147483647
 retry.backoff.ms: 5000
 num.stream.threads: 1
 state.dir: /tmp/kafka-streams
 producer.batch.size: 102400
 producer.max.request.size: 31457280
 producer.buffer.memory: 314572800
 producer.max.in.flight.requests.per.connection: 10
 producer.linger.ms: 0
 consumer.max.partition.fetch.bytes: 31457280
 consumer.receive.buffer.bytes: 655360
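For reference, this is roughly how the configuration above is assembled in code. The key strings are written out literally rather than via the `StreamsConfig` constants so the sketch stands alone; the `producer.` and `consumer.` prefixes pass a setting through to the embedded producer or consumer client:

```java
import java.util.Properties;

// Builds the Properties object handed to the KafkaStreams constructor.
public class StreamsProps {
    public static Properties build() {
        Properties props = new Properties();
        props.put("commit.interval.ms", "10000");
        props.put("cache.max.bytes.buffering", "0");
        props.put("retries", "2147483647");
        props.put("request.timeout.ms", "2147483647");
        props.put("retry.backoff.ms", "5000");
        props.put("num.stream.threads", "1");
        props.put("state.dir", "/tmp/kafka-streams");
        // "producer."-prefixed keys are forwarded to the internal producer
        props.put("producer.batch.size", "102400");
        props.put("producer.max.request.size", "31457280");
        props.put("producer.buffer.memory", "314572800");
        props.put("producer.max.in.flight.requests.per.connection", "10");
        props.put("producer.linger.ms", "0");
        // "consumer."-prefixed keys are forwarded to the internal consumer
        props.put("consumer.max.partition.fetch.bytes", "31457280");
        props.put("consumer.receive.buffer.bytes", "655360");
        return props;
    }
}
```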

Here is the relevant portion of the Kafka Streams log:

[2019-06-07T22:18:07,223Z {StreamThread-1} WARN  org.apache.kafka.clients.NetworkClient] [Consumer clientId=StreamThread-1-consumer, groupId=app-stream] 20 partitions have leader brokers without a matching listener, including [app-stream-tmp-store-changelog-5, app-stream-tmp-store-changelog-13, app-stream-tmp-store-changelog-9, app-stream-tmp-store-changelog-1, __consumer_offsets-10, __consumer_offsets-30, __consumer_offsets-18, __consumer_offsets-22, __consumer_offsets-34, __consumer_offsets-6]
[2019-06-07T22:18:08,662Z {StreamThread-1} ERROR org.apache.kafka.streams.processor.internals.AssignedStreamsTasks] stream-thread [StreamThread-1] Failed to commit stream task 0_14 due to the following error:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {global-14=OffsetAndMetadata{offset=33038702, leaderEpoch=null, metadata=''}}
[2019-06-07T22:18:08,662Z {StreamThread-1} ERROR org.apache.kafka.streams.processor.internals.StreamThread] stream-thread [StreamThread-1] Encountered the following unexpected Kafka exception during processing, this usually indicate Streams internal errors:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {global-2=OffsetAndMetadata{offset=25537237, leaderEpoch=null, metadata=''}}
[2019-06-07T22:18:08,662Z {StreamThread-1} INFO  org.apache.kafka.streams.processor.internals.StreamThread] stream-thread [StreamThread-1] State transition from RUNNING to PENDING_SHUTDOWN
[2019-06-07T22:18:08,662Z {StreamThread-1} INFO  org.apache.kafka.streams.processor.internals.StreamThread] stream-thread [StreamThread-1] Shutting down
[2019-06-07T22:18:08,704Z {StreamThread-1} INFO  org.apache.kafka.clients.consumer.KafkaConsumer] [Consumer clientId=StreamThread-1-restore-consumer, groupId=null] Unsubscribed all topics or patterns and assigned partitions
[2019-06-07T22:18:08,704Z {StreamThread-1} INFO  org.apache.kafka.clients.producer.KafkaProducer] [Producer clientId=StreamThread-1-producer] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
[2019-06-07T22:18:08,728Z {StreamThread-1} INFO  org.apache.kafka.streams.processor.internals.StreamThread] stream-thread [StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
[2019-06-07T22:18:08,728Z {StreamThread-1} INFO  org.apache.kafka.streams.KafkaStreams] stream-client [usxapgutpd01-] State transition from RUNNING to ERROR
[2019-06-07T22:18:08,728Z {StreamThread-1} ERROR org.apache.kafka.streams.KafkaStreams] stream-client [usxapgutpd01-] All stream threads have died. The instance will be in error state and should be closed.
[2019-06-07T22:18:08,728Z {StreamThread-1} INFO  org.apache.kafka.streams.processor.internals.StreamThread] stream-thread [StreamThread-1] Shutdown complete

0 Answers