我的kafka流应用程序抛出以下异常并死亡,原因是kafka集群用完了磁盘空间而死了。群集现在已经启动并正常运行,但是由于相同的原因,流应用仍然会死掉。
我们正在使用kafka-streams-2.0.1和2.1.1集群。
编辑1 :我在kafka中看到相应的changelog主题,其中包含状态存储备份,但是流应用的本地状态存储看起来是空的。
编辑3 :“一段时间”后,流应用成功启动。
我可以看到(集群日志):
INFO [GroupCoordinator 1]: Member arfc-app.article.fulfillment.snapshot-f2f474f5-2a40-4b86-b555-a4815433a6ec-StreamThread-392-consumer-e673ecac-b782-4597-af6d-d47cb190
aedb in group arfc-app.article.fulfillment.snapshot has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Group arfc-app.article.fulfillment.snapshot with generation 2080 is now empty (__consumer_offsets-20) (kafka.coordinator.group.GroupCoordina
tor)
有什么想法吗?从高级别的角度来看,我理解由于“某些原因”(与集群崩溃有关),流应用程序使用者无法成功地重新平衡并不断消亡,因此我正在重新启动它。但是为什么它最终成功了?它超时了吗?
很多!
流应用程序日志:
org.apache.kafka.streams.errors.StreamsException: stream-thread [arfc-app.article.fulfillment.snapshot-2f5894e5-ea94-40bd-b5e6-533a6f631368-StreamThread-44] Failed to rebalance.
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:870) ~[kafka-streams-2.0.1.jar!/:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:814) ~[kafka-streams-2.0.1.jar!/:na]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767) ~[kafka-streams-2.0.1.jar!/:na]
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736) ~[kafka-streams-2.0.1.jar!/:na]
Caused by: org.apache.kafka.common.KafkaException: Unexpected error in InitProducerIdResponse; The server experienced an unexpected error when processing the request
at org.apache.kafka.clients.producer.internals.TransactionManager$InitProducerIdHandler.handleResponse(TransactionManager.java:982) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:907) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:532) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:524) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-2.0.1.jar!/:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) ~[kafka-clients-2.0.1.jar!/:na]
at java.base/java.lang.Thread.run(Thread.java:844) ~[na:na]
编辑2:看起来相关的群集日志:
INFO [GroupCoordinator 1]: Preparing to rebalance group arfc-app.article.fulfillment.snapshot in state PreparingRebalance with old generation 1980 (__consumer_offsets-20) (reason: Adding new member arfc-app.article.fulfillment.snapshot-2c76bc1d-7084-4903-86cc-054cbbe92b7c-StreamThread-343-consumer-c59f7f13-b44e-446a-9af3-da3adbc91cc5) (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Stabilized group arfc-app.article.fulfillment.snapshot generation 1981 (__consumer_offsets-20) (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Assignment received from leader for group arfc-app.article.fulfillment.snapshot for generation 1981 (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Member arfc-app.article.fulfillment.snapshot-b621cd59-8eb4-4bec-bd83-68a1ffa6a126-StreamThread-316-consumer-ff4b1db4-de87-4978-94d4-c0bda96a1626 in group arfc-app.article.fulfillment.snapshot has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Preparing to rebalance group arfc-app.article.fulfillment.snapshot in state PreparingRebalance with old generation 1981 (__consumer_offsets-20) (reason: removing member arfc-app.article.fulfillment.snapshot-b621cd59-8eb4-4bec-bd83-68a1ffa6a126-StreamThread-316-consumer-ff4b1db4-de87-4978-94d4-c0bda96a1626 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 1]: Member arfc-app.article.fulfillment.snapshot-2c76bc1d-7084-4903-86cc-054cbbe92b7c-StreamThread-343-consumer-c59f7f13-b44e-446a-9af3-da3adbc91cc5 in group arfc-app.article.fulfillment.snapshot has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)