我有一个Kafka集群,其中有3个经纪人和3个动物园管理员。基于Kafka server.log,由于某些未知原因,它间歇性地遇到管道破裂错误。在断线错误消失之后,卡夫卡经纪人决定不加入集群,而不再加入集群,而成为自己的领导者(因为它将ISR从3缩减为1)。
到目前为止,唯一的解决方法是重新启动代理,它通常将作为跟随者重新加入集群。但是,每次出现类似问题时,我们都无法继续手动重启。
[2019-05-10 10:32:48,344] WARN Failed to send SSL Close message (org.apache.kafka.common.network.SslTransportLayer)
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:209)
at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:172)
at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:718)
at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:61)
at org.apache.kafka.common.network.Selector.doClose(Selector.java:746)
at org.apache.kafka.common.network.Selector.close(Selector.java:734)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
at org.apache.kafka.common.network.Selector.poll(Selector.java:424)
at kafka.network.Processor.poll(SocketServer.scala:628)
at kafka.network.Processor.run(SocketServer.scala:545)
at java.lang.Thread.run(Thread.java:745)
[2019-05-10 10:32:48,368] WARN Failed to send SSL Close message (org.apache.kafka.common.network.SslTransportLayer)
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:209)
at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:159)
at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:718)
at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:61)
at org.apache.kafka.common.network.Selector.doClose(Selector.java:746)
at org.apache.kafka.common.network.Selector.close(Selector.java:734)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
at org.apache.kafka.common.network.Selector.poll(Selector.java:424)
at kafka.network.Processor.poll(SocketServer.scala:628)
at kafka.network.Processor.run(SocketServer.scala:545)
at java.lang.Thread.run(Thread.java:745)
[2019-05-10 10:32:53,422] WARN Failed to send SSL Close message (org.apache.kafka.common.network.SslTransportLayer)
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:209)
at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:159)
at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:718)
at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:61)
at org.apache.kafka.common.network.Selector.doClose(Selector.java:746)
at org.apache.kafka.common.network.Selector.close(Selector.java:734)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
at org.apache.kafka.common.network.Selector.poll(Selector.java:424)
at kafka.network.Processor.poll(SocketServer.scala:628)
at kafka.network.Processor.run(SocketServer.scala:545)
at java.lang.Thread.run(Thread.java:745)
[2019-05-10 10:32:56,976] INFO [Partition CS_NL_CUSTOMER_ADD-1 broker=4] Shrinking ISR from 4,6 to 4 (kafka.cluster.Partition)
[2019-05-10 10:32:56,994] INFO [Partition CS_NL_CUSTOMER_ADD-1 broker=4] Cached zkVersion [24394] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2019-05-10 10:32:56,994] INFO [Partition _confluent-controlcenter-5-1-0-1-MetricsAggregateStore-changelog-3 broker=4] Shrinking ISR from 4,6 to 4 (kafka.cluster.Partition)
[2019-05-10 10:32:57,023] INFO [Partition _confluent-controlcenter-5-1-0-1-MetricsAggregateStore-changelog-3 broker=4] Cached zkVersion [3724] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2019-05-10 10:32:57,023] INFO [Partition TEST_3_PART-2 broker=4] Shrinking ISR from 4,6 to 4 (kafka.cluster.Partition)
[2019-05-10 10:32:57,033] INFO [Partition TEST_3_PART-2 broker=4] Cached zkVersion [3300] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
据我对Kafka的了解,Kafka经纪人是否应该尽可能地重新加入集群?管道破裂后为什么没有发生这种情况? 还有一件事,是什么原因导致断管错误?是网络问题吗?
答案 0 :(得分:0)
证书似乎有问题。您需要为经纪人签发签名证书,并将其添加到客户的密钥库中。
Confluent的有关如何配置Encryption and Authentication with SSL的指南应该使事情更加清楚。