I have three servers: blade1 (192.168.112.31), blade2 (192.168.112.32) and blade3 (192.168.112.33).
On each server I have installed kafka_2.11-1.0.0. On blade3 (192.168.112.33:2181) I have also installed ZooKeeper.
I created a topic, repl3part5, with the following command:
bin/kafka-topics.sh --zookeeper 192.168.112.33:2181 --create --replication-factor 3 --partitions 5 --topic repl3part5
When I describe this topic, it looks like this:
[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181
Topic:repl3part5 PartitionCount:5 ReplicationFactor:3 Configs:
Topic: repl3part5 Partition: 0 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
Topic: repl3part5 Partition: 1 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: repl3part5 Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: repl3part5 Partition: 3 Leader: 2 Replicas: 2,1,3 Isr: 2,1,3
Topic: repl3part5 Partition: 4 Leader: 3 Replicas: 3,2,1 Isr: 3,2,1
I have one producer on this topic:
bin/kafka-console-producer.sh --broker-list 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5
and a single consumer:
bin/kafka-console-consumer.sh --bootstrap-server 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5 --consumer-property group.id=zoran_1
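Every message the producer sends is received by the consumer. So far, so good.
Now I want to test failover of the Kafka servers. If I stop the Kafka service on blade3, the consumer prints warnings, but all produced messages are still consumed.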
[2018-01-30 14:30:01,203] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,299] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,475] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
I then started the Kafka service on blade3 again and stopped the Kafka service on blade2. The consumer showed one warning but still consumed all produced messages.
[2018-01-30 14:31:38,164] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
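Next I started the Kafka service on blade2 again and stopped the Kafka service on blade1.
The consumer now shows warnings about node 1 and node 2147483646, and also "Asynchronous auto-commit of offsets ... failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null".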
[2018-01-30 14:33:16,393] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,469] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,557] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,986] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,991] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,493] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,495] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,002] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,003] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:18,611] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,932] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,933] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,977] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:19,978] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,979] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
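About the odd node id 2147483646 in the warnings above: as far as I can tell, the Java consumer opens a separate connection to its group coordinator and labels it Integer.MAX_VALUE minus the broker id. That is my assumption rather than something the log states, but if it holds, the arithmetic below suggests the coordinator for group zoran_1 is broker 1. The variable names are only illustrative.

```shell
# Assumption: coordinator connection id = Integer.MAX_VALUE - broker id.
int_max=2147483647          # Java Integer.MAX_VALUE
logged_node=2147483646      # node id taken from the consumer warnings

# Recover the broker id that the coordinator connection points at.
coordinator_broker=$(( int_max - logged_node ))
echo "$coordinator_broker"
```

If that reading is right, the warnings for node 1 and node 2147483646 both refer to the same stopped broker on blade1.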
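I tried to solve the problem by adding offsets.topic.replication.factor=2 (or 3) to all three server.properties files (one of them is here: https://pastebin.com/Japn0Grk), but without success. My idea was that the topic __consumer_offsets is not replicated across the cluster, but that does not seem to be the fix.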
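To check the replication idea directly, the internal topic can be described with the same kafka-topics.sh call used for repl3part5. Below is a minimal sketch of a filter for that output; the function name single_replica_partitions is my own, and it assumes the describe output has the same "Partition: N ... Replicas: a,b,c" layout shown earlier in this post.

```shell
# Reads `kafka-topics.sh --describe` output on stdin and prints the number of
# every partition whose Replicas list contains only one broker.
single_replica_partitions() {
  awk '{
    part = ""; reps = ""
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") part = $(i + 1)   # partition number
      if ($i == "Replicas:")  reps = $(i + 1)   # comma-separated broker ids
    }
    # No comma in the replica list means a single replica.
    if (part != "" && reps != "" && index(reps, ",") == 0) print part
  }'
}
```

Usage would be: bin/kafka-topics.sh --describe --topic __consumer_offsets --zookeeper 192.168.112.33:2181 | single_replica_partitions. Any partition it prints lives on one broker only, so stopping that broker would make it unavailable.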
While the Kafka service on blade1 is down, the topic description looks like this:
[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181
Topic:repl3part5 PartitionCount:5 ReplicationFactor:3 Configs:
Topic: repl3part5 Partition: 0 Leader: 3 Replicas: 2,3,1 Isr: 3
Topic: repl3part5 Partition: 1 Leader: 3 Replicas: 3,1,2 Isr: 3
Topic: repl3part5 Partition: 2 Leader: 3 Replicas: 1,2,3 Isr: 3
Topic: repl3part5 Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
Topic: repl3part5 Partition: 4 Leader: 3 Replicas: 3,2,1 Isr: 3
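The producer now shows the following warning. It still puts messages on the topic, but those messages only increase the lag on the partitions: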
[2018-01-30 14:37:21,816] WARN [Producer clientId=console-producer] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
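I noticed that as long as the Kafka service on blade1 is alive, I can stop blade2 and blade3 in any combination and the consumer can always consume messages. If the Kafka service on blade1 is down, the consumer cannot consume messages, even though the Kafka services on blade2 and blade3 are up and running.
After I start the Kafka service on blade1 again, all messages the producer sent while it was down are replayed, and the consumer terminal shows the following: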
[2018-01-30 14:44:30,817] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 20: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:30,817] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=22, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 22: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=24, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
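From then on everything works without errors or warnings, and the system is fully functional.
Can someone explain to me why the Kafka server on blade1 is so important, and what options I have so that I can stop any two of the three servers (including the Kafka server on blade1) and still consume messages without delay? This is driving me crazy.
Can you help?
Best regards.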