Consumer in a group cannot consume messages when the Kafka service on a specific node of the test cluster is down

Asked: 2018-01-30 14:38:32

Tags: apache-kafka apache-zookeeper kafka-consumer-api kafka-producer-api

I have three servers:

blade1 (192.168.112.31),
blade2 (192.168.112.32) and
blade3 (192.168.112.33).

kafka_2.11-1.0.0 is installed on each server. ZooKeeper is also installed on blade3 (192.168.112.33:2181).

I created a topic repl3part5 with the following command:

bin/kafka-topics.sh --zookeeper 192.168.112.33:2181 --create --replication-factor 3 --partitions 5 --topic repl3part5

When I describe the topic, it looks like this:

[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181

Topic:repl3part5    PartitionCount:5    ReplicationFactor:3 Configs:
    Topic: repl3part5    Partition: 0    Leader: 2    Replicas: 2,3,1    Isr: 2,3,1
    Topic: repl3part5    Partition: 1    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
    Topic: repl3part5    Partition: 2    Leader: 1    Replicas: 1,2,3    Isr: 1,2,3
    Topic: repl3part5    Partition: 3    Leader: 2    Replicas: 2,1,3    Isr: 2,1,3
    Topic: repl3part5    Partition: 4    Leader: 3    Replicas: 3,2,1    Isr: 3,2,1

I have one producer for this topic:

bin/kafka-console-producer.sh --broker-list 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5

and a single consumer:

bin/kafka-console-consumer.sh --bootstrap-server 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5  --consumer-property group.id=zoran_1

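Every message the producer sends is received by the consumer. So far, so good.

Now I want to test failover of the Kafka servers. If I stop the Kafka service on blade3, the consumer prints warnings, but all produced messages are still consumed.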

[2018-01-30 14:30:01,203] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,299] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,475] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

Next I started the Kafka service on blade3 again and stopped the Kafka service on blade2. The consumer showed one warning but still consumed all produced messages.

[2018-01-30 14:31:38,164] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

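Then I started the Kafka service on blade2 again and stopped the Kafka service on blade1.

The consumer now shows warnings about node 1 and node 2147483646, and also: Asynchronous auto-commit of offsets ... failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null.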

[2018-01-30 14:33:16,393] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,469] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,557] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,986] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,991] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,493] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,495] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,002] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,003] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:18,611] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,932] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,933] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,977] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:19,978] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,979] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
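A note on the odd node id in the log above: as far as I understand, the Java client does not treat 2147483646 as a real broker id. It appears to register its connection to the group coordinator under a synthetic id computed as Integer.MAX_VALUE minus the broker id. Under that assumption, the small sketch below (the helper name is hypothetical, not part of any Kafka API) maps the id back to a broker:

```python
# Assumption: the Kafka Java client labels its group coordinator connection
# with Integer.MAX_VALUE - brokerId. This helper only reverses that arithmetic.
JAVA_INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def coordinator_broker_id(node_id: int) -> int:
    """Map a synthetic coordinator node id back to the broker id."""
    return JAVA_INT_MAX - node_id


print(coordinator_broker_id(2147483646))  # -> 1
```

If that reading is right, the warnings about node 2147483646 mean the consumer is trying to reach the group coordinator for zoran_1 on broker 1 (blade1), which is the broker I had just stopped.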

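I tried to solve the problem by adding offsets.topic.replication.factor=2 (or 3) to all three server.properties files (one of them is here: https://pastebin.com/Japn0Grk), without success. My idea was that the topic __consumer_offsets was not replicated across the cluster, but that does not seem to be the cause.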
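One way to check that idea is to work out which partition of __consumer_offsets holds the offsets for group zoran_1 and then look at that partition's replicas. To my understanding, the broker picks the partition as abs(groupId.hashCode()) modulo the number of offsets partitions. The sketch below emulates that in Python. It assumes the default of 50 partitions (offsets.topic.num.partitions) and an ASCII group id, and the function names are my own:

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode() as a signed 32-bit int (BMP characters only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep the running value within 32 bits
    return h - 2**32 if h >= 2**31 else h    # reinterpret as a signed int


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Guess the __consumer_offsets partition for a group (assumed formula)."""
    h = java_string_hash(group_id)
    # Assumption: Kafka's abs helper maps Integer.MIN_VALUE to 0.
    positive = 0 if h == -2**31 else abs(h)
    return positive % num_partitions


print(offsets_partition_for("zoran_1"))
```

The printed partition can then be compared with the output of bin/kafka-topics.sh --describe --topic __consumer_offsets --zookeeper 192.168.112.33:2181. If its Replicas list contains only broker 1, that would explain the behaviour. I also suspect offsets.topic.replication.factor only takes effect when __consumer_offsets is first created, so changing it afterwards may not alter an existing topic.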

While the Kafka service on blade1 is down, describing the topic shows:

[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181

Topic:repl3part5    PartitionCount:5    ReplicationFactor:3 Configs:
    Topic: repl3part5    Partition: 0    Leader: 3    Replicas: 2,3,1    Isr: 3
    Topic: repl3part5    Partition: 1    Leader: 3    Replicas: 3,1,2    Isr: 3
    Topic: repl3part5    Partition: 2    Leader: 3    Replicas: 1,2,3    Isr: 3
    Topic: repl3part5    Partition: 3    Leader: 3    Replicas: 2,1,3    Isr: 3
    Topic: repl3part5    Partition: 4    Leader: 3    Replicas: 3,2,1    Isr: 3
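Note that in this output the Isr column lists only broker 3, although broker 2 had been started again by then. To spot this kind of shrunken ISR quickly, the describe output can be parsed with a small script. This is only a sketch: the regular expression is my guess at the line format shown above, and the function name is hypothetical.

```python
import re

# Matches the per-partition lines printed by kafka-topics.sh --describe.
LINE_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def shrunken_partitions(describe_output: str):
    """Return (partition, leader, missing_replicas) for partitions whose ISR lacks a replica."""
    result = []
    for line in describe_output.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip the topic summary line and blank lines
        partition, leader = int(m.group(1)), int(m.group(2))
        replicas = [int(x) for x in m.group(3).split(",")]
        isr = [int(x) for x in m.group(4).split(",") if x]
        missing = [r for r in replicas if r not in isr]
        if missing:
            result.append((partition, leader, missing))
    return result


# One line from the output with blade1 down, one from the healthy cluster.
sample = """\
    Topic: repl3part5    Partition: 0    Leader: 3    Replicas: 2,3,1    Isr: 3
    Topic: repl3part5    Partition: 2    Leader: 1    Replicas: 1,2,3    Isr: 1,2,3
"""
for row in shrunken_partitions(sample):
    print(row)  # -> (0, 3, [2, 1])
```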

The producer now shows the following warning. It still puts messages on the topic, but they only increase the lag count on the partitions:

[2018-01-30 14:37:21,816] WARN [Producer clientId=console-producer] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

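I noticed that as long as the Kafka service on blade1 is alive, I can stop blade2 and blade3 in any combination and the consumer can always consume messages. If the Kafka service on blade1 is down, the consumer cannot consume messages even though the Kafka services on blade2 and blade3 are up and running.

After starting the Kafka service on blade1 again, all messages the producer sent while it was down are delivered, and the consumer terminal shows the following: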

[2018-01-30 14:44:30,817] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 20: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:30,817] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=22, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 22: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=24, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

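From that point on everything works without problems or warnings, and the system is fully functional.

Can someone explain to me why the Kafka server on blade1 is so important, and what my options are to be able to stop any of the servers (including the Kafka server on blade1) and still consume messages without delay? This is driving me crazy.

Can you help?

Regards.

0 answers:

No answers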