Kafka high availability is not working

Asked: 2017-06-23 09:53:44

Tags: apache-kafka

I am following the quickstart in the Kafka documentation at https://kafka.apache.org/quickstart. I have deployed 3 brokers and created a topic.

➜  kafka_2.10-0.10.1.0 bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
    Topic: my-replicated-topic  Partition: 0    Leader: 2   Replicas: 2,0,1 Isr: 2,1,0

Then I used "bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic" to test the producer, and "bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic" to test the consumer. Both worked fine. If I kill server 1 or server 2, the producer and consumer keep working normally.

But if I kill server 0 and then type messages in the producer terminal, the consumer cannot read the new messages. While server 0 is down, the consumer prints logs like these:

[2017-06-23 17:29:52,750] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:52,974] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,085] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,195] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,302] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,409] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

Then I restarted server 0, and the consumer printed the messages along with some more warnings:

hhhh
hello
[2017-06-23 17:32:32,795] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:32:32,902] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

This confuses me. Why is server 0 so special? Server 0 is not even the leader of my topic's partition.

I also noticed that server 0's log contains many entries like the following:

[2017-06-23 17:32:33,640] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,23] in 38 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,641] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,26] (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,26] in 4 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,29] (kafka.coordinator.GroupMetadataManager)

But the logs of server 1 and server 2 contain nothing like this.

Could someone explain this to me? Thanks a lot!

Solved: the replication factor of the __consumer_offsets topic was the root cause. This is a known issue: issues.apache.org/jira/browse/KAFKA-3959
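For anyone hitting the same thing on a fresh cluster: the internal offsets topic takes its replication factor from a broker setting when it is first auto-created, so a minimal prevention (assuming all three brokers are up before the first consumer connects) is:

# config/server.properties on every broker, set before __consumer_offsets is first auto-created
offsets.topic.replication.factor=3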

3 Answers:

Answer 0 (score: 0)

Brokers share the load of managing consumer groups.

Normally, each standalone consumer has a unique consumer group ID; you use the same group ID only when you want to split the consumption work across several consumers.

That being said: the leader broker of a Kafka cluster is only there to coordinate the other brokers. The leader has nothing (directly) to do with the broker that currently manages your group ID and handles offset commits for a particular consumer.

So whenever you subscribe, you are assigned a broker that will handle the offset commits for your group, and that assignment has nothing to do with leader election.

Shut that broker down and you may run into group consumption problems until the Kafka cluster stabilizes again (your consumers get reassigned so that group management moves to another broker, or the node becomes responsive again... I am not expert enough to tell you exactly how that failover happens).
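As a rough illustration (not from the original answer): the broker that manages a group, the group coordinator, is the leader of one partition of the internal __consumer_offsets topic, picked by hashing the group ID. Listing that topic's partition leaders shows which brokers can coordinate groups:

# The coordinator for a group is the leader of the __consumer_offsets partition
# numbered roughly abs(hash(group.id)) % <partition count of __consumer_offsets>.
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic __consumer_offsets
# If every partition shows "Leader: 0" (as in this question), broker 0 coordinates
# every group, which is why killing it stalls consumption.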

Answer 1 (score: 0)

kafka-console-producer defaults to acks=1, so it is not fault-tolerant at all. Add the flag or config parameter that sets acks=all, and your test will work, provided that both your topic and the __consumer_offsets topic were created with a replication factor of 3.
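For example (a sketch; the exact flag depends on the Kafka version, but --producer-property is available on recent console producers):

# Make the console producer wait for acknowledgement from all in-sync replicas
bin/kafka-console-producer.sh --broker-list localhost:9092 --producer-property acks=all --topic my-replicated-topic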

Answer 2 (score: 0)

The topic __consumer_offsets probably has its "Replicas" set to broker 0 only. To confirm this, describe the __consumer_offsets topic:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets

Topic: __consumer_offsets   PartitionCount: 50  ReplicationFactor: 1    Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets   Partition: 0    Leader: 0   Replicas: 0 Isr: 0
Topic: __consumer_offsets   Partition: 1    Leader: 0   Replicas: 0 Isr: 0
Topic: __consumer_offsets   Partition: 2    Leader: 0   Replicas: 0 Isr: 0
Topic: __consumer_offsets   Partition: 3    Leader: 0   Replicas: 0 Isr: 0
Topic: __consumer_offsets   Partition: 4    Leader: 0   Replicas: 0 Isr: 0
...
Topic: __consumer_offsets   Partition: 49   Leader: 0   Replicas: 0 Isr: 0

Note the "Replicas: 0 Isr: 0". That is why the consumer stops receiving messages when you stop broker 0.

To fix this, you need to change the "Replicas" of the topic __consumer_offsets so that they include the other brokers.

  1. Create a JSON file like the following (config/inc-replication-factor-consumer_offsets.json); a sketch for generating it follows the listing:
{"version":1,
 "partitions":[
   {"topic":"__consumer_offsets", "partition":0,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":1,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":2,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":3,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":4,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":5,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":6,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":7,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":8,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":9,  "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":10, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":11, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":12, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":13, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":14, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":15, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":16, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":17, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":18, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":19, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":20, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":21, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":22, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":23, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":24, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":25, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":26, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":27, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":28, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":29, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":30, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":31, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":32, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":33, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":34, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":35, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":36, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":37, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":38, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":39, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":40, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":41, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":42, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":43, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":44, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":45, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":46, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":47, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":48, "replicas":[0, 1, 2]},
   {"topic":"__consumer_offsets", "partition":49, "replicas":[0, 1, 2]}
 ]
}
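Rather than typing all 50 entries by hand, a one-liner like this can generate the file (a sketch, assuming Python is installed; 50 is the default partition count of __consumer_offsets):

python3 -c 'import json; print(json.dumps({"version": 1, "partitions": [{"topic": "__consumer_offsets", "partition": p, "replicas": [0, 1, 2]} for p in range(50)]}, indent=1))' > config/inc-replication-factor-consumer_offsets.json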
  2. Execute the following command:

kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --zookeeper localhost:2181 --reassignment-json-file ../config/inc-replication-factor-consumer_offsets.json --execute
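Optionally (not in the original answer), the same tool can report the status of the reassignment via its standard --verify flag, with the same JSON file (depending on the version, the connection flag is --zookeeper or --bootstrap-server):

kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file ../config/inc-replication-factor-consumer_offsets.json --verify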

  3. Confirm the "Replicas":

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets

Topic: __consumer_offsets   PartitionCount: 50  ReplicationFactor: 3    Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets   Partition: 0    Leader: 0   Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets   Partition: 1    Leader: 0   Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets   Partition: 2    Leader: 0   Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets   Partition: 3    Leader: 0   Replicas: 0,1,2 Isr: 0,2,1
...
Topic: __consumer_offsets   Partition: 49   Leader: 0   Replicas: 0,1,2 Isr: 0,2,1
  4. Now you can stop only broker 0, produce some messages, and watch the results on the consumer side.
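A sketch of that re-test, assuming the quickstart port layout (broker 0 on 9092, broker 1 on 9093); point the clients at a surviving broker:

# Stop only broker 0's process (note: bin/kafka-server-stop.sh would stop every broker on this host)
bin/kafka-console-producer.sh --broker-list localhost:9093 --topic my-replicated-topic
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --from-beginning --topic my-replicated-topic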