Kafka Streams application cannot fetch partitions

Time: 2018-06-08 16:18:52

Tags: apache-kafka-streams

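I created a Kafka cluster with 3 brokers, set up as follows:

  1. Created 3 topics, each with replication factor 3 and 2 partitions.
  2. Created 2 producers, each writing to one of the topics.
  3. Created a Streams application that processes messages from the 2 topics and writes to the third topic.
  4. Everything ran fine until now, but when starting the Streams application I suddenly began getting the following warning: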

    [WARN ] 2018-06-08 21:16:49.188 [Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1] org.apache.kafka.clients.consumer.internals.Fetcher [Consumer clientId=Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1-restore-consumer, groupId=] Attempt to fetch offsets for partition Stream3-KSTREAM-OUTEROTHER-0000000005-store-changelog-0 failed due to: Disk error when trying to access log file on the disk.
    

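Because of this warning, the Streams application is not processing anything from the 2 topics.

I tried the following:

  1. Stopped all brokers, deleted each broker's kafka-logs directory, and restarted the brokers. This did not fix the problem.
  2. Stopped ZooKeeper and all brokers, deleted each broker's ZooKeeper logs as well as kafka-logs, restarted ZooKeeper and the brokers, and created the topics again. This did not fix the problem either.

I could not find anything related to this error in the official documentation or elsewhere on the web. Does anyone know why I am suddenly getting this error?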

Edit

Of the 3 brokers, 2 (broker 0 and broker 2) keep emitting these log entries:

Broker-0 log:

      [2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
      [2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
      

Broker-2 log:

      [2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
      [2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
      

Broker-1 shows the following logs:

      [2018-06-09 01:21:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
      [2018-06-09 01:31:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
      [2018-06-09 01:39:44,667] ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
      [2018-06-09 01:41:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
      

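I stopped ZooKeeper and the brokers again, deleted their logs, and restarted them. As soon as I re-created the topics, the logs above started appearing again.

Topic details:

      [zk: localhost:2181(CONNECTED) 3] get /brokers/topics/initial11_topic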
      {"version":1,"partitions":{"1":[1,0,2],"0":[0,2,1]}}
      cZxid = 0x53
      ctime = Sat Jun 09 01:25:42 EDT 2018
      mZxid = 0x53
      mtime = Sat Jun 09 01:25:42 EDT 2018
      pZxid = 0x54
      cversion = 1
      dataVersion = 0
      aclVersion = 0
      ephemeralOwner = 0x0
      dataLength = 52
      numChildren = 1
      [zk: localhost:2181(CONNECTED) 4] get /brokers/topics/initial12_topic
      {"version":1,"partitions":{"1":[2,1,0],"0":[1,0,2]}}
      cZxid = 0x61
      ctime = Sat Jun 09 01:25:47 EDT 2018
      mZxid = 0x61
      mtime = Sat Jun 09 01:25:47 EDT 2018
      pZxid = 0x62
      cversion = 1
      dataVersion = 0
      aclVersion = 0
      ephemeralOwner = 0x0
      dataLength = 52
      numChildren = 1
      [zk: localhost:2181(CONNECTED) 5] get /brokers/topics/final11_topic
      {"version":1,"partitions":{"1":[0,1,2],"0":[2,0,1]}}
      cZxid = 0x48
      ctime = Sat Jun 09 01:25:32 EDT 2018
      mZxid = 0x48
      mtime = Sat Jun 09 01:25:32 EDT 2018
      pZxid = 0x4a
      cversion = 1
      dataVersion = 0
      aclVersion = 0
      ephemeralOwner = 0x0
      dataLength = 52
      numChildren = 1
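
To relate the replica assignments above to the broker logs, here is a minimal Python sketch that lists the partitions whose first replica is broker 1. In Kafka the first replica in an assignment list is the preferred leader; the actual current leader is stored in each partition's state znode, which is not shown here, so this is only an approximation. The function name `preferred_leaders` is made up for illustration.

```python
import json

# Replica assignments copied from the ZooKeeper output above.
assignments = {
    "initial11_topic": '{"version":1,"partitions":{"1":[1,0,2],"0":[0,2,1]}}',
    "initial12_topic": '{"version":1,"partitions":{"1":[2,1,0],"0":[1,0,2]}}',
    "final11_topic": '{"version":1,"partitions":{"1":[0,1,2],"0":[2,0,1]}}',
}


def preferred_leaders(assignments):
    """Map 'topic-partition' to the first replica in its assignment list."""
    leaders = {}
    for topic, raw in assignments.items():
        partitions = json.loads(raw)["partitions"]
        for partition, replicas in sorted(partitions.items()):
            # Assumption: the first replica is the preferred leader,
            # which may differ from the current leader.
            leaders[f"{topic}-{partition}"] = replicas[0]
    return leaders


leaders = preferred_leaders(assignments)
on_broker_1 = sorted(tp for tp, broker in leaders.items() if broker == 1)
print(on_broker_1)
```

Under that assumption, the partitions that come out are initial11_topic-1 and initial12_topic-0, the same two partitions for which brokers 0 and 2 report NOT_LEADER_FOR_PARTITION with leaderId=1.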
      

Any clues?

1 answer:

Answer 0 (score: 1)

I found the problem. It was caused by the following incorrect setting in broker-1's server.properties:

advertised.listeners=PLAINTEXT://10.23.152.109:9094

The port in advertised.listeners had been changed by mistake to the same port as broker-2's advertised.listeners.
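
A misconfiguration like this can be caught by comparing the advertised.listeners value across all brokers before starting them. The following is a minimal Python sketch of such a check, not an official Kafka tool. Only broker-1's value comes from the answer above; the contents shown for broker-0 and broker-2 (including the assumption that broker-2 is on the same host) are hypothetical, and the helper names are made up for illustration.

```python
from collections import defaultdict


def parse_properties(text):
    """Parse the simple key=value lines of a server.properties file."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_duplicate_listeners(configs):
    """Return {endpoint: [broker names]} for endpoints advertised more than once."""
    owners = defaultdict(list)
    for name, text in configs.items():
        listeners = parse_properties(text).get("advertised.listeners", "")
        for endpoint in listeners.split(","):
            endpoint = endpoint.strip()
            if endpoint:
                owners[endpoint].append(name)
    return {ep: names for ep, names in owners.items() if len(names) > 1}


# Hypothetical file contents: only broker-1's advertised.listeners value is
# taken from the answer; the broker-0 and broker-2 values are assumed.
configs = {
    "broker-0": "broker.id=0\nadvertised.listeners=PLAINTEXT://10.23.152.109:9092\n",
    "broker-1": "broker.id=1\nadvertised.listeners=PLAINTEXT://10.23.152.109:9094\n",
    "broker-2": "broker.id=2\nadvertised.listeners=PLAINTEXT://10.23.152.109:9094\n",
}

duplicates = find_duplicate_listeners(configs)
for endpoint, brokers in duplicates.items():
    print(f"{endpoint} is advertised by {', '.join(brokers)}")
```

In practice the strings in `configs` would be read from each broker's real server.properties file. Any endpoint reported by the check means that clients and other brokers trying to reach one broker may end up connected to a different one.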