Kafka: Get broker hosts from ZooKeeper

Date: 2015-04-07 11:08:03

Tags: apache-zookeeper apache-kafka

For specific reasons I need to use both ConsumerGroup (a.k.a. the high-level consumer) and SimpleConsumer (a.k.a. the low-level consumer) to read from Kafka. For ConsumerGroup I use the ZooKeeper-based configuration and am completely happy with it, but SimpleConsumer requires seed brokers to be instantiated.

I don't want to maintain both lists - ZooKeeper hosts and broker hosts. So I'm looking for a way to automatically discover the brokers for a specific topic from ZooKeeper.

Based on some indirect information, I believe this data is stored in ZooKeeper under one of the following paths:

  • /brokers/topics/<topic>/partitions/<partition-id>/state
  • /brokers/ids/

However, when I try to read data from these nodes, I get a serialization error (I am using com.101tec.zkclient for this):

org.I0Itec.zkclient.exception.ZkMarshallingError: java.io.StreamCorruptedException: invalid stream header: 7B226A6D
    at org.I0Itec.zkclient.serialize.SerializableSerializer.deserialize(SerializableSerializer.java:37)
    at org.I0Itec.zkclient.ZkClient.derializable(ZkClient.java:740)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:773)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:750)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:744)
    ... 64 more
Caused by: java.io.StreamCorruptedException: invalid stream header: 7B226A6D
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at org.I0Itec.zkclient.serialize.TcclAwareObjectIputStream.<init>(TcclAwareObjectIputStream.java:30)
    at org.I0Itec.zkclient.serialize.SerializableSerializer.deserialize(SerializableSerializer.java:31)
    ... 69 more

I can write and read custom Java objects (e.g. Strings) without any problem, so I believe it's not an issue with the client, but rather some tricky encoding. Thus, I want to know:

  1. If this is the right approach, how do I read these nodes properly?
  2. If the whole approach is wrong, what is the right one?

5 Answers:

Answer 0 (score: 30)

This is the way one of my colleagues gets the list of Kafka brokers. I think it's the right approach when you want to fetch the broker list dynamically.

Here is sample code showing how to get the list.

import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public class KafkaBrokerInfoFetcher {

    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper and list the ids registered under /brokers/ids
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, null);
        List<String> ids = zk.getChildren("/brokers/ids", false);
        for (String id : ids) {
            // Each broker znode holds a JSON payload describing host, port, etc.
            String brokerInfo = new String(zk.getData("/brokers/ids/" + id, false, null));
            System.out.println(id + ": " + brokerInfo);
        }
    }
}

Running the code against a cluster consisting of three brokers produces:

1: {"jmx_port":-1,"timestamp":"1428512949385","host":"192.168.0.11","version":1,"port":9093}
2: {"jmx_port":-1,"timestamp":"1428512955512","host":"192.168.0.11","version":1,"port":9094}
3: {"jmx_port":-1,"timestamp":"1428512961043","host":"192.168.0.11","version":1,"port":9095}
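That payload is plain JSON, so it is easy to turn into a usable address. Below is a minimal sketch, not part of the original answer: it is written in Scala to match the next answer, the helper name is made up, and it reuses Kafka's own kafka.utils.Json parser.

import kafka.utils.Json
import org.apache.zookeeper.ZooKeeper

// Hypothetical helper: read one broker znode and extract (host, port) from its JSON payload
def brokerEndpoint(zk: ZooKeeper, brokerId: String): (String, Int) = {
  val json = new String(zk.getData("/brokers/ids/" + brokerId, false, null), "UTF-8")
  val info = Json.parseFull(json).get.asInstanceOf[Map[String, Any]]
  (info("host").asInstanceOf[String], info("port").asInstanceOf[Int])
}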

Answer 1 (score: 14)

It turns out that Kafka uses ZKStringSerializer to read and write data to znodes. So, to fix the error, I only had to add it as the last parameter in the ZkClient constructor:

val zkClient = new ZkClient(zkQuorum, Integer.MAX_VALUE, 10000, ZKStringSerializer)
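As a side note, the "invalid stream header: 7B226A6D" in the stack trace above is simply the first four bytes of the JSON payload (0x7B 0x22 0x6A 0x6D are the characters {"jm), which the default SerializableSerializer tries to interpret as a serialized Java object. With the string serializer in place, reading a broker znode returns that JSON as a plain string; a minimal sketch:

val brokerJson = zkClient.readData[String]("/brokers/ids/1")
// e.g. {"jmx_port":-1,"timestamp":"1428512949385","host":"192.168.0.11","version":1,"port":9093}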

Using it, I wrote several useful functions for discovering broker ids, their addresses and other things:

import kafka.utils.Json
import kafka.utils.ZKStringSerializer
import kafka.utils.ZkUtils
import org.I0Itec.zkclient.ZkClient
import org.apache.kafka.common.KafkaException

import scala.collection.JavaConversions._  // allows .toList on the java.util.List values returned by ZkClient


def listBrokers(): List[Int] = {
  zkClient.getChildren("/brokers/ids").toList.map(_.toInt)
}

def listTopics(): List[String] = {
  zkClient.getChildren("/brokers/topics").toList
}

def listPartitions(topic: String): List[Int] = {
  val path = "/brokers/topics/" + topic + "/partitions"
  if (zkClient.exists(path)) {
    zkClient.getChildren(path).toList.map(_.toInt)
  } else {
    throw new KafkaException(s"Topic ${topic} doesn't exist")
  }
}

def getBrokerAddress(brokerId: Int): (String, Int) = {
  val path = s"/brokers/ids/${brokerId}"
  if (zkClient.exists(path)) {
    // readZkData is a small helper (not shown) that reads the znode and parses its JSON into a Map
    val brokerInfo = readZkData(path)
    (brokerInfo.get("host").get.asInstanceOf[String], brokerInfo.get("port").get.asInstanceOf[Int])
  } else {
    throw new KafkaException(s"Broker with ID ${brokerId} doesn't exist")
  }
}

def getLeaderAddress(topic: String, partitionId: Int): (String, Int) = {
  val path = s"/brokers/topics/${topic}/partitions/${partitionId}/state"
  if (zkClient.exists(path)) {
    val leaderStr = zkClient.readData[String](path)
    val leaderId = Json.parseFull(leaderStr).get.asInstanceOf[Map[String, Any]].get("leader").get.asInstanceOf[Int]
    getBrokerAddress(leaderId)
  } else {
    throw new KafkaException(s"Topic (${topic}) or partition (${partitionId}) doesn't exist")
  }
}
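A minimal usage sketch tying this back to the original problem (assumptions: the zkClient and the functions above are in scope, the topic name is made up, and the kafka.consumer.SimpleConsumer constructor from the 0.8.x line is used):

import kafka.consumer.SimpleConsumer

val brokerIds = listBrokers()                            // e.g. List(1, 2, 3)
val (leaderHost, leaderPort) = getLeaderAddress("my-topic", 0)

// Seed the low-level consumer with the discovered partition leader instead of a hard-coded broker list
val consumer = new SimpleConsumer(leaderHost, leaderPort, 100000, 64 * 1024, "my-client")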

Answer 2 (score: 6)

To do this from the shell:

zookeeper-shell myzookeeper.example.com:2181
ls /brokers/ids
  => [2, 1, 0]
get /brokers/ids/2
get /brokers/ids/1
get /brokers/ids/0 

Answer 3 (score: 3)

Actually, there is ZkUtils within Kafka (at least for the 0.8.x line) that you can use, with one small caveat: you'll need to re-implement ZkStringSerializer, which converts strings into UTF-8 encoded byte arrays. If you'd like to use Java 8's streaming APIs, you can iterate over the Scala collections via scala.collection.JavaConversions. That is what helped in my case.
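As a rough illustration of that caveat, such a serializer only needs to implement the two methods of org.I0Itec.zkclient.serialize.ZkSerializer. A sketch (shown in Scala for brevity and with an illustrative name; a Java 8 version would implement the same two methods):

import org.I0Itec.zkclient.serialize.ZkSerializer

// Minimal stand-in for Kafka's ZKStringSerializer: strings in, UTF-8 encoded bytes out
object Utf8StringSerializer extends ZkSerializer {
  override def serialize(data: Object): Array[Byte] =
    data.asInstanceOf[String].getBytes("UTF-8")

  override def deserialize(bytes: Array[Byte]): Object =
    if (bytes == null) null else new String(bytes, "UTF-8")
}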
