Python Kafka producer and consumer in a Cloudera cluster

Time: 2017-09-01 13:37:48

Tags: python apache-kafka cloudera kafka-producer-api

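I have a Cloudera cluster with 3 brokers installed on 3 different machines. I am developing from a fourth machine inside the cluster.

I created the topic as follows:

/usr/bin/kafka-topics --zookeeper host:2181,host2:2181,hosts3:2181/kafka --create --partitions 10 --replication-factor 2 --topic topicname

My root directory in ZooKeeper is not the root, but /kafka.

Here is my producer code: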

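from kafka import KafkaProducer
from kafka.errors import KafkaError

# Imageworker and Logger are my own helper classes (not shown here).

class Kafkaproducer(object):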
    def __init__(self, **kwargs):
        if kwargs:
            try:
                self.producer = KafkaProducer(**kwargs)
            except Exception as ex:
                print "unable to create Producer Object " + str(ex)
            self.iw = Imageworker()
            log = Logger()
            self.logs = log.logger('Producer')


    def set_topic(self, topic):
        """
        Set Topic for Producer
        :param self:
        :param topic: Topic String for Kafka
        :return: no value
        """
        self.topic = topic
        print self.producer.partitions_for(topic )


    def send_message(self, file):
        """
        send a single message to kafka broker
        :param self:
        :param file: absolute filepath from file to send to broker
        :return: no value
        """
        print self.topic
        try:
            print "create json message .. "
            message = self.iw.read_image_file(file)
        except Exception as ex:
            print "unable to read file" + str(ex)
        try:
            print "send message"+ self.iw.get_imagename(file)
            self.producer.send(self.topic, message)
        except Exception as ex:
            print "unable to send kafka message " + str(ex)

    def _handle_fetch_response(self):
        print "error"

    def send_message_synchron(self, file ):
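        """
        send a single message to kafka broker and wait for the result
        :param file: absolute filepath from file to send to broker
        :return: no value
        """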
        try:
            print "create json message .. "
            message = self.iw.read_image_file(file)

        except Exception as ex:
            print "unable to read file" + str(ex)
        try:
            #print "send message "+ self.iw.get_imagename(file)
            future = self.producer.send(self.topic, message)
            future.error_on_callbacks=True
            #result = future.get(timeout=1000)
            result = future.succeeded()

            print future.is_done
            if result:
                print future.value
                print result
                print "success!!!"
                meta = future.get(timeout=100)
        except Exception as ex:
            print "unable to send kafka message " + str(ex)
        try:
            if future.is_done:
                print "Message send successful "
        except KafkaError:
            self.logs.exception("Error in Kafka")
            print "Error in Kafka"


    def flush_producer(self):
        self.producer.flush()

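I am able to send messages asynchronously with the send_message function. I also get the number of partitions from the topic in use. The problem is that the messages disappear.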

I have double-checked this with my Python consumer and with the following command:

/opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list myhosts --topic topic_name

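In addition, I want to send messages with the synchronous function so that I can get the result of the future. Here, I cannot get the future's result. The line result = future.get(timeout=1000) fails.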
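For reference, a minimal sketch of a blocking send with kafka-python, assuming `producer` is a `kafka.KafkaProducer` (or any object with the same `send` method). `send()` returns a future immediately, so checking `future.succeeded()` right after the call will normally report False; `future.get(timeout=...)` blocks until the broker answers, and its timeout is given in seconds. The helper name `send_sync` and the default timeout are illustrative and not part of the code above.

```python
def send_sync(producer, topic, value, timeout_s=10):
    """Send one message and block until the broker acknowledges it.

    Returns the record metadata on success, or None if the send failed.
    `producer` is assumed to behave like kafka.KafkaProducer.
    """
    future = producer.send(topic, value)
    try:
        # get() blocks until the result arrives or timeout_s seconds pass;
        # on failure it raises the send error (a KafkaError subclass
        # in kafka-python).
        return future.get(timeout=timeout_s)
    except Exception as ex:
        print("unable to send kafka message: " + repr(ex))
        return None


# Hypothetical usage against a real cluster (not executed here):
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers=['h1:9092'],
#                            value_serializer=str.encode)
#   meta = send_sync(producer, 'topicname', 'hello')
#   if meta is not None:
#       print(meta.topic, meta.partition, meta.offset)
```

Printing `repr(ex)` rather than `str(ex)` keeps the exception type visible, which helps when the message text is empty.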

I hope someone has an idea about this. Many thanks,

JORN

1 answer:

Answer 0: (score: 0)

I found the problem, but I do not know how to fix it. I read the producer configuration from a properties file:

bootstrap_servers=['h1:9092' ,'h2:9092','h3:9092']
api_version=(0,10)
value_serializer=str.encode
buffer_memory=200000000
retries=5
max_block_ms=10000

producer = Kafkaproducer(**dic)  # does not work
producer = Kafkaproducer(bootstrap_servers=['h1:9092' ,'h2:9092','h3:9092'],api_version=(0,10)...   # works well

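On the consumer side, I can use consumer = Kafkaconsumer(**dic)

After fixing the producer call, the synchronous error also disappeared. But why can't I call the producer with a dictionary?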

-> {'retries': 5, 'max_block_ms': 10000, 'buffer_memory': 200000000, 'bootstrap_servers': ['h1:9092', 'h2:9092', 'h3:9092'], 'value_serializer': 'str.encode', 'api_version': (0, 10)}
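One thing visible in the printed dictionary: value_serializer is the string 'str.encode', whereas the working call passes the function str.encode itself. KafkaProducer expects a callable for this option, so this may be the difference between the two calls. Below is a minimal sketch of turning properties-file text into typed keyword arguments, assuming a simple key=value file; the helper `load_producer_config` and the `SERIALIZERS` lookup table are hypothetical names introduced for illustration.

```python
import ast

# Hypothetical lookup table: names allowed in the file -> real callables.
SERIALIZERS = {"str.encode": str.encode}


def load_producer_config(text):
    """Parse key=value lines into keyword arguments for a producer."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, raw = line.partition("=")
        key, raw = key.strip(), raw.strip()
        if raw in SERIALIZERS:
            # Replace the serializer's name with the function itself.
            config[key] = SERIALIZERS[raw]
        else:
            # literal_eval turns "5", "(0,10)" and "['h1:9092']" into an
            # int, a tuple and a list without executing arbitrary code.
            config[key] = ast.literal_eval(raw)
    return config


properties = """
bootstrap_servers=['h1:9092' ,'h2:9092','h3:9092']
api_version=(0,10)
value_serializer=str.encode
buffer_memory=200000000
retries=5
max_block_ms=10000
"""

dic = load_producer_config(properties)
print(callable(dic["value_serializer"]))  # True: a function, not a string
```

A dictionary built this way can be passed on as Kafkaproducer(**dic); whether that resolves the issue in this particular setup is untested.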

Thanks