MQTT Kafka source connector: strange byte characters

Time: 2018-12-18 09:13:37

Tags: apache-kafka mqtt apache-kafka-connect mosquitto

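I am following https://github.com/kaiwaehner/kafka-connect-iot-mqtt-connector-example to connect Mosquitto and Kafka through the MQTT source connector. The data sent by the Mosquitto publisher reaches both the Mosquitto subscriber and the Kafka consumer. However, the key and value fields of the ConsumerRecord objects in my Kafka consumer have some extra leading bytes. Below are the code snippets and the output I get.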

mqttPublisher.py

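import datetime
import json
import time

# `client` is assumed to be an already-connected paho.mqtt.client.Client;
# its setup was not shown in the original snippet.
v3 = 0  # the output below starts at "val": 0
while v3 < 3:
    data3 = {
        "time": str(datetime.datetime.now().time()),
        "val": v3,
    }
    # Publish the reading as a JSON string on the sensor/dist topic.
    client.publish("sensor/dist", json.dumps(data3), qos=2)

    v3 += 1
    time.sleep(2)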

mqttSubscriber.py

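import paho.mqtt.subscribe as subscribe

def on_message_print(client, userdata, message):
    # Print the topic and the raw payload bytes of each message.
    print(message.topic, message.payload)

# Blocks and calls the callback for every message under sensor/.
subscribe.callback(on_message_print, "sensor/#", hostname="localhost")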

kafkaConsumer.py

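from kafka import KafkaConsumer

# Subscribe to the 'mqtt.' topic written by the MQTT source connector.
consumer = KafkaConsumer('mqtt.',
                         bootstrap_servers=['localhost:9092'])

for message in consumer:
    print(message)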

Output: mqttSubscriber.py

sensor/dist b'{"time": "12:44:30.817462", "val": 0}'

sensor/dist b'{"time": "12:44:32.820040", "val": 1}'

sensor/dist b'{"time": "12:44:34.822657", "val": 2}'

Output: kafkaConsumer.py

ConsumerRecord(topic='mqtt.', partition=0, offset=225, timestamp=1545117270870, timestamp_type=0, key=b'\x00\x00\x00\x00\x01\x16sensor/dist', value=b'\x00\x00\x00\x00\x02J{"time": "12:44:30.817462", "val": 0}', headers=[('mqtt.message.id', b'0'), ('mqtt.qos', b'0'), ('mqtt.retained', b'false'), ('mqtt.duplicate', b'false')], checksum=None, serialized_key_size=17, serialized_value_size=43, serialized_header_size=62)

ConsumerRecord(topic='mqtt.', partition=0, offset=226, timestamp=1545117272821, timestamp_type=0, key=b'\x00\x00\x00\x00\x01\x16sensor/dist', value=b'\x00\x00\x00\x00\x02J{"time": "12:44:32.820040", "val": 1}', headers=[('mqtt.message.id', b'0'), ('mqtt.qos', b'0'), ('mqtt.retained', b'false'), ('mqtt.duplicate', b'false')], checksum=None, serialized_key_size=17, serialized_value_size=43, serialized_header_size=62)

ConsumerRecord(topic='mqtt.', partition=0, offset=227, timestamp=1545117274824, timestamp_type=0, key=b'\x00\x00\x00\x00\x01\x16sensor/dist', value=b'\x00\x00\x00\x00\x02J{"time": "12:44:34.822657", "val": 2}', headers=[('mqtt.message.id', b'0'), ('mqtt.qos', b'0'), ('mqtt.retained', b'false'), ('mqtt.duplicate', b'false')], checksum=None, serialized_key_size=17, serialized_value_size=43, serialized_header_size=62)

What causes the extra leading bytes in the Kafka consumer output above? Thanks in advance.

1 answer:

Answer 0: (score: 0)

As part of the demo, you are starting the Schema Registry. The demo includes this step:

Start Kafka Connect and dependencies (Kafka, Zookeeper, Schema Registry):

confluent start connect

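If you look at the first 5 bytes of each key and value, you will see that they start with a 0 byte, followed by four more bytes that represent an integer (1 for the key and 2 for the value in your output).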

See the Schema Registry Wire Format documentation, then try running curl localhost:8081/subjects and check whether it lists subject names such as mqtt-key and mqtt-value.
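To make the five-byte prefix concrete, here is a minimal sketch that unpacks it from the key and value bytes shown in the question. It assumes the Confluent wire format (one zero "magic" byte followed by a four-byte big-endian schema ID); the helper name read_wire_header is made up for illustration.

```python
import struct

def read_wire_header(raw):
    """Return (magic_byte, schema_id) from the first five bytes of a message."""
    # ">" = big-endian with no padding, "b" = 1 signed byte, "I" = 4-byte unsigned int.
    magic_byte, schema_id = struct.unpack(">bI", raw[:5])
    return magic_byte, schema_id

# Key and value bytes copied from the first ConsumerRecord in the question.
key = b"\x00\x00\x00\x00\x01\x16sensor/dist"
value = b'\x00\x00\x00\x00\x02J{"time": "12:44:30.817462", "val": 0}'

key_header = read_wire_header(key)
value_header = read_wire_header(value)
print(key_header, value_header)
```

If the assumption holds, the two schema IDs printed here should correspond to schemas registered in the Schema Registry.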

If you do not want to use Avro, you need to edit your Kafka Connect properties file to use a different Converter, and start only Kafka and Zookeeper yourself instead of using confluent start.

Alternatively, if you want Python to deserialize the Avro, you can refer to the confluent-kafka-python repository on GitHub.
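For a quick look at the data without an Avro library, the framing can also be stripped by hand. The sketch below is not the confluent-kafka-python API; it is a hand-rolled decoder that assumes the registered schemas are plain Avro string or bytes, which Avro encodes as a zigzag varint length followed by the raw data. The sixth byte in the question's output (\x16 for the 11-character key, J for the 37-byte value) is consistent with that, but it does not prove it. The function names are hypothetical.

```python
import json

def read_avro_long(buf, pos):
    """Decode one Avro long (zigzag varint) starting at pos; return (value, next_pos)."""
    shift = 0
    accum = 0
    while True:
        byte = buf[pos]
        pos += 1
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            break
        shift += 7
    # Undo zigzag encoding to recover the signed value.
    return (accum >> 1) ^ -(accum & 1), pos

def strip_confluent_avro_bytes(raw):
    """Drop the 5-byte header and the Avro length prefix; return the inner bytes.

    Assumption: the schema is a single Avro string or bytes value.
    """
    if raw[0] != 0:
        raise ValueError("not Confluent wire format: magic byte is not 0")
    length, start = read_avro_long(raw, 5)
    return raw[start:start + length]

# Bytes copied from the first ConsumerRecord in the question.
key = b"\x00\x00\x00\x00\x01\x16sensor/dist"
value = b'\x00\x00\x00\x00\x02J{"time": "12:44:30.817462", "val": 0}'

topic_name = strip_confluent_avro_bytes(key).decode("utf-8")
record = json.loads(strip_confluent_avro_bytes(value))
print(topic_name, record)
```

A function like strip_confluent_avro_bytes could be passed to kafka-python's KafkaConsumer through its key_deserializer and value_deserializer arguments. For anything beyond a string or bytes schema, a real Avro deserializer is the safer choice.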