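How to convert msgpack to JSON format

Time: 2019-07-01 15:26:39

Tags: python json apache-kafka kafka-consumer-api msgpack

I am using Kafka to stream messages that are encoded with msgpack, and I decode them with msgpack on the consumer side. However, I still cannot find a way to align or format the decoded messages so that they are more readable.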

import msgpack
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'frontier-done',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='smallest',
    value_deserializer=lambda x: msgpack.loads(x, encoding='utf-8'))

Output / message

[b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200, {b'scrapy_callback': None, b'scrapy_errback': None, b'scrapy_meta': {b'link_text': b'Short Message Service', b'download_timeout': 180.0, b'download_slot': b'en.wikipedia.org', b'download_latency': 0.04313206672668457, b'depth': 0}, b'origin_is_frontier': True, b'domain': {b'netloc': b'en.wikipedia.org', b'name': b'en.wikipedia.org', b'scheme': b'https', b'sld': b'', b'tld': b'', b'subdomain': b'', b'fingerprint': b'0acd465bbb0ec47c393eee1b4ae069f228dde142'}, b'fingerprint': b'7b2bc785328543b718bf06be33c59bbaa89a2793', b'state': 0, b'score': 1.0, b'jid': 0, b'encoding': b'utf-8'}, {b'Date': [b'Tue, 02 Jul 2019 08:07:17 GMT'], b'Content-Type': [b'text/html; charset=UTF-8'], b'Server': [b'mw1319.eqiad.wmnet'], b'X-Content-Type-Options': [b'nosniff'], b'P3P': [b'CP="This is not a P3P policy! See https://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."'], b'X-Powered-By': [b'HHVM/3.18.6-dev'], b'Content-Language': [b'en'], b'Last-Modified': [b'Mon, 01 Jul 2019 16:28:27 GMT'], b'Backend-Timing': [b'D=180346 t=1561998535634972'], b'Vary': [b'Accept-Encoding,Cookie,Authorization,X-Seven'], b'X-Varnish': [b'238585272 211131050, 146864479 137022708, 329891973 239570064, 789362513 563374386'], b'Via': [b'1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)'], b'Age': [b'56300'], b'X-Cache': [b'cp1075 hit/7, cp2019 hit/3, cp5007 hit/3, cp5008 hit/8'], b'X-Cache-Status': [b'hit-front'], b'Server-Timing': [b'cache;desc="hit-front"'], b'Strict-Transport-Security': [b'max-age=106384710; includeSubDomains; preload'], b'X-Analytics': [b'ns=0;page_id=28207;WMF-Last-Access=02-Jul-2019;WMF-Last-Access-Global=02-Jul-2019;https=1'], b'X-Client-Ip': [b'61.6.17.213'], b'Cache-Control': [b'private, s-maxage=0, max-age=0, must-revalidate'], b'Accept-Ranges': [b'bytes']}, None]]

So I think the best approach is to convert the message to JSON, since JSON can then be pretty-printed.

1 answer:

Answer 0: (score: 0)

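The output contains bytes values (the b'...' literals), which the json module cannot serialize. You need to decode them to str first.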
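To see why the decoding step matters, here is a minimal sketch using only the standard library. The `payload` dict is a made-up fragment in the same shape as the message above, not data from the question.

```python
import json

# Hypothetical fragment shaped like the decoded Kafka message:
# both keys and values are bytes.
payload = {b'name': b'en.wikipedia.org', b'depth': 0}

try:
    json.dumps(payload)
    error = None
except TypeError as exc:
    # json.dumps rejects bytes keys and bytes values with a TypeError.
    error = exc

print(type(error).__name__, error)
```

The error goes away once every bytes key and value has been turned into a str.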

Here is a working example based on https://stackoverflow.com/a/57014807/5312776.

import json

def decode_list(l):
    """Return a copy of the list with every bytes item decoded to str."""
    result = []
    for item in l:
        if isinstance(item, bytes):
            result.append(item.decode())
        elif isinstance(item, list):
            # Recurse into nested lists.
            result.append(decode_list(item))
        elif isinstance(item, dict):
            # Recurse into nested dicts.
            result.append(decode_dict(item))
        else:
            # Numbers, booleans, None and str pass through unchanged.
            result.append(item)
    return result

def decode_dict(d):
    """Return a copy of the dict with bytes keys and values decoded to str."""
    result = {}
    for key, value in d.items():
        if isinstance(key, bytes):
            key = key.decode()
        if isinstance(value, bytes):
            value = value.decode()
        elif isinstance(value, list):
            value = decode_list(value)
        elif isinstance(value, dict):
            value = decode_dict(value)
        result[key] = value
    return result

text = [b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200, {b'scrapy_callback': None, b'scrapy_errback': None, b'scrapy_meta': {b'link_text': b'Short Message Service', b'download_timeout': 180.0, b'download_slot': b'en.wikipedia.org', b'download_latency': 0.04313206672668457, b'depth': 0}, b'origin_is_frontier': True, b'domain': {b'netloc': b'en.wikipedia.org', b'name': b'en.wikipedia.org', b'scheme': b'https', b'sld': b'', b'tld': b'', b'subdomain': b'', b'fingerprint': b'0acd465bbb0ec47c393eee1b4ae069f228dde142'}, b'fingerprint': b'7b2bc785328543b718bf06be33c59bbaa89a2793', b'state': 0, b'score': 1.0, b'jid': 0, b'encoding': b'utf-8'}, {b'Date': [b'Tue, 02 Jul 2019 08:07:17 GMT'], b'Content-Type': [b'text/html; charset=UTF-8'], b'Server': [b'mw1319.eqiad.wmnet'], b'X-Content-Type-Options': [b'nosniff'], b'P3P': [b'CP="This is not a P3P policy! See https://en.wikipedia.org/wiki/Special:CentralAutoLogin/P3P for more info."'], b'X-Powered-By': [b'HHVM/3.18.6-dev'], b'Content-Language': [b'en'], b'Last-Modified': [b'Mon, 01 Jul 2019 16:28:27 GMT'], b'Backend-Timing': [b'D=180346 t=1561998535634972'], b'Vary': [b'Accept-Encoding,Cookie,Authorization,X-Seven'], b'X-Varnish': [b'238585272 211131050, 146864479 137022708, 329891973 239570064, 789362513 563374386'], b'Via': [b'1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)'], b'Age': [b'56300'], b'X-Cache': [b'cp1075 hit/7, cp2019 hit/3, cp5007 hit/3, cp5008 hit/8'], b'X-Cache-Status': [b'hit-front'], b'Server-Timing': [b'cache;desc="hit-front"'], b'Strict-Transport-Security': [b'max-age=106384710; includeSubDomains; preload'], b'X-Analytics': [b'ns=0;page_id=28207;WMF-Last-Access=02-Jul-2019;WMF-Last-Access-Global=02-Jul-2019;https=1'], b'X-Client-Ip': [b'61.6.17.213'], b'Cache-Control': [b'private, s-maxage=0, max-age=0, must-revalidate'], b'Accept-Ranges': [b'bytes']}, None]]

# Decode all bytes, then pretty-print the result as JSON.
print(json.dumps(decode_list(text), indent=4))
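The two helpers above can also be folded into a single recursive function. The following is only a sketch under a few assumptions: `to_jsonable` and `sample` are hypothetical names, the sample is a shortened version of the message, and `errors='replace'` is my own choice so that a payload that is not valid UTF-8 does not raise.

```python
import json

def to_jsonable(obj, encoding='utf-8'):
    """Recursively decode bytes so the result can be passed to json.dumps."""
    if isinstance(obj, bytes):
        # Assumption: replace undecodable bytes instead of raising.
        return obj.decode(encoding, errors='replace')
    if isinstance(obj, dict):
        # Decode both keys and values; json.dumps does not accept bytes keys.
        return {to_jsonable(k, encoding): to_jsonable(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item, encoding) for item in obj]
    # int, float, bool, None and str are already JSON-compatible.
    return obj

# Shortened, hypothetical sample in the same shape as the message above.
sample = [b'pc', [b'https://en.wikipedia.org/wiki/SMS', 200,
                  {b'scrapy_meta': {b'depth': 0, b'download_timeout': 180.0},
                   b'encoding': b'utf-8'},
                  None]]

pretty = json.dumps(to_jsonable(sample), indent=2, sort_keys=True)
print(pretty)
```

In the consumer loop this would be applied to each `message.value` before printing. Depending on how the producer packed the data, passing `raw=False` to `msgpack.unpackb` may already return str instead of bytes, in which case the extra decoding pass is not needed.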