How to extract and parse JSON from a log

Time: 2018-12-15 16:06:23

Tags: python json regex logging

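I'm new to automated testing. I have a problem: I want to pick the JSON-formatted information out of a log and then parse it in Python. The raw log looks like this: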

  

12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): X-Shard: loc=118.7234160,32.0320550
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): Host: stargate.ele.me
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): Connection: Keep-Alive
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): Accept-Encoding: gzip
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): User-Agent: okhttp/3.5.0
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): {
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "transactionId": "4ac50bcb358d376d4719a413b31c4786",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "commandType": "UNLOCK",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "deviceId": "CD1103929",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "token": "CD1103929",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "resultDetails": "成功",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "invokerType": "USEREND",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "logisticsOrderCategory": 0,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "logisticsOrderId": 0,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "commandAt": 1544759360619,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "invokerId": 96944200,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "deviceUnlockTime": 162,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   "logisticsOrderType": 0
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): }

I tried using a regular expression on the regex101 website: \{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|\{(?:[^\{\}]|w+)*\})*\})*\})*\})*\})*\})*\})*\})*\})*\}, and exported the result as CSV, but I got:

  

12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""transactionId"": ""4ac50bcb358d376d4719a413b31c4786"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""commandType"": ""UNLOCK"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""deviceId"": ""CD1103929"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""token"": ""CD1103929"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""resultDetails"": ""成功"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""invokerType"": ""USEREND"",
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""logisticsOrderCategory"": 0,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""logisticsOrderId"": 0,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""commandAt"": 1544759360619,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""invokerId"": 96944200,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""deviceUnlockTime"": 162,
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859):   ""logisticsOrderType"": 0
12-14 11:49:23.869 D/me.ele.minimart.http.interceptor.HttpLogger(859): }"

But what I really want is this:

  

{
    "transactionId": "4ac50bcb358d376d4719a413b31c4786",
    "commandType": "UNLOCK",
    "deviceId": "CD1103929",
    "token": "CD1103929",
    "resultDetails": "成功",
    "invokerType": "USEREND",
    "logisticsOrderCategory": 0,
    "logisticsOrderId": 0,
    "commandAt": 1544759360619,
    "invokerId": 96944200,
    "deviceUnlockTime": 162,
    "logisticsOrderType": 0
}

That is, with the useless log prefixes removed. So how can I get the result in JSON format? There may be some mistakes in my regular expression.

Thanks a lot!

1 answer:

Answer 0: (score: 1)

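I don't think you need a regular expression to extract the JSON fragment from the log. Reading the file line by line and collecting the lines between the opening and closing braces is enough.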

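import json

# Rebuild each multi-line JSON object from the log, then parse it.
with open('origin.log') as f:
    buf = ''  # text of the JSON object currently being collected
    for line in f:
        line = line.rstrip()
        if line.endswith('{'):
            # A log line ending in '{' opens a new object.
            buf = '{'
        elif buf:
            if line.endswith('}'):
                # A log line ending in '}' closes the object: parse it.
                buf += '\n}'
                obj = json.loads(buf)
                print(obj['transactionId'])
                buf = ''
            else:
                # Keep only the payload after the "HttpLogger(859):" prefix.
                buf += '\n' + line.split('):')[-1]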
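
The loop above treats any line ending in '{' or '}' as the start or end of an object, so it would likely stumble on nested objects. As a hedged alternative, here is a minimal sketch that strips the log prefix from every line and lets `json.JSONDecoder.raw_decode` find where each object ends. The `PREFIX` pattern and the name `extract_json_objects` are assumptions made for illustration; the pattern is inferred from the sample log in the question and may need adjusting for other log formats.

```python
import json
import re

# Assumed prefix shape, inferred from the sample log:
# "MM-DD HH:MM:SS.mmm D/<tag>(<pid>):" followed by the payload.
PREFIX = re.compile(
    r'^\d{1,2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\s+\w\s*/\s*[\w.$]+\(\s*\d+\):\s?'
)


def extract_json_objects(lines):
    """Return every JSON object found in the log payload, in order."""
    # Drop the prefix from each line so only the payload text remains.
    payload = '\n'.join(PREFIX.sub('', line.rstrip('\n')) for line in lines)
    decoder = json.JSONDecoder()
    objects = []
    pos = payload.find('{')
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # returns it together with the index where it ended.
            obj, end = decoder.raw_decode(payload, pos)
            objects.append(obj)
            pos = payload.find('{', end)
        except json.JSONDecodeError:
            # This brace does not start valid JSON; try the next one.
            pos = payload.find('{', pos + 1)
    return objects
```

Because the function accepts any iterable of lines, an open file object such as the one returned by `open('origin.log')` can be passed in directly, and each returned dictionary can then be indexed as before, for example with `obj['transactionId']`.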