How do I parse the JSON output of a running command?

Time: 2018-03-08 16:17:18

Tags: python json python-3.x parsing stream

Summary: I would like to parse the JSON output of tshark while it is being produced.

So far I have been parsing the normal output line by line, with each line carrying complete information. That boils down to:

p = subprocess.Popen("/usr/bin/tshark", stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
for line in p.stdout:
    event = decode_event(line)

tshark can also output pretty-printed JSON via the -T json switch (I show only the first packet; the output is a list):

[
  {
    "_index": "packets-2018-03-08",
    "_type": "pcap_file",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.interface_id": "0",
          "frame.encap_type": "1",
          "frame.time": "Mar 8, 2018 16:17:20.478658037 CET",
          "frame.offset_shift": "0.000000000",
          "frame.time_epoch": "1520522240.478658037",
          "frame.time_delta": "0.000113952",
          "frame.time_delta_displayed": "0.000113952",
          "frame.time_relative": "3.351515496",
          "frame.number": "11133",
          "frame.len": "60",
          "frame.cap_len": "60",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "eth:ethertype:ip:tcp"
        },
        "eth": {
          "eth.dst": "00:50:56:bb:40:70",
          "eth.dst_tree": { "eth.dst_resolved": "Vmware_bb:40:70", "eth.addr": "00:50:56:bb:40:70", "eth.addr_resolved": "Vmware_bb:40:70", "eth.lg": "0", "eth.ig": "0" },
          "eth.src": "64:a0:e7:42:af:41",
          "eth.src_tree": { "eth.src_resolved": "Cisco_42:af:41", "eth.addr": "64:a0:e7:42:af:41", "eth.addr_resolved": "Cisco_42:af:41", "eth.lg": "0", "eth.ig": "0" },
          "eth.type": "0x00000800",
          "eth.padding": "00:00:00:00:00:00"
        },
        "ip": {
          "ip.version": "4",
          "ip.hdr_len": "20",
          "ip.dsfield": "0x00000000",
          "ip.dsfield_tree": { "ip.dsfield.dscp": "0", "ip.dsfield.ecn": "0" },
          "ip.len": "40",
          "ip.id": "0x00005a57",
          "ip.flags": "0x00000002",
          "ip.flags_tree": { "ip.flags.rb": "0", "ip.flags.df": "1", "ip.flags.mf": "0" },
          "ip.frag_offset": "0",
          "ip.ttl": "125",
          "ip.proto": "6",
          "ip.checksum": "0x0000dd25",
          "ip.checksum.status": "2",
          "ip.src": "10.237.78.2",
          "ip.addr": "10.237.78.2",
          "ip.src_host": "10.237.78.2",
          "ip.host": "10.237.78.2",
          "ip.dst": "10.81.99.19",
          "ip.addr": "10.81.99.19",
          "ip.dst_host": "10.81.99.19",
          "ip.host": "10.81.99.19",
          "Source GeoIP: Unknown": "",
          "Destination GeoIP: Unknown": ""
        },
        "tcp": {
          "tcp.srcport": "31316",
          "tcp.dstport": "22",
          "tcp.port": "31316",
          "tcp.port": "22",
          "tcp.stream": "0",
          "tcp.len": "0",
          "tcp.seq": "3025",
          "tcp.ack": "774293",
          "tcp.hdr_len": "20",
          "tcp.flags": "0x00000010",
          "tcp.flags_tree": { "tcp.flags.res": "0", "tcp.flags.ns": "0", "tcp.flags.cwr": "0", "tcp.flags.ecn": "0", "tcp.flags.urg": "0", "tcp.flags.ack": "1", "tcp.flags.push": "0", "tcp.flags.reset": "0", "tcp.flags.syn": "0", "tcp.flags.fin": "0", "tcp.flags.str": "\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7A\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7\u00c2\u00b7" },
          "tcp.window_size_value": "2047",
          "tcp.window_size": "2047",
          "tcp.window_size_scalefactor": "-1",
          "tcp.checksum": "0x000073f4",
          "tcp.checksum.status": "2",
          "tcp.urgent_pointer": "0",
          "tcp.analysis": { "tcp.analysis.acks_frame": "11126", "tcp.analysis.ack_rtt": "0.000426928" }
        }
      }
    }
  },
  <next packet>

What is the correct way to parse such a stream?

When searching for stream parsing I found some libraries (NAYA in particular), but they require a file-like object.

StringIO() seems suitable, but I do not know how to tie it to stdout.

Following @omu_negru's request: in the case of NAYA specifically, attaching stdout directly, as in the code below, raises an exception:

import naya
import subprocess

def handle_message(event):
    print(event)

cmd = "/usr/bin/tshark -i eth0 -T json"
proc = subprocess.Popen(cmd, bufsize=0, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
messages = naya.stream_array(proc.stdout)
for message in messages:
    handle_message(message)
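One detail relevant to the "file-like object" question: a pipe opened by subprocess.Popen without further options is a binary stream, so it yields bytes rather than str. The following is a minimal sketch of wrapping such a pipe in a text-mode file-like object with io.TextIOWrapper. The child process here is a stand-in (a Python one-liner instead of tshark, an assumption made so the sketch runs anywhere), and whether this alone is enough for NAYA's stream_array is not verified here.

```python
import io
import json
import subprocess
import sys

# Stand-in for tshark (an assumption for this sketch): a child process
# that prints a small JSON array on stdout.
child_code = 'import json; print(json.dumps([{"n": 1}, {"n": 2}]))'
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE)

# proc.stdout is a binary file object; TextIOWrapper turns it into a
# text-mode file-like object whose read()/readline() return str.
text_stdout = io.TextIOWrapper(proc.stdout, encoding="utf-8")

first_char = text_stdout.read(1)   # a str, not bytes
rest = text_stdout.read()          # read until the child closes stdout
proc.wait()

packets = json.loads(first_char + rest)
print(packets)
```

Passing text=True (Python 3.7+) or universal_newlines=True to Popen has a similar effect without the explicit wrapper.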

2 answers:

Answer 0 (score: 1)

A version that actually works:

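#!/usr/bin/python3
# tshark.py
import json
import sys

stream = sys.stdin


def skip(stream):
    """Discard lines up to and including the next line that is just '{'.

    Returns False once the input is exhausted.
    """
    while True:
        line = stream.readline()
        if not line:  # EOF
            return False
        if line.strip() == '{':
            return True


if skip(stream):
    print("starting")
    acc = '{'
    while True:
        line = stream.readline()
        if not line:  # EOF: tshark has exited
            break
        if line.strip() == '':
            continue
        acc += line.strip()
        try:
            obj = json.loads(acc)
        except ValueError:
            # The packet object is not complete yet; keep accumulating.
            continue
        print(obj)
        if not skip(stream):
            break
        acc = '{'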

Run it with: sudo tshark -i wlp3s0 -T json | ./tshark.py
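The same accumulate-and-retry idea can be sketched as a generator over any iterable of lines, which makes it possible to try it on canned text without running tshark. The function name iter_packets and the sample text are hypothetical. The sample layout (each packet opening on a line that contains only "{", with a lone comma between packets) is an assumption based on what the script above expects.

```python
import json


def iter_packets(lines):
    """Yield one decoded object per top-level '{ ... }' block.

    Assumes every packet object starts on a line containing only '{'.
    """
    acc = None  # None means "waiting for the next opening brace"
    for line in lines:
        stripped = line.strip()
        if acc is None:
            if stripped == '{':
                acc = '{'
            continue
        if not stripped:
            continue
        acc += stripped
        try:
            obj = json.loads(acc)
        except ValueError:
            continue  # packet not complete yet
        yield obj
        acc = None


# Hypothetical, heavily shortened sample in the assumed layout.
sample = """Capturing on 'eth0'
[
  {
    "_index": "packets-2018-03-08",
    "_source": {
      "layers": {
        "frame": {
          "frame.number": "1"
        }
      }
    }
  }
  ,
  {
    "_index": "packets-2018-03-08",
    "_source": {
      "layers": {
        "frame": {
          "frame.number": "2"
        }
      }
    }
  }
]
"""

packets = list(iter_packets(sample.splitlines()))
print(len(packets))
```

Because the accumulator always starts with "{", json.loads can only succeed once the braces of the whole packet are balanced, so partial packets are never emitted.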


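Answer 1 (score: 0)

@omu_negru's answer gave me an idea, and I ended up using the following solution.

It is basically a continuous attempt to decode the JSON; once decoding succeeds, the result is an event that I process further (here it is only printed).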

import subprocess
import json


def handle_message(event):
    print(event)

cmd = "/usr/bin/tshark -n -T json not broadcast and not multicast"
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
# Skip the leading lines until the "[" that starts the JSON array.
for line in proc.stdout:
    if line.decode('utf-8').startswith('['):
        break

buffer = ""
for line in proc.stdout:
    text = line.decode('utf-8')
    # Skip empty lines and the separator lines between packets (a lone comma).
    if not text.strip(', \n'):
        continue
    buffer += text
    try:
        event = json.loads(buffer)
    except json.decoder.JSONDecodeError:
        # The packet is not complete yet; keep accumulating lines.
        pass
    else:
        handle_message(event)
        buffer = ""
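A variant of this approach that does not depend on how the output is split into lines can be sketched with the standard library's json.JSONDecoder.raw_decode, which decodes one value from the start of a string and reports where it ended. The function name iter_array_items and the sample text are hypothetical. The sketch assumes the array elements are objects (as tshark's packets are), since a bare number split across two chunks could be decoded too early, and it assumes that no "[" appears in the non-JSON preamble.

```python
import json


def iter_array_items(chunks):
    """Yield the elements of one top-level JSON array, read from an
    iterable of text chunks (lines, blocks, single characters...)."""
    decoder = json.JSONDecoder()
    buffer = ""
    started = False
    for chunk in chunks:
        buffer += chunk
        while True:
            if not started:
                start = buffer.find("[")
                if start == -1:
                    buffer = ""  # still in the non-JSON preamble
                    break
                buffer = buffer[start + 1:]
                started = True
            # Drop whitespace and the comma separating two elements.
            buffer = buffer.lstrip(" \t\r\n,")
            if not buffer:
                break
            if buffer[0] == "]":
                return  # end of the array
            try:
                item, end = decoder.raw_decode(buffer)
            except ValueError:
                break  # element incomplete: wait for more input
            yield item
            buffer = buffer[end:]


text = ('Capturing on eth0\n[\n  {"n": 1, "tags": ["a", "b"]},\n'
        '  {"n": 2}\n  ,\n  {"n": 3}\n]\n')
# Feed the text in 5-character chunks to mimic data arriving piecemeal.
chunks = [text[i:i + 5] for i in range(0, len(text), 5)]
items = list(iter_array_items(chunks))
print(items)
```

With the process from the code above, something like iter_array_items(line.decode('utf-8') for line in proc.stdout) would feed it the live output. Each element is removed from the buffer as soon as it is decoded, so the buffer never holds more than one pending packet.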