How to parse an invalid JSON file in Python

Posted: 2015-07-22 18:34:58

Tags: python json parsing dictionary

The JSON file I want to parse looks like this:

{
   "container_header_255_2013-12-31 16:00:45": {
  "fw_package_version": "255.255.255X255", 
  "start_timestamp": 1388534445, 
  "start_timestr": "2013-12-31 16:00:45", 
  "end_timestamp": 4294967295, 
  "end_timestr": "2106-02-06 22:28:15", 
  "length": 65535, 
  "product": "UNKNOWN", 
  "hw_version": "UNKNOWN"
   },
   "log_packet_debug_1388534445_2013-12-31 16:00:45": {
  "timestamp": 1388534445, 
  "timestr": "2013-12-31 16:00:45", 
  "log_level": "DBG", 
  "log_id": "0xC051", 
  "log_string": "DBG_STORAGE_LOG", 
  "file_name_line": "storage_data.c733", 
  "message": "Mark as Erasable: 231 238"
  },
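Sorry, the indentation may be a bit off. In any case, all the examples I have seen online involve lists, and for some reason this file contains only dictionaries.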

1 answer:

Answer 0 (score: 2)

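You can use the splitstream module (disclaimer: I wrote it) (pip install splitstream). It has a startdepth argument intended specifically for parsing XML/JSON streams that are not yet terminated or are "infinite" (such as log files).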

from splitstream import splitfile
from StringIO import StringIO
import json

jsonfile = StringIO(""".....""") # your neverending JSON-sorta logfile
# Probably, you want something like this instead
#jsonfile = file("/var/log/my/log.json", "r")

# startdepth is the magic argument here: it starts splitting at depth = 1
for s in splitfile(jsonfile, format="json", startdepth=1):
  print "JSON",json.loads(s)

This gives:

JSON {u'start_timestamp': 1388534445, u'hw_version': u'UNKNOWN', u'fw_package_version': u'255.255.255X255', u'product': u'UNKNOWN', u'end_timestr': u'2106-02-06 22:28:15', u'length': 65535, u'start_timestr': u'2013-12-31 16:00:45', u'end_timestamp': 4294967295}
JSON {u'file_name_line': u'storage_data.c733', u'log_level': u'DBG', u'log_id': u'0xC051', u'timestamp': 1388534445, u'timestr': u'2013-12-31 16:00:45', u'log_string': u'DBG_STORAGE_LOG', u'message': u'Mark as Erasable: 231 238'}
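
If installing a third-party module is not an option, the same idea (skip the outermost brace and decode each top-level entry separately) can probably be approximated with the standard library alone. The following is only a minimal sketch, not part of splitstream: the helper name iter_top_level_items is made up for illustration, and it assumes the whole log fits in memory as one string. It relies on json.JSONDecoder.raw_decode, which decodes one JSON value starting at a given index and returns the value together with the index where it stopped.

```python
import json


def iter_top_level_items(text):
    """Yield (key, value) pairs from a JSON object whose closing brace may be missing.

    Hypothetical helper for illustration; assumes the whole log is in `text`.
    """
    decoder = json.JSONDecoder()
    pos = text.index("{") + 1  # step inside the outermost object
    end = len(text)
    while True:
        # Skip whitespace and the commas that separate entries.
        while pos < end and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= end or text[pos] == "}":
            return
        try:
            # Decode the key (a JSON string) starting at pos.
            key, pos = decoder.raw_decode(text, pos)
            # Skip the colon and surrounding whitespace.
            while pos < end and text[pos] in " \t\r\n:":
                pos += 1
            # Decode the value that belongs to this key.
            value, pos = decoder.raw_decode(text, pos)
        except ValueError:
            # The last entry is truncated (for example, the log is still
            # being written); stop instead of raising.
            return
        yield key, value


if __name__ == "__main__":
    # A shortened, unterminated sample in the same shape as the log above.
    sample = '''{
       "container_header": {"length": 65535, "product": "UNKNOWN"},
       "log_packet_debug": {"log_level": "DBG", "log_id": "0xC051"},
    '''
    for key, value in iter_top_level_items(sample):
        print(key, value)
```

Unlike splitstream, this sketch does not stream from a file object, so for very large or continuously growing logs the module above is likely the better fit.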