Parsing a file with an improper JSON format in Python

Time: 2018-02-02 04:28:34

Tags: python json python-3.x file format

My file contains multiple JSON records, as shown below:

Input file:

{"timestamp":1487271527,"user":"Dave","action":"browse"}
{"timestamp":1487271528,"user":"Dave","action":"navigate"}
{"timestamp":1487271529,"user":"Dave","action":"browse"}
{"timestamp":1487271530,"user":"Dave","action":"view"}
{"timestamp":1487271531,"user":"Dave","action":"browse"}
{"timestamp":1487271532,"user":"Dave","action":"browse"}
{"timestamp":1487271533,"user":"Dave","action":"browse"}
{"timestamp":1487271534,"user":"Dave","action":"navigate"}

I want to load this data into a dictionary, the way json.load would.

How can I do this?

Using json.load, I get the following error:

Traceback (most recent call last):
  File "C:/Users/lenovo/AppData/Local/Programs/Python/Python36-32/Granular.py", line 5, in <module>
    input_data = json.load(open(r"C:\Users\lenovo\Desktop\nlp\input.txt",'r'))
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 57)

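3 answers:

Answer 0: (score: 1)

Each line in the sample file is a separate JSON structure. You may want to make that explicit in the file extension; for example, you could use lsjson to stand for line-separated JSON.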
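To illustrate why json.load rejects such a file, here is a minimal sketch using two records in the same layout as the input above. The variable names (text, error, first) are only illustrative. Parsing the whole text fails with "Extra data" at the start of line 2, while a single line parses fine on its own.

```python
import json

# Two records in the same layout as the input file, one JSON object per line.
text = (
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n'
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n'
)

# Parsing the whole text at once fails: the decoder finishes the first object
# and then finds more data where it expects the end of the document.
try:
    json.loads(text)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# A single line taken on its own is a valid JSON document.
first = json.loads(text.splitlines()[0])
```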

In that case, you have to read each line as a string and unmarshal it into a Python dict:

import json

with open('example.lsjson') as fh:
    # A file object yields its lines directly, so readlines() is not needed
    data = [json.loads(line) for line in fh]

You end up with a list of dicts:

from pprint import pprint
pprint(data)
[{u'action': u'browse', u'timestamp': 1487271527, u'user': u'Dave'},
 {u'action': u'navigate', u'timestamp': 1487271528, u'user': u'Dave'},
 {u'action': u'browse', u'timestamp': 1487271529, u'user': u'Dave'},
 {u'action': u'view', u'timestamp': 1487271530, u'user': u'Dave'},
 {u'action': u'browse', u'timestamp': 1487271531, u'user': u'Dave'},
 {u'action': u'browse', u'timestamp': 1487271532, u'user': u'Dave'},
 {u'action': u'browse', u'timestamp': 1487271533, u'user': u'Dave'},
 {u'action': u'navigate', u'timestamp': 1487271534, u'user': u'Dave'}]
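If the file may contain blank lines or an occasional malformed record, a slightly more defensive variant can help. The following is only a sketch: parse_json_lines is a hypothetical helper name, and wrapping the error in a ValueError that carries the line number is an assumption about what would be useful, not part of the json module.

```python
import json

def parse_json_lines(lines):
    """Parse an iterable of strings, one JSON document per non-blank line."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. at the end of the file
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Report which line of the input was malformed
            raise ValueError(f"line {lineno}: {exc}") from exc
    return records

# Any iterable of strings works; an open file object can be passed the same way.
sample = [
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n',
    '\n',
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n',
]
data = parse_json_lines(sample)
```

With a real file this would be called as parse_json_lines(fh) inside a with open(...) as fh: block.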

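Answer 1: (score: 0)

Your file is missing the commas and the enclosing brackets that json.load needs in order to read it as an array. Just load the file as a string first and add the proper punctuation: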

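with open('test.json', 'r') as fp:
    # strip() drops the trailing newline so no comma is added after the last record
    document = fp.read().strip()
# Join the records with commas and wrap them in brackets to form a JSON array
document = '[' + document.replace('\n', ',\n') + ']'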

Then write the result to another file:

with open('test2.json', 'w') as fp:
    fp.write(document)

You can then open it with json.load:

import json

with open('test2.json', 'r') as fp:
    content = json.load(fp)
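As an alternative that avoids rewriting the file, the records can be decoded one after another with json.JSONDecoder.raw_decode, which returns the parsed object together with the index where it stopped. This is a different technique from the bracket-and-comma approach above, shown as a sketch; decode_concatenated is a hypothetical helper name.

```python
import json

def decode_concatenated(text):
    """Decode every JSON document found back to back in `text`."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < length and text[pos].isspace():
            pos += 1
        if pos == length:
            break
        # Returns the decoded object and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records

text = (
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n'
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n'
)
content = decode_concatenated(text)
```

Because it follows the decoder's own end position instead of newlines, this also handles records that share a line or span several lines.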

Answer 2: (score: 0)

Just read the file as a plain text file, then use the json.loads method to convert each line into a JSON object and append it to a list.

Ex:

import json
path = "PATH/TO/INPUT_FILE/data.json"

data = []
with open(path, "r") as infile:
    # Parse each line as its own JSON document
    for line in infile:
        data.append(json.loads(line))

# print is a function in Python 3
print(data)
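For large files it may be preferable not to hold every record in a list. The sketch below reads lazily with a generator and tallies the action field as it goes. iter_json_lines is a hypothetical helper name, and counting actions is just an illustrative use of the records.

```python
import json
from collections import Counter

def iter_json_lines(lines):
    """Yield one parsed object per non-blank line, without building a list."""
    for line in lines:
        if line.strip():
            yield json.loads(line)

# Any iterable of strings works; an open file object can be passed the same way.
sample = [
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n',
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n',
    '{"timestamp":1487271529,"user":"Dave","action":"browse"}\n',
]

# Records are consumed one at a time while counting
action_counts = Counter(record["action"] for record in iter_json_lines(sample))
```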