My file contains multiple JSON records, as shown below.
Input file:
{"timestamp":1487271527,"user":"Dave","action":"browse"}
{"timestamp":1487271528,"user":"Dave","action":"navigate"}
{"timestamp":1487271529,"user":"Dave","action":"browse"}
{"timestamp":1487271530,"user":"Dave","action":"view"}
{"timestamp":1487271531,"user":"Dave","action":"browse"}
{"timestamp":1487271532,"user":"Dave","action":"browse"}
{"timestamp":1487271533,"user":"Dave","action":"browse"}
{"timestamp":1487271534,"user":"Dave","action":"navigate"}
I want to load this data into a dictionary, the way json.load does.
How can I do that?
With json.load I get the following error:
Traceback (most recent call last):
  File "C:/Users/lenovo/AppData/Local/Programs/Python/Python36-32/Granular.py", line 5, in <module>
    input_data = json.load(open(r"C:\Users\lenovo\Desktop\nlp\input.txt",'r'))
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python36-32\lib\json\decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 57)
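Answer 0 (score: 1)
Each line of the sample file is a separate JSON structure. You may want to make that explicit in the file name's extension; for example, you could use lsjson, for line-separated JSON.
In that case, you have to read each line as a string and unmarshal it into a Python dict: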
import json

with open('example.lsjson') as fh:
    # Each line holds one complete JSON object; blank lines are skipped.
    data = [json.loads(line) for line in fh if line.strip()]
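You end up with a list of dicts (the u'' prefixes in the output below come from Python 2; Python 3 prints plain strings):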
from pprint import pprint
pprint(data)
[{u'action': u'browse', u'timestamp': 1487271527, u'user': u'Dave'},
{u'action': u'navigate', u'timestamp': 1487271528, u'user': u'Dave'},
{u'action': u'browse', u'timestamp': 1487271529, u'user': u'Dave'},
{u'action': u'view', u'timestamp': 1487271530, u'user': u'Dave'},
{u'action': u'browse', u'timestamp': 1487271531, u'user': u'Dave'},
{u'action': u'browse', u'timestamp': 1487271532, u'user': u'Dave'},
{u'action': u'browse', u'timestamp': 1487271533, u'user': u'Dave'},
{u'action': u'navigate', u'timestamp': 1487271534, u'user': u'Dave'}]
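If the file is large or may contain a bad record, a generator that parses one line at a time can be handy. The following is only a sketch of that idea: the name parse_json_lines and the error message format are made up for illustration, and it assumes every record sits on exactly one line.

```python
import json


def parse_json_lines(lines):
    """Yield one parsed object per non-empty line (hypothetical helper)."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the line number so a bad record is easy to locate.
            raise ValueError(f"invalid JSON on line {lineno}: {exc}") from exc


# In-memory stand-in for the input file; an open file object works the same way.
sample = [
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n',
    '\n',
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n',
]
records = list(parse_json_lines(sample))
print(records)
```

With a real file, the call would look like `with open('example.lsjson') as fh: data = list(parse_json_lines(fh))`, since a file object is itself an iterable of lines.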
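Answer 1 (score: 0)
Your file is missing the commas between records and the enclosing brackets that json.load needs in order to read it as an array. Just load the file as a string first, then add the proper punctuation: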
with open('test.json', 'r') as fp:
    # Strip the trailing newline first; otherwise the replace below leaves a trailing comma.
    document = fp.read().strip()
document = '[' + document.replace('\n', ',\n') + ']'
Then write the result to another file:
with open('test2.json', 'w') as fp:
    fp.write(document)
You can then open it with json.load:
import json

with open('test2.json', 'r') as fp:
    content = json.load(fp)
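The intermediate file is not strictly needed, because json.loads can parse the assembled string directly. A minimal sketch of that variant follows; the helper name load_as_array is made up for illustration, and it assumes each record occupies exactly one line (a record spanning several lines would be broken by the inserted commas).

```python
import json


def load_as_array(text):
    """Wrap line-delimited JSON text in brackets and parse it as one array."""
    # Keep only non-blank lines so a trailing newline cannot produce a stray comma.
    lines = [line for line in text.splitlines() if line.strip()]
    return json.loads("[" + ",".join(lines) + "]")


# In-memory stand-in for the contents of test.json.
sample = (
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n'
    '{"timestamp":1487271528,"user":"Dave","action":"navigate"}\n'
)
content = load_as_array(sample)
print(content)
```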
Answer 2 (score: 0)
Just read the file as a plain text file, use the json.loads method to convert each line into a JSON object, and append it to a list.
Example:
import json

path = "PATH/TO/INPUT_FILE/data.json"
data = []
with open(path, "r") as infile:
    for line in infile:
        # Parse each non-blank line as its own JSON document.
        if line.strip():
            data.append(json.loads(line))
print(data)
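If the records are not guaranteed to be one per line (for example, several objects on one line, or one object pretty-printed across several lines), splitting on newlines no longer works. One possible alternative, sketched here with a made-up helper name decode_concatenated, is the standard library's json.JSONDecoder.raw_decode, which parses one value starting at a given index and returns the index where that value ended.

```python
import json


def decode_concatenated(text):
    """Parse back-to-back JSON values separated only by whitespace."""
    decoder = json.JSONDecoder()
    results = []
    idx = 0
    length = len(text)
    while idx < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while idx < length and text[idx].isspace():
            idx += 1
        if idx >= length:
            break
        # Returns the parsed value and the index just past it.
        obj, idx = decoder.raw_decode(text, idx)
        results.append(obj)
    return results


# In-memory stand-in for the file contents.
sample = (
    '{"timestamp":1487271527,"user":"Dave","action":"browse"}\n'
    '{"timestamp":1487271528,\n "user":"Dave","action":"navigate"}\n'
)
data = decode_concatenated(sample)
print(data)
```

This reads the whole file into memory first, so for very large line-delimited files the per-line loop above is likely the better fit.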