Here is the output of my DataFrame:
0 1
0 {"time": "2016-03-28T23:23:12Z" "target": "Raffi-Antilian"}
1 {"time": "2016-03-28T23:23:12Z" "target": "Caroline-Kaiser"}
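How can I convert each record from a dictionary into a regular DataFrame row, with the dictionary keys as the column names and the dictionary values as the row values? My desired output is: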
Time Target
0 2016-03-28T23:23:12Z Raffi-Antilian
1 2016-03-28T23:23:12Z Caroline-Kaiser
I have about 2,000 records; any help or guidance is appreciated.
Answer 0 (score: 3)
import json
import pandas as pd

data = []
# Each line of the file holds one JSON object; parse them one by one.
with open('filename', 'r') as f:
    for line in f:
        data.append(json.loads(line))
pd.DataFrame(data)
gives
Out[49]:
target time
0 Raffi-Antilian 2016-03-28T23:23:12Z
1 Caroline-Kaiser 2016-03-28T23:23:12Z
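For a file laid out this way (one JSON object per line), pandas can also parse it directly with read_json and lines=True. This is only a minimal sketch, assuming every line is valid JSON; the sample string and io.StringIO stand in for the real file and are not part of the original answer.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real file: one JSON object per line.
sample = (
    '{"time": "2016-03-28T23:23:12Z", "target": "Raffi-Antilian"}\n'
    '{"time": "2016-03-28T23:23:12Z", "target": "Caroline-Kaiser"}\n'
)

# lines=True makes read_json treat every line as a separate JSON record.
# convert_dates=False switches off automatic date conversion.
df = pd.read_json(io.StringIO(sample), lines=True, convert_dates=False)
print(df)
```

After testing, io.StringIO(sample) can be swapped for the path to the actual file.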
Answer 1 (score: 1)
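If the file does not contain ';', you can use read_csv with sep=';' so that all the data ends up in a single Series. Then convert each string to a dictionary with ast.literal_eval, and finally use the pd.DataFrame constructor: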
import pandas as pd
import ast
import io
temp=u"""{"time": "2016-03-28T23:23:12Z","target": "Raffi-Antilian"}
{"time": "2016-03-28T23:23:12Z","target": "Caroline-Kaiser"}"""
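# after testing, replace io.StringIO(temp) with the filename
# squeeze("columns") turns the single-column result into a Series
# (the squeeze=True argument of read_csv was removed in pandas 2.0)
s = pd.read_csv(io.StringIO(temp), index_col=None, header=None, sep=';').squeeze("columns")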
print (s)
0 {"time": "2016-03-28T23:23:12Z","target": "Raf...
1 {"time": "2016-03-28T23:23:12Z","target": "Car...
Name: 0, dtype: object
L = s.apply(lambda x: ast.literal_eval(x)).tolist()
print (L)
[{'time': '2016-03-28T23:23:12Z', 'target': 'Raffi-Antilian'},
{'time': '2016-03-28T23:23:12Z', 'target': 'Caroline-Kaiser'}]
print (pd.DataFrame(L))
target time
0 Raffi-Antilian 2016-03-28T23:23:12Z
1 Caroline-Kaiser 2016-03-28T23:23:12Z
Edit:
Another one-line solution:
import pandas as pd
import json
print (pd.DataFrame([json.loads(line.strip()) for line in open('file.txt')]))
target time
0 Raffi-Antilian 2016-03-28T23:23:12Z
1 Caroline-Kaiser 2016-03-28T23:23:12Z
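The question asks for the columns in the order Time, Target with capitalised headers, whereas the outputs above show lowercase names. One possible way to get that layout is to select the columns explicitly and rename them. This is a sketch only: the records list is a stand-in for the parsed lines, and the capitalised names are taken from the desired output in the question.

```python
import pandas as pd

# Stand-in for the list of dictionaries produced by json.loads / ast.literal_eval.
records = [
    {"time": "2016-03-28T23:23:12Z", "target": "Raffi-Antilian"},
    {"time": "2016-03-28T23:23:12Z", "target": "Caroline-Kaiser"},
]

df = pd.DataFrame(records)

# Fix the column order explicitly, then rename to the headers from the question.
df = df[["time", "target"]].rename(columns={"time": "Time", "target": "Target"})
print(df)
```

Selecting the columns by name keeps the result independent of whatever column order the DataFrame constructor produced.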