I need to create a pandas DataFrame from a list of nested dictionaries. Here is my data:
[
{
"id": "e182_1234",
"stderr": {
"type": "stderr",
"upload time": "Thu Jun 25 12:24:52 +0100 2020",
"length": 3000,
"contents": [
{
"date": "20/06/25",
"time": "12:19:39",
"type": "ERROR",
"text": "Exception found\njava.io.Exception:Not initated\n at.apache.java.org........",
"line_start": 12,
"line_end": 15
},
{
"date": "20/06/25",
"time": "12:20:41",
"type": "WARN",
"text": "Warning as the node is accessed without started",
"line_start": 17,
"line_end": 17
}
]
}
}
]
I tried to create the DataFrame with the following code:
df=pd.DataFrame(filtered_data) #filtered_data is the above dictionary
res1=df.join(pd.DataFrame(df.pop("stderr").tolist()))
res2=res1.join(pd.DataFrame(res1.pop("contents").tolist()))
These are the results I get:
#df=pd.DataFrame(filtered_data)
id stderr
0 e182_1234 {'type': 'stderr', 'upload time': 'Thu Jun 25 ...
#res1=df.join(pd.DataFrame(df.pop("stderr").tolist()))
id type upload time length contents
0 e182_1234 stderr Thu Jun 25 12:24:52 +0100 2020 3000 [{'date': '20/06/25', 'time': '12:19:39', 'typ...
#res2=res1.join(pd.DataFrame(res1.pop("contents").tolist()))
id type upload time length 0 1
0 e182_1234 stderr Thu Jun 25 12:24:52 +0100 2020 3000 {'date': '20/06/25', 'time': '12:19:39', 'type... {'date': '20/06/25', 'time': '12:20:41', 'type...
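When I expand the contents list, I end up with columns named 0 and 1, each holding a whole dictionary. I want those dictionaries split into separate columns such as date, time, type, text, line_start, and line_end, with one row per entry in contents.
Expected output: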
id type upload time length date time type text line_start line_end
0 e182_1234 stderr Thu Jun 25 12:24:52 +0100 2020 3000 20/06/25 12:19:39 ERROR Exception found\njava.io.Exception:Not initated\n at.apache.java.org........ 12 15
1 e182_1234 stderr Thu Jun 25 12:24:52 +0100 2020 3000 20/06/25 12:20:41 WARN Warning as the node is accessed without started 17 17
How can I solve this? Thanks in advance!
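Answer 0 (score: 2)
You can use json_normalize for this. record_path points at the nested list that should become rows, and meta lists the parent fields to repeat on every row: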
import json
import pandas as pd
# Read the JSON from the question (saved here as 1.json)
with open('1.json', 'r') as f:
    data = json.load(f)
df = pd.json_normalize(data, record_path=['stderr', 'contents'], meta=[['id'], ['stderr', 'type']])
print(df)
date time type text line_start line_end id stderr.type
0 20/06/25 12:19:39 ERROR Exception found\njava.io.Exception:Not initate... 12 15 e182_1234 stderr
1 20/06/25 12:20:41 WARN Warning as the node is accessed without started 17 17 e182_1234 stderr
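To get closer to the expected output in the question, the same call can probably be extended so that "upload time" and "length" are carried along as well. The following is a minimal sketch, assuming the data is already in memory as filtered_data (as in the question) rather than read from a file; the text of the first record is shortened here, and the meta_cols helper list is introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the question; the first "text" value is shortened.
filtered_data = [
    {
        "id": "e182_1234",
        "stderr": {
            "type": "stderr",
            "upload time": "Thu Jun 25 12:24:52 +0100 2020",
            "length": 3000,
            "contents": [
                {"date": "20/06/25", "time": "12:19:39", "type": "ERROR",
                 "text": "Exception found", "line_start": 12, "line_end": 15},
                {"date": "20/06/25", "time": "12:20:41", "type": "WARN",
                 "text": "Warning as the node is accessed without started",
                 "line_start": 17, "line_end": 17},
            ],
        },
    }
]

# One row per item in stderr.contents; the meta fields are repeated on each row.
df = pd.json_normalize(
    filtered_data,
    record_path=["stderr", "contents"],
    meta=["id", ["stderr", "type"], ["stderr", "upload time"], ["stderr", "length"]],
)

# Move the parent-level columns to the front (meta_cols is an illustrative name).
meta_cols = ["id", "stderr.type", "stderr.upload time", "stderr.length"]
df = df[meta_cols + [c for c in df.columns if c not in meta_cols]]
print(df)
```

The meta columns keep their "stderr." prefix, which is what stops the outer type from colliding with the per-record type column. They can be renamed afterwards with df.rename if shorter names are preferred, as long as the two type columns stay distinct.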