Suppose I have an API response:
{
"fact": {
"UP": [{
"SCODE": "CNB",
"SNAME": "Kanpur Central"
}, {
"SCODE": "JHS",
"SNAME": "Jhansi Junction"
}],
"MP": [{
"SCODE": "BPL",
"SNAME": "Bhopal Junction"
}, {
"SCODE": "JBP",
"SNAME": "Jabalpur Junction"
}]
}
}
I need to convert it into a DataFrame like the following (expected output):
fact SCODE SNAME
UP CNB Kanpur Central
UP JHS Jhansi Junction
MP BPL Bhopal Junction
MP JBP Jabalpur Junction
My attempt: I tried json_normalize(), but it did not produce the expected output:
pd.json_normalize(response).apply(pd.Series.explode)
Answer 0 (score: 5)
One option is to reshape the data in plain Python:
df = pd.DataFrame([{'fact': k, **item}
for k, lst in response['fact'].items()
for item in lst])
fact SCODE SNAME
0 UP CNB Kanpur Central
1 UP JHS Jhansi Junction
2 MP BPL Bhopal Junction
3 MP JBP Jabalpur Junction
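For reuse, the same comprehension can be wrapped in a small helper. This is only a sketch: the function name flatten_groups and its parameters are hypothetical, not part of pandas or of the original answer, and it assumes every value under response['fact'] is a list of flat dicts.

```python
import pandas as pd

response = {
    "fact": {
        "UP": [{"SCODE": "CNB", "SNAME": "Kanpur Central"},
               {"SCODE": "JHS", "SNAME": "Jhansi Junction"}],
        "MP": [{"SCODE": "BPL", "SNAME": "Bhopal Junction"},
               {"SCODE": "JBP", "SNAME": "Jabalpur Junction"}],
    }
}


def flatten_groups(groups, key_name):
    """Turn {key: [record, ...]} into flat records tagged with their key.

    Hypothetical helper: assumes each value is a list of dicts.
    """
    return [{key_name: key, **record}
            for key, records in groups.items()
            for record in records]


# Each record gains a 'fact' entry holding the outer key it came from.
df = pd.DataFrame(flatten_groups(response["fact"], "fact"))
print(df)
```

A key whose list is empty contributes no rows, because the inner loop never runs for it.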
A pandas option, using explode + apply(pd.Series):
df = (
pd.DataFrame(response)['fact']
.explode()
.apply(pd.Series)
.rename_axis('fact')
.reset_index()
)
fact SCODE SNAME
0 MP BPL Bhopal Junction
1 MP JBP Jabalpur Junction
2 UP CNB Kanpur Central
3 UP JHS Jhansi Junction
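Note that the output above lists MP before UP, which differs from the expected order. If the order of the response matters, one possible variant is to start from a Series built directly on response['fact'] and to build the columns from the exploded dicts in a single DataFrame call instead of apply(pd.Series). This is a sketch under assumptions: it relies on a pandas version that has Series.explode (0.25 or later) and on a Series built from a dict keeping the dict's insertion order.

```python
import pandas as pd

response = {
    "fact": {
        "UP": [{"SCODE": "CNB", "SNAME": "Kanpur Central"},
               {"SCODE": "JHS", "SNAME": "Jhansi Junction"}],
        "MP": [{"SCODE": "BPL", "SNAME": "Bhopal Junction"},
               {"SCODE": "JBP", "SNAME": "Jabalpur Junction"}],
    }
}

# One row per station dict; the index repeats the outer key (UP, UP, MP, MP).
exploded = pd.Series(response["fact"]).explode()

df = (
    # Build the columns from the dicts in one call, keeping the key index.
    pd.DataFrame(exploded.tolist(), index=exploded.index)
      .rename_axis("fact")
      .reset_index()
)
print(df)
```

Building the frame from a list of dicts avoids creating one intermediate Series per row, which is what apply(pd.Series) does.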
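Answer 1 (score: 1)
You need to reshape response first. json_normalize works with a list of dicts, and fact must be included in each of them:
new_response = [{"fact": rfact, **r} for rfact in response["fact"] for r in response["fact"][rfact]]
Finally, you just need to apply the function: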
final_result = pd.json_normalize(new_response)
fact SCODE SNAME
0 UP CNB Kanpur Central
1 UP JHS Jhansi Junction
2 MP BPL Bhopal Junction
3 MP JBP Jabalpur Junction
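A related variant lets json_normalize do the unnesting itself through its record_path and meta parameters. This is a sketch, not the answer's method: the intermediate key name "stations" is made up for illustration, and the final column selection is only there to put fact first, since meta columns are appended after the record columns.

```python
import pandas as pd

response = {
    "fact": {
        "UP": [{"SCODE": "CNB", "SNAME": "Kanpur Central"},
               {"SCODE": "JHS", "SNAME": "Jhansi Junction"}],
        "MP": [{"SCODE": "BPL", "SNAME": "Bhopal Junction"},
               {"SCODE": "JBP", "SNAME": "Jabalpur Junction"}],
    }
}

# One record per outer key; "stations" is a hypothetical name for the nested list.
records = [{"fact": key, "stations": stations}
           for key, stations in response["fact"].items()]

# record_path points at the nested list, meta carries the outer key onto each row.
df = pd.json_normalize(records, record_path="stations", meta=["fact"])

# Reorder so that 'fact' comes first, as in the expected output.
df = df[["fact", "SCODE", "SNAME"]]
print(df)
```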
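Answer 2 (score: 1)
This is not as efficient as working directly on the dictionary (the selected answer does that well), but it is another option: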
data = {
"fact": {
"UP": [{
"SCODE": "CNB",
"SNAME": "Kanpur Central"
}, {
"SCODE": "JHS",
"SNAME": "Jhansi Junction"
}],
"MP": [{
"SCODE": "BPL",
"SNAME": "Bhopal Junction"
}, {
"SCODE": "JBP",
"SNAME": "Jabalpur Junction"
}]
}
}
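import pandas as pd
from pandas import json_normalize as jn  # jn is shorthand for pd.json_normalize

keys = list(data['fact'])  # outer keys: ['UP', 'MP']
# Normalize each group separately, then stack them with the group key as the outer index level
(pd.concat([jn(data['fact'][key]) for key in keys],
           keys=keys)
   .droplevel(-1)               # drop the per-group row numbers
   .rename_axis(index='fact')
   .reset_index()
)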
fact SCODE SNAME
0 UP CNB Kanpur Central
1 UP JHS Jhansi Junction
2 MP BPL Bhopal Junction
3 MP JBP Jabalpur Junction