I have a JSON file with the following nested structure.
[
  {
    "unitCode": "ABCD",
    "bedType": "Adult MT/MS",
    "census": 13,
    "subCensus": null,
    "censusDetails": [],
    "occupancy": 62,
    "occupancyStar": null,
    "occupancyAlertStatus": null,
    "columns": [
      {
        "id": "blockedBeds",
        "value": "1",
        "hoverDetails": [
          {
            "id": "bedName",
            "value": "23_1"
          }
        ]
      },
      {
        "id": "unOccupied",
        "value": "2",
        "hoverDetails": [
          {
            "id": "bedName",
            "value": "20a_2"
          },
          {
            "id": "bedName",
            "value": "22a_1"
          }
        ]
      }
    ],
    "codeEvents": null,
    "codeEventDetails": null
  },
  {
    "unitCode": "EFGH",
    "bedType": "Adult MT/MS",
    "census": 14,
    "subCensus": null,
    "censusDetails": [],
    "occupancy": 61,
    "occupancyStar": null,
    "occupancyAlertStatus": null,
    "columns": [
      {
        "id": "blockedBeds",
        "value": "1",
        "hoverDetails": [
          {
            "id": "bedName",
            "value": "52_2"
          }
        ]
      },
      {
        "id": "unOccupied",
        "value": "1",
        "hoverDetails": [
          {
            "id": "bedName",
            "value": "53_1"
          }
        ]
      }
    ],
    "codeEvents": null,
    "codeEventDetails": null
  }
]
I am trying to flatten this file and convert it to a DataFrame using json_normalize.
Here is my code:
testhover = json_normalize(data, ['columns'],['unitCode'])
The DataFrame I get looks like this:
id | value | hoverDetails | unitCode
0 blockedBeds | 1 | [{'id': 'bedName', 'value': '23_1'}] | ABCD
1 unOccupied | 2 | [{'id': 'bedName', 'value': '20a_2'}, {'id': '...' | ABCD
2 blockedBeds | 1 | [{'id': 'bedName', 'value': '52_2'}] | EFGH
3 unOccupied | 1 | [{'id': 'bedName', 'value': '53_1'}] | EFGH
I need it in the following format:
blockedBeds | unOccupied | unitCode
0 | '23_1' | NaN | ABCD
1 | NaN | '20a_2' | ABCD
2 | NaN | '22a_1' | ABCD
3 | '52_2' | NaN | EFGH
4 | NaN | '53_1' | EFGH
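I can't seem to get at the nested bed data. Any help would be greatly appreciated.
Answer 0 (score: 3)
Build a list of dictionaries in a loop, one dictionary per bed, and create the DataFrame from that list.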
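import pandas as pd

# parsed_json is the list loaded from the JSON file (e.g. with json.load)
vals = []
for item in parsed_json:
    unit_code = item['unitCode']
    for col in item['columns']:
        # one row per bed: the column id becomes the key, the bed name the value
        for hd in col['hoverDetails']:
            vals.append({'unitCode': unit_code,
                         col['id']: hd['value']})
pd.DataFrame(vals)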
Output
unitCode blockedBeds unOccupied
0 ABCD 23_1 NaN
1 ABCD NaN 20a_2
2 ABCD NaN 22a_1
3 EFGH 52_2 NaN
4 EFGH NaN 53_1
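If you would rather stay with json_normalize, here is a minimal sketch of one possible alternative, not something from the answer above. It assumes pandas 1.0 or later (for `pd.json_normalize`) and that every `hoverDetails` list is non-empty. The `data` literal is a trimmed copy of the JSON in the question, and the names `flat`, `wide` and the `bed` column are made up for illustration.

```python
import pandas as pd

# Trimmed copy of the question's JSON (only the keys used here are kept).
data = [
    {"unitCode": "ABCD", "columns": [
        {"id": "blockedBeds", "value": "1",
         "hoverDetails": [{"id": "bedName", "value": "23_1"}]},
        {"id": "unOccupied", "value": "2",
         "hoverDetails": [{"id": "bedName", "value": "20a_2"},
                          {"id": "bedName", "value": "22a_1"}]},
    ]},
    {"unitCode": "EFGH", "columns": [
        {"id": "blockedBeds", "value": "1",
         "hoverDetails": [{"id": "bedName", "value": "52_2"}]},
        {"id": "unOccupied", "value": "1",
         "hoverDetails": [{"id": "bedName", "value": "53_1"}]},
    ]},
]

# Same call as in the question: one row per entry in "columns".
flat = pd.json_normalize(data, record_path=["columns"], meta=["unitCode"])

# Turn each element of the hoverDetails list into its own row.
flat = flat.explode("hoverDetails").reset_index(drop=True)

# Each hoverDetails cell is now a single dict; keep only the bed name.
# (Assumption: no empty lists, otherwise the cell would be NaN here.)
flat["bed"] = flat["hoverDetails"].map(lambda d: d["value"])

# Spread the "id" values into columns, keeping one row per bed.
wide = flat.pivot(columns="id", values="bed")
wide.columns.name = None
wide["unitCode"] = flat["unitCode"]

print(wide)
```

The explicit loop in the answer is probably easier to read for a structure this small; the explode/pivot route may be handier if you already have the json_normalize result and want to keep working on it.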