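I have some sample data, shown below. The attributes sit under the "data" key. Under "XXXX" I have the value "Naveen", and under "YYYYY" I have the values "Kumar" and "Rajesh". I am trying to get an output of 2 records using the code below.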
{
    "data": [
        {
            "Empid": "1234",
            "Empname": "ABC",
            "data1": {
                "XXXX": [
                    {
                        "relative": {
                            "id": "Naveen"
                        }
                    }
                ],
                "YYYYY": [
                    {
                        "relative": {
                            "id": "Kumar"
                        }
                    },
                    {
                        "relative": {
                            "id": "Rajesh"
                        }
                    }
                ]
            }
        }
    ]
}
Here is the code I am trying:
import pandas as pd

# json_file holds the parsed JSON shown above
df = pd.DataFrame()
for i in range(len(json_file['data'])):
    temp = {}
    temp['Empid'] = json_file['data'][i]['Empid']
    temp['EmpName'] = json_file['data'][i]['Empname']
    for key in json_file['data'][i]['data1'].keys():
        try:
            for j in range(len(json_file['data'][i]['data1'][key])):
                temp[key] = json_file['data'][i]['data1'][key][j]['relative']['id']
        except:
            temp[key] = None
    temp_df = pd.DataFrame([temp])
    df = pd.concat([df, temp_df], sort=True)
The final output I want to achieve:
EmpID  EmpName  XXXX    YYYYY
1234   ABC      Naveen  Kumar
1234   ABC      NaN     Rajesh
But I only get 1 record:
EmpID  EmpName  XXXX    YYYYY
1234   ABC      Naveen  Rajesh
Any suggestions would be appreciated.
Answer 0 (score: 0)
A long solution that modifies your code: add one more loop, change the indexing, and modify the argument passed to range:
df = pd.DataFrame()
# Number of output rows = length of the longest list under data1
num = max([len(v) for k, v in json_file['data'][0]['data1'].items()])
for i in range(num):
    temp = {}
    temp['Empid'] = json_file['data'][0]['Empid']
    temp['Empname'] = json_file['data'][0]['Empname']
    for key in json_file['data'][0]['data1'].keys():
        if key not in temp:
            temp[key] = []
        try:
            # Collect every id for this key into a list
            for j in range(len(json_file['data'][0]['data1'][key])):
                temp[key].append(json_file['data'][0]['data1'][key][j]['relative']['id'])
        except (KeyError, TypeError):
            temp[key] = None
    temp_df = pd.DataFrame([temp])
    df = pd.concat([df, temp_df], ignore_index=True)
# Flatten each list column and drop repeats so the values spread across the rows
for i in json_file['data'][0]['data1'].keys():
    df[i] = pd.Series([x for y in df[i].tolist() for x in y]).drop_duplicates()
Now:
print(df)
gives:
  Empid Empname    XXXX   YYYYY
0  1234     ABC  Naveen    Kumar
1  1234     ABC     NaN  Rajesh
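As a shorter alternative (not part of the answer above), the same row layout can be sketched by pairing the id lists position by position with itertools.zip_longest, which pads the shorter list with None. This is a minimal sketch that assumes every entry under data1 is a list of {"relative": {"id": ...}} items, as in the sample. The names rows, id_lists, and result are illustrative only. Unlike the drop_duplicates step, it does not discard ids that legitimately repeat, and it loops over every employee rather than only index 0.

```python
from itertools import zip_longest

import pandas as pd

# Sample input from the question; json_file is the name used there.
json_file = {
    "data": [
        {
            "Empid": "1234",
            "Empname": "ABC",
            "data1": {
                "XXXX": [{"relative": {"id": "Naveen"}}],
                "YYYYY": [
                    {"relative": {"id": "Kumar"}},
                    {"relative": {"id": "Rajesh"}},
                ],
            },
        }
    ]
}

rows = []
for emp in json_file["data"]:
    keys = list(emp["data1"])
    # One list of ids per key, e.g. [["Naveen"], ["Kumar", "Rajesh"]]
    id_lists = [
        [item["relative"]["id"] for item in emp["data1"][key]] for key in keys
    ]
    # zip_longest yields one tuple per output row, padding short lists with None
    for ids in zip_longest(*id_lists):
        row = {"Empid": emp["Empid"], "Empname": emp["Empname"]}
        row.update(zip(keys, ids))
        rows.append(row)

result = pd.DataFrame(rows)
print(result)
```

The padded cells hold missing values, so pd.isna can be used to detect them or fillna to replace them.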