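Unable to generate specific groups and subgroups in JSON using Python
I am trying to generate nested JSON with Python pandas, but I can't figure out how sub-grouping works or how to produce it. I'm not sure how to pack the subgroups first and then the outer groups.
Is there a built-in function in Python or pandas, or any related Python package, that can do this without writing a lot of code?
Here is what I have written: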
j = (df.groupby(['empno', 'work_id'], as_index=False)
       .apply(lambda x: x[['status_id', 'type', 'languageId', 'language',
                           'email', 'game_name', 'experience_level', 'CellNo'
                           ]].to_dict('records'))
       .reset_index()
       .rename(columns={0: 'workPostDetails'})
       .to_json(orient='records'))
print("JSON::")
print(j)
Sample data:
empno  work_id  status_id  type   languageId  Language  send_by        recived_by     game_name  experience_level
-----  -------  ---------  -----  ----------  --------  -------------  -------------  ---------  ----------------
0017   X123     2101       email  1           All       a@abc.com      b@xyz.com      C++        Expert
0017   X123     2103       phone  1           All       +1 9282828282  +1 9383838383  A++        Intermediate
Expected JSON:
{
  "empno": "0017",
  "work_id": "X123",
  "workPostDetails": {
    "workDetails": [
      {
        "status_id": "2101",
        "type": "email",
        "languageId": "1",
        "language": "All-Read-Write",
        "send_by": {
          "email": "a@abc.com"
        },
        "recived_by": [
          {
            "email": "b@xyz.com"
          }
        ],
        "skillDetails": [
          {
            "game_name": "EA Sports",
            "experience_level": "Expert"
          }
        ]
      },
      {
        "status_id": "2103",
        "type": "sms",
        "languageId": "2",
        "language": "All-Read",
        "send_by": {
          "CellNo": "+1 9282828282"
        },
        "recived_by": [
          {
            "CellNo": "+1 9383838383"
          }
        ],
        "skillDetails": [
          {
            "game_name": "Candy Crush",
            "experience_level": "Intermediate"
          }
        ]
      }
    ]
  }
}
Answer 0 (score: 0)
You can first prepare the DataFrame by reshaping its columns, and then iterate over the GroupBy object. That gives:
import pandas

# act on a copy to preserve the original data
df2 = df.copy()
# prepare columns: wrap each contact value in a dict keyed by its kind
df2[['send_by', 'recived_by']] = df[['send_by', 'recived_by']].apply(
    lambda x: x.apply(lambda y: {'email': y} if '@' in y else {'CellNo': y}))
# collect the two skill columns into one dict per row
df2['skillDetails'] = df2.apply(lambda x: {k: x[k]
                                           for k in ('game_name', 'experience_level')}, axis=1)
df2.drop(columns=['game_name', 'experience_level'], inplace=True)
# generate the JSON string: one record per (empno, work_id) group
j = pandas.DataFrame(((name[0], name[1],
                       val.loc(axis=1)['status_id':].to_dict('records'))
                      for name, val in df2.groupby(['empno', 'work_id'])),
                     columns=['empno', 'work_id', 'workPostDetail']).to_json(orient='records')
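To try this on the sample rows from the question, here is a minimal self-contained sketch. It assumes every column is read as a string (so the leading zeros in empno survive), uses a hypothetical helper named to_contact in place of the nested lambda, and serializes with json.dumps rather than DataFrame.to_json so the output can be pretty-printed. Treat it as one possible way to run the idea, not as the only one.

```python
import json

import pandas as pd

# Sample rows from the question; all values kept as strings (assumption).
df = pd.DataFrame({
    "empno": ["0017", "0017"],
    "work_id": ["X123", "X123"],
    "status_id": ["2101", "2103"],
    "type": ["email", "phone"],
    "languageId": ["1", "1"],
    "Language": ["All", "All"],
    "send_by": ["a@abc.com", "+1 9282828282"],
    "recived_by": ["b@xyz.com", "+1 9383838383"],
    "game_name": ["C++", "A++"],
    "experience_level": ["Expert", "Intermediate"],
})


def to_contact(value):
    """Hypothetical helper: tag a contact string as an email or a cell number."""
    return {"email": value} if "@" in value else {"CellNo": value}


df2 = df.copy()
for col in ("send_by", "recived_by"):
    df2[col] = df2[col].map(to_contact)

# Fold the two skill columns into one dict per row, then drop the originals.
skill_cols = ["game_name", "experience_level"]
df2["skillDetails"] = df2[skill_cols].to_dict("records")
df2 = df2.drop(columns=skill_cols)

# One output record per (empno, work_id) group; every column from
# status_id onward becomes part of the nested list.
records = [
    {
        "empno": empno,
        "work_id": work_id,
        "workPostDetail": group.loc[:, "status_id":].to_dict("records"),
    }
    for (empno, work_id), group in df2.groupby(["empno", "work_id"])
]
j = json.dumps(records, indent=2)
print(j)
```

Because both sample rows share the same empno and work_id, the result holds a single record whose workPostDetail list has two entries.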
Answer 1 (score: 0)
Here is the final version:
import pandas

df2 = df.copy()
# prepare columns: wrap each contact value in a dict keyed by its kind
df2[['send_by', 'recived_by']] = df[['send_by', 'recived_by']].apply(
    lambda x: x.apply(lambda y: {'email': y} if '@' in y else {'CellNo': y}))
# collect the two skill columns into one dict per row
df2['skillDetails'] = df2.apply(lambda x: {k: x[k]
                                           for k in ('game_name', 'experience_level')}, axis=1)
df2.drop(columns=['game_name', 'experience_level'], inplace=True)
# collect all remaining detail columns into a single workDetails dict per row
df2['workDetails'] = df2.apply(lambda x1: {k1: x1[k1]
                                           for k1 in
                                           ('status_id', 'type', 'languageId', 'Language',
                                            'send_by', 'recived_by', 'skillDetails')}, axis=1)
df2.drop(columns=['status_id', 'type', 'languageId', 'Language',
                  'send_by', 'recived_by', 'skillDetails'], inplace=True)
# generate the JSON string: one record per (empno, work_id) group
j = pandas.DataFrame(((name[0], name[1],
                       val.loc(axis=1)['workDetails':].to_dict('records'))
                      for name, val in df2.groupby(['empno', 'work_id'])),
                     columns=['empno', 'work_id', 'workPostDetail']).to_json(orient='records')
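If the rows are available as plain dictionaries (for example from df.to_dict('records')), the same nesting can also be sketched with the standard library alone. The example below is an illustration under assumptions rather than part of the version above: the helpers to_contact and work_detail are hypothetical names, the rows are typed in by hand from the sample data, and the output shape follows the expected JSON in the question (recived_by and skillDetails as one-element lists, workDetails as a list under workPostDetails).

```python
import json
from itertools import groupby
from operator import itemgetter

# Rows typed in from the sample data; all values are strings (assumption).
rows = [
    {"empno": "0017", "work_id": "X123", "status_id": "2101", "type": "email",
     "languageId": "1", "Language": "All", "send_by": "a@abc.com",
     "recived_by": "b@xyz.com", "game_name": "C++",
     "experience_level": "Expert"},
    {"empno": "0017", "work_id": "X123", "status_id": "2103", "type": "phone",
     "languageId": "1", "Language": "All", "send_by": "+1 9282828282",
     "recived_by": "+1 9383838383", "game_name": "A++",
     "experience_level": "Intermediate"},
]


def to_contact(value):
    """Hypothetical helper: tag a contact string as an email or a cell number."""
    return {"email": value} if "@" in value else {"CellNo": value}


def work_detail(row):
    """Hypothetical helper: build one workDetails entry from a flat row."""
    return {
        "status_id": row["status_id"],
        "type": row["type"],
        "languageId": row["languageId"],
        "language": row["Language"],
        "send_by": to_contact(row["send_by"]),
        "recived_by": [to_contact(row["recived_by"])],
        "skillDetails": [{
            "game_name": row["game_name"],
            "experience_level": row["experience_level"],
        }],
    }


# itertools.groupby only merges adjacent items, so sort by the key first.
key = itemgetter("empno", "work_id")
result = [
    {
        "empno": empno,
        "work_id": work_id,
        "workPostDetails": {"workDetails": [work_detail(r) for r in group]},
    }
    for (empno, work_id), group in groupby(sorted(rows, key=key), key=key)
]
print(json.dumps(result, indent=2))
```

Writing the nesting out as an explicit dictionary literal keeps the target shape visible in one place, which can be easier to adjust than a chain of column operations when the expected JSON changes.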
Thanks, everyone!