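I am working with a standard DataFrame and using nested arrays to create several subset DataFrames of summary data. I then need to combine the subset DataFrames in a way that gives me the expected JSON output. (I used MaxU's answer to format most of the code; Convert Pandas Dataframe to nested JSON)
The first few rows of my standard DataFrame, df (I can provide all 58 rows of this example if necessary):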
     ID    PRI_AFF  PRI_DEP   LOA     STATE
0    5571  M                  Basic   A
1    5030  T        14700000  Blue    A
2    5030  T        14700000  Blue    A
3    5030  T        14700000  Blue    A
4    4014  T        14700000  Blue    A
5    2230  T        14700000  UFM     A
6    2230  T        14700000  UFM     A
7    2150  F        95011000  Bronze  A
8    2150  F        95011000  Bronze  A
9    2150  F        95011000  Bronze  A
10   2150  F        95011000  Bronze  A
From here, I use the following Python:
import pandas as pd

# Count unique IDs per PRI_DEP, with one column per category value (0 where a value never occurs)
PAFF_df = pd.DataFrame(df.groupby(['PRI_DEP','PRI_AFF'])['ID'].nunique().unstack().reset_index().fillna(0))
LOA_df = pd.DataFrame(df.groupby(['PRI_DEP','LOA'])['ID'].nunique().unstack().reset_index().fillna(0))
ST_df = pd.DataFrame(df.groupby(['PRI_DEP','STATE'])['ID'].nunique().unstack().reset_index().fillna(0))
# Nest each department's counts as a one-item list of dicts.
# 'records' is spelled out because recent pandas versions reject the 'r' shorthand.
Nested_PAFF_df = (PAFF_df.groupby(['PRI_DEP'], as_index=True)
                  .apply(lambda x: x[['A','E','F','L','M','T']].to_dict('records'))
                  .reset_index()
                  .rename(columns={0:'Primary_Affiliation'}))
Nested_LOA_df = (LOA_df.groupby(['PRI_DEP'], as_index=True)
                 .apply(lambda x: x[['Basic','Blue','Bronze','Invalid','UFM']].to_dict('records'))
                 .reset_index()
                 .rename(columns={0:'LOA'}))
Nested_ST_df = (ST_df.groupby(['PRI_DEP'], as_index=True)
                .apply(lambda x: x[['A','E']].to_dict('records'))
                .reset_index()
                .rename(columns={0:'STATE'}))
Calling .to_json(orient='records') on each of these frames gives me the proper nested JSON.
Primary Affiliation JSON:
[{"PRI_DEP":" ","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"14700000","Primary_Affiliation":[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}]},{"PRI_DEP":"95011000","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"Null","Primary_Affiliation":[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"ST010000","Primary_Affiliation":[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}]}]
LOA JSON:
[{"PRI_DEP":" ","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}]},{"PRI_DEP":"14700000","LOA":[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}]},{"PRI_DEP":"95011000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"Null","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"ST010000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}]}]
STATE JSON:
[{"PRI_DEP":" ","STATE":[{"A":2.0,"E":0.0}]},{"PRI_DEP":"14700000","STATE":[{"A":23.0,"E":1.0}]},{"PRI_DEP":"95011000","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"Null","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"ST010000","STATE":[{"A":2.0,"E":0.0}]}]
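For readers who want to see the intermediate table behind these outputs, here is a minimal sketch of the count-and-unstack step on its own. The rows in `sample` are hypothetical stand-ins, not the real 58-row data, and `unique_ids` and `wide` are names introduced only for this illustration.

```python
import pandas as pd

# Hypothetical rows in the same shape as df (not the real data).
sample = pd.DataFrame({
    "ID":      [5030, 5030, 4014, 2150],
    "PRI_AFF": ["T", "T", "T", "F"],
    "PRI_DEP": ["14700000", "14700000", "14700000", "95011000"],
})

# Unique IDs per (PRI_DEP, PRI_AFF) pair; repeated IDs are counted once.
unique_ids = sample.groupby(["PRI_DEP", "PRI_AFF"])["ID"].nunique()

# Pivot PRI_AFF values into columns; pairs that never occur become NaN, then 0.
wide = unique_ids.unstack().reset_index().fillna(0)
print(wide)
```

The fillna(0) step is why the counts appear as floats (2.0 rather than 2) in the JSON: the NaN placeholders introduced by unstack() force the count columns to a float dtype.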
Now I would like to combine all of these into a single JSON, keyed by PRI_DEP.
So the desired JSON would look like this (updated for easier reading):
[{"PRI_DEP":" ",
"Primary_Affiliation":
[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}],
"STATE":
[{"A":2.0,"E":0.0}]},
{"PRI_DEP":"14700000",
"Primary_Affiliation":
[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}],
"LOA":
[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}],
"STATE":
[{"A":23.0,"E":1.0}]},
{"PRI_DEP":"95011000",
"Primary_Affiliation":
[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
"STATE":
[{"A":1.0,"E":0.0}]},
{"PRI_DEP":"Null",
"Primary_Affiliation":
[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
"STATE":
[{"A":1.0,"E":0.0}]},
{"PRI_DEP":"ST010000",
"Primary_Affiliation":
[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}],
"STATE":
[{"A":2.0,"E":0.0}]}]
Answer 0 (score: 0)
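I kept experimenting with different ways of combining the DataFrames, and I think I have found the answer.
After the Python code in my original post (which sets up the nested groups), I did the following: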
Group_frames = [Nested_PAFF_df.set_index('PRI_DEP'), Nested_LOA_df.set_index('PRI_DEP'), Nested_ST_df.set_index('PRI_DEP')]
result = pd.concat(Group_frames, axis=1).reset_index()
print(result.to_json(orient='records'))
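The approach above can be tried end to end with the sketch below. It is an illustration under assumptions, not the exact code from the question: the sample rows are hypothetical, and `nested_counts` is a helper introduced here to avoid repeating the groupby logic three times. The combining step is the same pd.concat(..., axis=1) on frames indexed by PRI_DEP.

```python
import json
import pandas as pd

# Hypothetical sample in the shape of the question's frame (not the real 58 rows).
df = pd.DataFrame({
    "ID":      [5571, 5030, 5030, 4014, 2230, 2150],
    "PRI_AFF": ["M", "T", "T", "T", "T", "F"],
    "PRI_DEP": [" ", "14700000", "14700000", "14700000", "14700000", "95011000"],
    "LOA":     ["Basic", "Blue", "Blue", "Blue", "UFM", "Bronze"],
    "STATE":   ["A", "A", "A", "A", "A", "A"],
})

def nested_counts(frame, column, label):
    """Hypothetical helper: unique-ID counts per PRI_DEP, each nested as [dict]."""
    counts = frame.groupby(["PRI_DEP", column])["ID"].nunique().unstack().fillna(0)
    # One single-item list per department, matching the shape of the question's JSON.
    nested = [[row] for row in counts.to_dict("records")]
    return pd.Series(nested, index=counts.index, name=label)

frames = [
    nested_counts(df, "PRI_AFF", "Primary_Affiliation"),
    nested_counts(df, "LOA", "LOA"),
    nested_counts(df, "STATE", "STATE"),
]

# All three series share the PRI_DEP index, so axis=1 lines them up side by side.
result = pd.concat(frames, axis=1).rename_axis("PRI_DEP").reset_index()
combined_json = result.to_json(orient="records")
print(combined_json)
```

Aligning on the index is what makes this safe: a PRI_DEP value missing from one of the frames would produce a null in that column rather than shifting the rows out of step.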