Create a list of dictionaries from a pandas GroupBy frame: Python

Time: 2019-01-09 11:14:39

Tags: python pandas

I have a DataFrame like the following:

Year    Month   S_ID    Channel_Name    Interaction Doc_ID  Feed_ID
2018    2       67      WhiteCoats      1           152     5776
2018    2       67      WhiteCoats      4           152     5776
2018    2       67      WhiteCoats      4           152     6046
2018    2       67      Beats           4           152     6117
2018    2       84      Beats           4           27261   6286
2018    2       84      Beats           1           9887    6286

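I have grouped the data by the columns Year, Month, S_ID, Channel_Name, and Interaction.

Code:

df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])

But I want a new column that holds a list of dictionaries built from Doc_ID and Feed_ID.

The resulting frame should look like this: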

Year    Month   S_ID    Channel_Name    Interaction Dictionary
2018    2       67      WhiteCoats      1           [{'Doc_id':152,'Feed_ID':5776}]
2018    2       67      WhiteCoats      4           [{'Doc_id':152,'Feed_ID':5776}]
2018    2       67      Beats           4           [{'Doc_id':152,'Feed_ID':6117}]
2018    2       84      Beats           4           [{'Doc_id':27261,'Feed_ID':6286},{'Doc_id':9887,'Feed_ID':6286}]

So far, I can only create a list of lists:

df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])[['Doc_ID','Feed_ID']].apply(lambda x: x.values.tolist())

But how do I create a list of dictionaries?

1 answer:

Answer 0: (score: 1)

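Convert each group's rows to dictionaries by calling to_dict inside a lambda function:

# select the value columns with a list, then emit one {column: value} dict per row
df1 = (df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])[['Doc_ID','Feed_ID']]
         .apply(lambda x: x.to_dict('records'))
         .reset_index(name='Dictionary'))
print(df1)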

   Year  Month  S_ID Channel_Name  Interaction  \
0  2018      2    67        Beats            4   
1  2018      2    67   WhiteCoats            1   
2  2018      2    67   WhiteCoats            4   
3  2018      2    84        Beats            1   
4  2018      2    84        Beats            4   

                                          Dictionary  
0                 [{'Doc_ID': 152, 'Feed_ID': 6117}]  
1                 [{'Doc_ID': 152, 'Feed_ID': 5776}]  
2  [{'Doc_ID': 152, 'Feed_ID': 5776}, {'Doc_ID': ...  
3                [{'Doc_ID': 9887, 'Feed_ID': 6286}]  
4               [{'Doc_ID': 27261, 'Feed_ID': 6286}]
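
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question and applies the to_dict approach from this answer. The variable names keys, df1, and df2 are only illustrative. The second variant (building one dict per row first, then collecting them with agg(list)) is not part of the original answer; it is shown as one possible alternative under the assumption that the row dictionaries can be stored in a temporary column.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Year': [2018] * 6,
    'Month': [2] * 6,
    'S_ID': [67, 67, 67, 67, 84, 84],
    'Channel_Name': ['WhiteCoats', 'WhiteCoats', 'WhiteCoats',
                     'Beats', 'Beats', 'Beats'],
    'Interaction': [1, 4, 4, 4, 4, 1],
    'Doc_ID': [152, 152, 152, 152, 27261, 9887],
    'Feed_ID': [5776, 5776, 6046, 6117, 6286, 6286],
})

# Grouping columns (illustrative name, not from the original post).
keys = ['Year', 'Month', 'S_ID', 'Channel_Name', 'Interaction']

# Approach from the answer: for each group, turn the selected rows
# into a list of {column: value} dictionaries.
df1 = (df.groupby(keys)[['Doc_ID', 'Feed_ID']]
         .apply(lambda g: g.to_dict('records'))
         .reset_index(name='Dictionary'))

# Possible alternative (an assumption, not from the answer): build one
# dict per row up front, then gather them into lists per group.
df2 = (df.assign(Dictionary=df[['Doc_ID', 'Feed_ID']].to_dict('records'))
         .groupby(keys)['Dictionary']
         .agg(list)
         .reset_index())

print(df1)
```

Both variants return one row per distinct key combination, so the two WhiteCoats rows with Interaction 4 end up as a single row whose Dictionary cell holds two dictionaries.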