I have a data frame like the following:
Year Month S_ID Channel_Name Interaction Doc_ID Feed_ID
2018 2 67 WhiteCoats 1 152 5776
2018 2 67 WhiteCoats 4 152 5776
2018 2 67 WhiteCoats 4 152 6046
2018 2 67 Beats 4 152 6117
2018 2 84 Beats 4 27261 6286
2018 2 84 Beats 1 9887 6286
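I have grouped the data by the columns Year, Month, S_ID, Channel_Name and Interaction.
Code:
df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])
But I want a new column that holds a list of dictionaries built from Doc_ID and Feed_ID.
The resulting frame should look something like this: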
Year Month S_ID Channel_Name Interaction Dictionary
2018 2 67 WhiteCoats 1 [{'Doc_id':152,'Feed_ID':5776}]
2018 2 67 WhiteCoats 4 [{'Doc_id':152,'Feed_ID':5776}]
2018 2 67 Beats 4 [{'Doc_id':152,'Feed_ID':6117}]
2018 2 84 Beats 4 [{'Doc_id':27261,'Feed_ID':6286},{'Doc_id':9887,'Feed_ID':6286}]
So far I can only create a list of lists:
df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])[['Doc_ID','Feed_ID']].apply(lambda x: x.values.tolist())
But how do I create a list of dictionaries?
Answer 0 (score: 1)
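Use to_dict inside the lambda function to convert the rows of each group into a list of dictionaries:
# select the two value columns with a list (double brackets), then
# orient='records' returns one dict per row of the group
df1 = (df.groupby(['Year','Month','S_ID','Channel_Name','Interaction'])[['Doc_ID','Feed_ID']]
         .apply(lambda x: x.to_dict('records'))
         .reset_index(name='Dictionary'))
print(df1)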
   Year  Month  S_ID Channel_Name  Interaction
0  2018      2    67        Beats            4
1  2018      2    67   WhiteCoats            1
2  2018      2    67   WhiteCoats            4
3  2018      2    84        Beats            1
4  2018      2    84        Beats            4
                                          Dictionary
0                 [{'Doc_ID': 152, 'Feed_ID': 6117}]
1                 [{'Doc_ID': 152, 'Feed_ID': 5776}]
2  [{'Doc_ID': 152, 'Feed_ID': 5776}, {'Doc_ID': ...
3                [{'Doc_ID': 9887, 'Feed_ID': 6286}]
4               [{'Doc_ID': 27261, 'Feed_ID': 6286}]
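For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand and applies the same groupby and to_dict('records') approach. The variable name keys is just a convenience introduced here, and the column dtypes are assumed to be plain integers and strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2018] * 6,
    "Month": [2] * 6,
    "S_ID": [67, 67, 67, 67, 84, 84],
    "Channel_Name": ["WhiteCoats", "WhiteCoats", "WhiteCoats",
                     "Beats", "Beats", "Beats"],
    "Interaction": [1, 4, 4, 4, 4, 1],
    "Doc_ID": [152, 152, 152, 152, 27261, 9887],
    "Feed_ID": [5776, 5776, 6046, 6117, 6286, 6286],
})

# Grouping columns (the name `keys` is only used in this sketch).
keys = ["Year", "Month", "S_ID", "Channel_Name", "Interaction"]

df1 = (
    df.groupby(keys)[["Doc_ID", "Feed_ID"]]
      # Each group is a small DataFrame; 'records' yields one dict per row.
      .apply(lambda g: g.to_dict("records"))
      # The result is a Series indexed by the keys; name it while resetting.
      .reset_index(name="Dictionary")
)

print(df1)

# Show the full list for the group whose cell is truncated in the printout.
print(df1.loc[2, "Dictionary"])
```

The group (2018, 2, 67, WhiteCoats, 4) ends up with two dictionaries because two source rows share those key values, which is why that cell is truncated when the frame is printed.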