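I want to group my data and summarize each group as a dictionary, with one column used as the keys and another as the values.
My data looks like this: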
import pandas as pd

df = pd.DataFrame([
{'channel': 'one', 'hour': 6, 'rating':7.2},
{'channel': 'one', 'hour': 7, 'rating':8.2},
{'channel': 'one', 'hour': 8, 'rating':4.2},
{'channel': 'two', 'hour': 6, 'rating':10.2},
{'channel': 'two', 'hour': 7, 'rating':1.2},
{'channel': 'two', 'hour': 8, 'rating':3.2},
])
I tried the following:
df.groupby('channel').agg({'hour':list, 'rating':list}).reset_index()
This gives me a list per column:
  channel       hour            rating
0     one  [6, 7, 8]   [7.2, 8.2, 4.2]
1     two  [6, 7, 8]  [10.2, 1.2, 3.2]
My goal is to get the following:
  channel            rating_by_hour
0     one   {6: 7.2, 7: 8.2, 8: 4.2}
1     two  {6: 10.2, 7: 1.2, 8: 3.2}
I tried this:
df.groupby('channel').agg({'rating_by_hour':{df['hour']:df['rating']}}).reset_index()
Naturally, I get an error saying that 'Series' objects are mutable (and therefore cannot be hashed).
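For context on that error: the expression `{df['hour']: df['rating']}` tries to use a whole Series as a dictionary key, and Python dictionary keys must be hashable. The sketch below reproduces the failure on two small standalone Series and shows that pairing the values with `zip` gives the intended hour-to-rating mapping. The names `hours`, `ratings`, `error_type` and `good` are illustrative only, and the exact wording of the exception may differ between pandas versions.

```python
import pandas as pd

# Two small Series standing in for one channel's 'hour' and 'rating' columns.
hours = pd.Series([6, 7, 8])
ratings = pd.Series([7.2, 8.2, 4.2])

# Using a Series as a dict key fails because a Series is not hashable.
try:
    bad = {hours: ratings}
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

# Pairing the values element by element yields the desired mapping.
good = dict(zip(hours, ratings))

print(error_type)
print(good)
```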
Answer 0: (score: 3)
Here is one way:
df[['hour','rating']].apply(tuple,1).groupby(df['channel']).apply(list).map(dict).reset_index()
Out[168]:
channel 0
0 one {8.0: 4.2, 6.0: 7.2, 7.0: 8.2}
1 two {8.0: 3.2, 6.0: 10.2, 7.0: 1.2}
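Note that the keys in the output above are floats (6.0 rather than 6): the row-wise `apply(tuple, 1)` passes each row as a Series, and a row mixing an integer `hour` with a float `rating` is upcast to float. A minimal sketch of the same pipeline that keeps integer keys, assuming it is acceptable to build the tuples with `zip` instead of a row-wise apply, could look like this. The column name `rating_by_hour` is taken from the question; `pairs` and `result` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame([
    {'channel': 'one', 'hour': 6, 'rating': 7.2},
    {'channel': 'one', 'hour': 7, 'rating': 8.2},
    {'channel': 'one', 'hour': 8, 'rating': 4.2},
    {'channel': 'two', 'hour': 6, 'rating': 10.2},
    {'channel': 'two', 'hour': 7, 'rating': 1.2},
    {'channel': 'two', 'hour': 8, 'rating': 3.2},
])

# Build (hour, rating) tuples column-wise so 'hour' is not upcast to float.
pairs = pd.Series(list(zip(df['hour'], df['rating'])), index=df.index)

# Collect the tuples per channel, turn each list into a dict, and name the column.
result = (
    pairs.groupby(df['channel'])
         .apply(list)
         .map(dict)
         .reset_index(name='rating_by_hour')
)

print(result)
```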
Answer 1: (score: 2)
Here is another:
df.groupby('channel').apply(lambda x: x.set_index('hour')['rating']
.to_dict()).reset_index()
channel 0
0 one {6: 7.2, 7: 8.2, 8: 4.2}
1 two {6: 10.2, 7: 1.2, 8: 3.2}
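Both answers leave the dictionary column named `0`. A possible variant of the second approach that produces the `rating_by_hour` column asked for in the question is sketched below. It selects only the `hour` and `rating` columns before `apply`, which is an assumption made here so that the grouping column is not passed to the function (recent pandas versions warn about that), and it builds each dict with `zip`. This is one way to do it, not necessarily what either answerer had in mind.

```python
import pandas as pd

df = pd.DataFrame([
    {'channel': 'one', 'hour': 6, 'rating': 7.2},
    {'channel': 'one', 'hour': 7, 'rating': 8.2},
    {'channel': 'one', 'hour': 8, 'rating': 4.2},
    {'channel': 'two', 'hour': 6, 'rating': 10.2},
    {'channel': 'two', 'hour': 7, 'rating': 1.2},
    {'channel': 'two', 'hour': 8, 'rating': 3.2},
])

# For each channel, map hour -> rating; reset_index(name=...) names the result column.
result = (
    df.groupby('channel')[['hour', 'rating']]
      .apply(lambda g: dict(zip(g['hour'], g['rating'])))
      .reset_index(name='rating_by_hour')
)

print(result)
```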