I have a dataframe like this:
import pandas as pd

df = pd.DataFrame({'country': ['usa','canada','usa','canada','mexico','usa'],
                   'color': ['silver','brown','brown','black','silver','black'],
                   'car': ['honda','honda','nissan','toyota','honda','toyota'],
                   'value': range(60,66)})
      car   color country  value
0   honda  silver     usa     60
1   honda   brown  canada     61
2  nissan   brown     usa     62
3  toyota   black  canada     63
4   honda  silver  mexico     64
5  toyota   black     usa     65
I can pivot by two indices like this:
df.pivot_table(index=['color','car'], columns='country', values='value')\
.rename_axis(None, axis=1).reset_index()
    color     car  canada  mexico   usa
0   black  toyota    63.0     NaN  65.0
1   brown   honda    61.0     NaN   NaN
2   brown  nissan     NaN     NaN  62.0
3  silver   honda     NaN    64.0  60.0
How can I get the same result using groupby?
Thanks for your help.
Answer 0 (score: 1)
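Group the dataframe by color, car, and country, then take the mean of the value column (pivot_table aggregates with the mean by default, so this is the matching aggregation). Then unstack to move country into the columns, and reset_index.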
new_df = df.groupby(['color', 'car', 'country']).value.mean().unstack().reset_index()
new_df.columns.name = None
    color     car  canada  mexico   usa
0   black  toyota    63.0     NaN  65.0
1   brown   honda    61.0     NaN   NaN
2   brown  nissan     NaN     NaN  62.0
3  silver   honda     NaN    64.0  60.0
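To check that the two approaches really agree, here is a minimal sketch that builds both frames from the question's data and compares them. The names pivoted and grouped are just illustrative, and passing 'country' to unstack explicitly is my own choice for readability (the plain unstack() above should behave the same, since country is the last index level).

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['usa', 'canada', 'usa', 'canada', 'mexico', 'usa'],
    'color': ['silver', 'brown', 'brown', 'black', 'silver', 'black'],
    'car': ['honda', 'honda', 'nissan', 'toyota', 'honda', 'toyota'],
    'value': range(60, 66),
})

# The pivot_table version from the question (default aggfunc is the mean).
pivoted = (
    df.pivot_table(index=['color', 'car'], columns='country', values='value')
      .rename_axis(None, axis=1)
      .reset_index()
)

# The groupby version: aggregate per (color, car, country),
# then move the country level from the index into the columns.
grouped = (
    df.groupby(['color', 'car', 'country'])['value']
      .mean()
      .unstack('country')
      .rename_axis(None, axis=1)
      .reset_index()
)

# Raises an AssertionError if the two frames differ in values, dtypes, or labels.
pd.testing.assert_frame_equal(pivoted, grouped)
```

If your data has several rows per (color, car, country) combination, both versions average them; swap mean() for another aggregation (and set aggfunc in pivot_table accordingly) if you need something else.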