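I am trying to group a DataFrame by 'Name' and 'Site', and I want to create 4 new columns holding the sum, the number of rows in each group, the mean, and the standard deviation of the 'Spend' column.
Here is my code so far: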
import pandas as pd
df = pd.DataFrame({
    'Name': ['Harry', 'John', 'Holly', 'John', 'John', 'John', 'Holly', 'Holly', 'Molly', 'Molly', 'Holly', 'Harry', 'Harry', 'Harry'],
    'Spend': [76, 43, 23, 43, 234, 54, 34, 12, 43, 54, 65, 23, 12, 32],
    'Site': ['Amazon', 'Ikea', 'Apple', 'Amazon', 'Apple', 'Ikea', 'Apple', 'Apple', 'Amazon', 'Amazon', 'Ikea', 'Amazon', 'Amazon', 'Ikea'],
})
print(df)
My DataFrame currently looks like this:
I would like it to look like this:
How would I go about doing this?
Thanks in advance.
Edit 10/11/18:
Code:
import pandas as pd
df = pd.DataFrame({
    'Name': ['Harry', 'John', 'Holly', 'John', 'John', 'John', 'Holly', 'Holly', 'Molly', 'Molly', 'Holly', 'Harry', 'Harry', 'Harry'],
    'Spend': [76, 43, 23, 43, 234, 54, 34, 12, 43, 54, 65, 23, 12, 32],
    'Site': ['Amazon', 'Ikea', 'Apple', 'Amazon', 'Apple', 'Ikea', 'Apple', 'Apple', 'Amazon', 'Amazon', 'Ikea', 'Amazon', 'Amazon', 'Ikea'],
    'Spend2': [176, 143, 123, 143, 1234, 154, 134, 112, 143, 254, 365, 423, 512, 632],
})
print(df)
Before:
After:
Answer 0 (score: 3)
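# Aggregate only 'Spend'; string function names need no NumPy import
df_summary = df.groupby(['Name', 'Site'])['Spend'].agg(['sum', 'count', 'mean', 'std'])
# Flatten the result to the four requested column names
df_summary.columns = ['Sum', 'Count Groupbys', 'Average', 'Standard Deviation']
df_summary = df_summary.reset_index().sort_values(['Site', 'Name'])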
>>> df_summary
    Name    Site  Sum  Count Groupbys  Average  Standard Deviation
0  Harry  Amazon  111               3     37.0           34.219877
4   John  Amazon   43               1     43.0                 NaN
7  Molly  Amazon   97               2     48.5            7.778175
2  Holly   Apple   69               3     23.0           11.000000
5   John   Apple  234               1    234.0                 NaN
1  Harry    Ikea   32               1     32.0                 NaN
3  Holly    Ikea   65               1     65.0                 NaN
6   John    Ikea   97               2     48.5            7.778175
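The NaN entries in the standard deviation column appear for groups that contain a single row: pandas computes the sample standard deviation (ddof=1) by default, which is undefined for one observation. The sketch below illustrates this on a small made-up frame (the three rows are a hypothetical subset chosen for illustration) and shows that passing ddof=0 gives the population standard deviation instead. Whether that is appropriate depends on what you want the statistic to mean.

```python
import pandas as pd

# A single observation has no sample standard deviation (ddof=1 is the default)
single = pd.Series([43])
print(single.std())        # NaN
print(single.std(ddof=0))  # 0.0

# Hypothetical three-row subset: John has one Amazon row, Molly has two
small = pd.DataFrame({
    'Name': ['John', 'Molly', 'Molly'],
    'Site': ['Amazon', 'Amazon', 'Amazon'],
    'Spend': [43, 43, 54],
})

grouped = small.groupby(['Name', 'Site'])['Spend']
std_sample = grouped.std()      # NaN for the single-row group
std_pop = grouped.std(ddof=0)   # population std: 0.0 for the single-row group
print(std_sample)
print(std_pop)
```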
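Based on your edit, you can use agg by passing a dictionary whose keys are column names and whose values are the functions to apply to those columns: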
# Keys are input columns; values are the functions applied to each
df_summary = df.groupby(['Name', 'Site']).agg(
    {'Spend': ['sum', 'count'],
     'Spend2': ['mean', 'std']}
)
# Flatten the two-level column index into single names
df_summary.columns = ['Sum_Spend', 'CountGroupbys_Spend', 'Average_Spend2', 'Standard_Deviation_Spend2']
df_summary = df_summary.reset_index().sort_values(['Site', 'Name'])
>>> df_summary
    Name    Site  Sum_Spend  CountGroupbys_Spend  Average_Spend2  Standard_Deviation_Spend2
0  Harry  Amazon        111                    3      370.333333                 174.081399
4   John  Amazon         43                    1      143.000000                        NaN
7  Molly  Amazon         97                    2      198.500000                  78.488853
2  Holly   Apple         69                    3      123.000000                  11.000000
5   John   Apple        234                    1     1234.000000                        NaN
1  Harry    Ikea         32                    1      632.000000                        NaN
3  Holly    Ikea         65                    1      365.000000                        NaN
6   John    Ikea         97                    2      148.500000                   7.778175
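As an alternative to the dictionary form, pandas 0.25 and later support named aggregation, which sets the output column names inside the agg call so the manual rename is not needed. This is a minimal sketch, assuming a pandas version of at least 0.25 and reusing the column names from above. It should produce the same table.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Harry', 'John', 'Holly', 'John', 'John', 'John', 'Holly',
             'Holly', 'Molly', 'Molly', 'Holly', 'Harry', 'Harry', 'Harry'],
    'Spend': [76, 43, 23, 43, 234, 54, 34, 12, 43, 54, 65, 23, 12, 32],
    'Site': ['Amazon', 'Ikea', 'Apple', 'Amazon', 'Apple', 'Ikea', 'Apple',
             'Apple', 'Amazon', 'Amazon', 'Ikea', 'Amazon', 'Amazon', 'Ikea'],
    'Spend2': [176, 143, 123, 143, 1234, 154, 134, 112, 143, 254, 365,
               423, 512, 632],
})

# Named aggregation: output_column=(input_column, function_name)
df_summary = (
    df.groupby(['Name', 'Site'])
      .agg(
          Sum_Spend=('Spend', 'sum'),
          CountGroupbys_Spend=('Spend', 'count'),
          Average_Spend2=('Spend2', 'mean'),
          Standard_Deviation_Spend2=('Spend2', 'std'),
      )
      .reset_index()
      .sort_values(['Site', 'Name'])
)
print(df_summary)
```

Because each output name is paired with its source column and function, there is no two-level column index to flatten afterwards.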