Applying groupby on a DataFrame to show aggregate statistics

Date: 2017-08-24 19:36:43

Tags: python pandas dataframe group-by pandas-groupby

Suppose I have a DataFrame that looks like this:

Bank Name     House     This Wk
Barc          Germany   100
Barc          UK        300
Barc          UK        500
JPM           Japan     200
JPM           NYC       100
BOA           LA        900
BOA           LA        50
BOA           LA        50
DB            Italy     45
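
To reproduce this, the table above can be built as a DataFrame roughly as follows. This is a minimal sketch: the default integer index (0 to 8) is an assumption, since the question does not show one.

```python
import pandas as pd

# Sample data copied from the table above; the 0..8 RangeIndex is assumed.
df = pd.DataFrame({
    'Bank Name': ['Barc', 'Barc', 'Barc', 'JPM', 'JPM', 'BOA', 'BOA', 'BOA', 'DB'],
    'House': ['Germany', 'UK', 'UK', 'Japan', 'NYC', 'LA', 'LA', 'LA', 'Italy'],
    'This Wk': [100, 300, 500, 200, 100, 900, 50, 50, 45],
})
print(df)
```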

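I want to group by Bank Name and, for each bank, output the House with the largest value together with the bank's total.

For example, the data above would produce: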

Bank Name     Total     House     This Wk
Barc          900       UK        500
JPM           300       Japan     200
BOA           1000      LA        900
DB            45        Italy     45

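Basically, Total is the sum of This Wk per Bank Name, but the result should also show the House that contributed the most to that total, with the amount it contributed in This Wk.

How can I do this?

3 Answers:

Answer 0: (score: 5)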

In [121]: df.groupby('Bank Name', group_keys=False) \
     ...:   .apply(lambda x: x.nlargest(1, 'This Wk').assign(Total=x['This Wk'].sum())) \
     ...:   [['Bank Name','Total','House','This Wk']]
     ...:
Out[121]:
  Bank Name  Total  House  This Wk
5       BOA   1000     LA      900
2      Barc    900     UK      500
8        DB     45  Italy       45
3       JPM    300  Japan      200
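
A variant of the same idea that avoids apply, offered only as a sketch and not as part of the answer above: compute the group total with transform('sum'), then keep the row that idxmax picks for each bank. The DataFrame is rebuilt here so the snippet runs on its own; the names total and top_rows are made up for illustration.

```python
import pandas as pd

# Sample data from the question (default 0..8 index assumed).
df = pd.DataFrame({
    'Bank Name': ['Barc', 'Barc', 'Barc', 'JPM', 'JPM', 'BOA', 'BOA', 'BOA', 'DB'],
    'House': ['Germany', 'UK', 'UK', 'Japan', 'NYC', 'LA', 'LA', 'LA', 'Italy'],
    'This Wk': [100, 300, 500, 200, 100, 900, 50, 50, 45],
})

# Group total repeated on every row of the group (same length as df).
total = df.groupby('Bank Name')['This Wk'].transform('sum')

# Row label of the largest 'This Wk' in each bank (the first one on ties).
top_rows = df.groupby('Bank Name')['This Wk'].idxmax()

# Attach the totals, then keep only the top row of each bank.
result = df.assign(Total=total).loc[top_rows, ['Bank Name', 'Total', 'House', 'This Wk']]
print(result)
```

As in the answer above, the result keeps the original row labels and comes out ordered by the sorted group keys.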

Answer 1: (score: 3)

You can use df.groupby and pass a list of functions to dfGroupBy.agg:

In [732]: out = df.groupby('Bank Name')['This Wk'].agg(['sum', 'idxmax', 'max'])\
               .rename(columns={'sum' : 'Total', 'idxmax' : 'House', 'max' : 'This Wk'})\
               .reset_index()


In [734]: out['House'] = df.loc[out['House'], 'House'].values; out
Out[734]: 
  Bank Name  Total  House  This Wk
0       BOA   1000     LA      900
1      Barc    900     UK      500
2        DB     45  Italy       45
3       JPM    300  Japan      200
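
The second step relies on two details that are easy to miss, sketched here with illustrative names (stats, looked_up): the idxmax column holds row labels of df, not House names, and the lookup result is indexed by those labels, so it is assigned by position and not by index alignment. to_numpy() is used as the newer equivalent of .values.

```python
import pandas as pd

# Sample data from the question (default 0..8 index assumed).
df = pd.DataFrame({
    'Bank Name': ['Barc', 'Barc', 'Barc', 'JPM', 'JPM', 'BOA', 'BOA', 'BOA', 'DB'],
    'House': ['Germany', 'UK', 'UK', 'Japan', 'NYC', 'LA', 'LA', 'LA', 'Italy'],
    'This Wk': [100, 300, 500, 200, 100, 900, 50, 50, 45],
})

# One row per bank; 'idxmax' is the label of the row with the largest value.
stats = df.groupby('Bank Name')['This Wk'].agg(['sum', 'idxmax', 'max']).reset_index()

# Looking those labels up in df returns a Series indexed by the labels
# themselves, not by 0..3 like stats.
looked_up = df.loc[stats['idxmax'], 'House']

# Assigning the Series directly would align on index labels; converting to
# an array assigns by position instead.
stats['House'] = looked_up.to_numpy()
print(stats)
```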

Answer 2: (score: 0)

Another way, using apply:

In [17]: (df.groupby('Bank Name', sort=False)
            .apply(lambda x: pd.Series(
                             [x['This Wk'].sum(), 
                              x.loc[x['This Wk'].idxmax(), 'House'], 
                              x['This Wk'].max()], 
                   index=['Total', 'House', 'This Wk']))
            .reset_index())
Out[17]:
  Bank Name  Total  House  This Wk
0      Barc    900     UK      500
1       JPM    300  Japan      200
2       BOA   1000     LA      900
3        DB     45  Italy       45