How to find the percentage of a groupby total in pandas

Time: 2019-03-01 08:14:08

Tags: python pandas

I have the following dataframe in pandas:

  Date        tank     hose     quantity     count      set     flow
  01-01-2018  1        1        20           100        211     12.32
  01-01-2018  1        2        20           200        111     22.32
  01-01-2018  1        3        20           200        123     42.32
  02-01-2018  1        1        10           100        211     12.32
  02-01-2018  1        2        10           200        111     22.32
  02-01-2018  1        3        10           200        123     42.32

I want to express quantity and count as percentages of their totals within each (Date, tank) group. The dataframe I want:

  Date        tank   hose   quantity   count   set   flow    perc_quant  perc_count
  01-01-2018  1        1    20         100     211   12.32   33.33       20
  01-01-2018  1        2    20         200     111   22.32   33.33       40
  01-01-2018  1        3    20         200     123   42.32   33.33       40
  02-01-2018  1        1    10         100     211   12.32   25          20
  02-01-2018  1        2    20         200     111   22.32   50          40
  02-01-2018  1        3    10         200     123   42.32   25          40

This is what I am trying in order to achieve this:

   test = df.groupby(['Date','tank']).apply(lambda x:
                                             100 * x / float(x.sum()))

1 Answer:

Answer 0: (score: 3)

Use GroupBy.transform with a lambda function, rename the resulting columns with add_prefix, and join them back to the original dataframe:

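# percentage of each value relative to its group's total
f = lambda x: 100 * x / float(x.sum())
# select several columns with a list (double brackets); tuple-style selection fails in recent pandas
df = df.join(df.groupby(['Date','tank'])[['quantity','count']].transform(f).add_prefix('perc_'))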

Or specify the new column names explicitly:

df[['perc_quantity','perc_count']] = (df.groupby(['Date','tank'])[['quantity','count']]
                                        .transform(f))

print (df)
         Date  tank  hose  quantity  count  set   flow  perc_quantity  \
0  01-01-2018     1     1        20    100  211  12.32      33.333333   
1  01-01-2018     1     2        20    200  111  22.32      33.333333   
2  01-01-2018     1     3        20    200  123  42.32      33.333333   
3  02-01-2018     1     1        10    100  211  12.32      33.333333   
4  02-01-2018     1     2        10    200  111  22.32      33.333333   
5  02-01-2018     1     3        10    200  123  42.32      33.333333   

   perc_count  
0        20.0  
1        40.0  
2        40.0  
3        20.0  
4        40.0  
5        40.0
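
For reference, here is a self-contained sketch that rebuilds the sample dataframe from the question and computes the same percentages. It is a variation on the answer above, not the answer's exact code: instead of a lambda, it divides the columns by the group totals obtained from transform("sum"), which should give the same numbers. The names cols, group_totals, perc and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["01-01-2018"] * 3 + ["02-01-2018"] * 3,
    "tank": [1] * 6,
    "hose": [1, 2, 3, 1, 2, 3],
    "quantity": [20, 20, 20, 10, 10, 10],
    "count": [100, 200, 200, 100, 200, 200],
    "set": [211, 111, 123, 211, 111, 123],
    "flow": [12.32, 22.32, 42.32, 12.32, 22.32, 42.32],
})

cols = ["quantity", "count"]  # columns to convert to percentages

# transform("sum") returns the group total repeated on every row,
# so the result has the same index and shape as df[cols]
group_totals = df.groupby(["Date", "tank"])[cols].transform("sum")

# Each value as a percentage of its (Date, tank) group total
perc = (100 * df[cols] / group_totals).add_prefix("perc_")

# Attach the new perc_quantity and perc_count columns to the original data
result = df.join(perc)
print(result)
```

Because transform keeps the original index, the division and the join line up row by row without any merge step.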