Get a weighted average, then group by, in pandas

Time: 2016-09-23 15:15:20

Tags: python pandas numpy

I have the following DataFrame.

      weight       x     value
0          5     -8.7        2
1          9     -8.7        3
2         12    -21.4       10
3         32    -21.4       15

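I need to compute the weighted average of value (weighted by weight), grouped by x. The result would be:

-8.7: ((5 / (5 + 9)) * 2) + ((9 / 14) * 3) = 2.64

-21.4: ((12 / 44) * 10) + ((32 / 44) * 15) = 13.63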

         x     weighted_value
0     -8.7               2.64
1    -21.4              13.63
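
For reference, the arithmetic above is the usual weighted mean applied within each group. A worked form is sketched below; the symbols w_i (weight of row i), v_i (value of row i), and g (a group of rows sharing the same x) are introduced here for illustration and do not appear in the original post.

```latex
\bar{v}_g = \frac{\sum_{i \in g} w_i \, v_i}{\sum_{i \in g} w_i},
\qquad
\bar{v}_{-8.7} = \frac{5 \cdot 2 + 9 \cdot 3}{5 + 9} = \frac{37}{14} \approx 2.643,
\qquad
\bar{v}_{-21.4} = \frac{12 \cdot 10 + 32 \cdot 15}{12 + 32} = \frac{600}{44} \approx 13.636
```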

2 answers:

Answer 0 (score: 1)

numpy.average accepts a weights parameter, so it can be applied to each group:

import io
import numpy as np
import pandas as pd

data = io.StringIO('''\
      weight       x     value
0          5     -8.7        2
1          9     -8.7        3
2         12    -21.4       10
3         32    -21.4       15
''')
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(data, sep=r'\s+')

df.groupby('x').apply(lambda g: np.average(g['value'], weights=g['weight']))

Output:

x
-21.4    13.636364
-8.7      2.642857
dtype: float64
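
If the per-group lambda becomes slow on a large frame, the same numbers can probably be obtained without apply by dividing two grouped sums. This is only a sketch of that idea, not part of the answer above; the variable names (weighted_sum, total_weight, weighted_value) are made up for illustration, and the frame is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question, rebuilt inline.
df = pd.DataFrame({
    "weight": [5, 9, 12, 32],
    "x": [-8.7, -8.7, -21.4, -21.4],
    "value": [2, 3, 10, 15],
})

# Numerator: sum of weight * value within each x group.
weighted_sum = (df["value"] * df["weight"]).groupby(df["x"]).sum()

# Denominator: sum of weights within each x group.
total_weight = df["weight"].groupby(df["x"]).sum()

# Both Series are indexed by x, so the division aligns group by group.
weighted_value = (
    (weighted_sum / total_weight)
    .rename("weighted_value")
    .reset_index()
)

print(weighted_value)
```

The result is a DataFrame with columns x and weighted_value, which is the shape the question asks for. Rows come out sorted by x because groupby sorts group keys by default.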

Answer 1 (score: 0)

Here is a vectorized approach using NumPy tools:

# Get weighted averages and corresponding unique x's
unq,ids = np.unique(df.x,return_inverse=True)
weight_avg = np.bincount(ids,df.weight*df.value)/np.bincount(ids,df.weight)

# Store into a dataframe
df_out = pd.DataFrame(np.column_stack((unq,weight_avg)),columns=['x','wghts'])

Sample run:

In [97]: df
Out[97]: 
   weight     x  value
0       5  -8.7      2
1       9  -8.7      3
2      12 -21.4     10
3      32 -21.4     15

In [98]: df_out
Out[98]: 
      x      wghts
0 -21.4  13.636364
1  -8.7   2.642857
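
As a sanity check, the bincount result can be compared against a plain per-group np.average. The snippet below is a rough sketch under the assumption that the frame matches the one in the question; the names vectorized, reference, and agree are illustrative only.

```python
import numpy as np
import pandas as pd

# Same data as in the question, rebuilt inline.
df = pd.DataFrame({
    "weight": [5, 9, 12, 32],
    "x": [-8.7, -8.7, -21.4, -21.4],
    "value": [2, 3, 10, 15],
})

# np.unique returns the sorted unique x values; ids maps each row to its group.
unq, ids = np.unique(df["x"].to_numpy(), return_inverse=True)

# Weighted bincount sums weight * value (numerator) and weight (denominator) per group.
numerator = np.bincount(ids, weights=(df["weight"] * df["value"]).to_numpy())
denominator = np.bincount(ids, weights=df["weight"].to_numpy())
vectorized = numerator / denominator

# Reference: one np.average call per unique x value.
reference = np.array([
    np.average(df.loc[df["x"] == key, "value"],
               weights=df.loc[df["x"] == key, "weight"])
    for key in unq
])

agree = np.allclose(vectorized, reference)
print(unq, vectorized, agree)
```

Because np.unique sorts its output, the groups appear in ascending order of x (-21.4 before -8.7), which matches the row order in df_out above.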