我有以下数据框。
weight x value
0 5 -8.7 2
1 9 -8.7 3
2 12 -21.4 10
3 32 -21.4 15
我需要得到该值的加权平均值并按x分组。结果将是:
-8.7:(5 /(5 + 9)* 2)+((9/14)* 3)= 2.64
-21.4:((12/44)* 10)+((32/44)* 15)= 13.63
x weighted_value
0 -8.7 2.64
1 -21.4 13.63
答案 0 :(得分:1)
numpy.average
允许weights
参数:
import io
import numpy as np
import pandas as pd
data = io.StringIO('''\
weight x value
0 5 -8.7 2
1 9 -8.7 3
2 12 -21.4 10
3 32 -21.4 15
''')
df = pd.read_csv(data, delim_whitespace=True)
df.groupby('x').apply(lambda g: np.average(g['value'], weights=g['weight']))
输出:
x
-21.4 13.636364
-8.7 2.642857
dtype: float64
答案 1 :(得分:0)
这是使用NumPy工具的矢量化方法 -
# Get weighted averages and corresponding unique x's
unq,ids = np.unique(df.x,return_inverse=True)
weight_avg = np.bincount(ids,df.weight*df.value)/np.bincount(ids,df.weight)
# Store into a dataframe
df_out = pd.DataFrame(np.column_stack((unq,weight_avg)),columns=['x','wghts'])
示例运行 -
In [97]: df
Out[97]:
weight x value
0 5 -8.7 2
1 9 -8.7 3
2 12 -21.4 10
3 32 -21.4 15
In [98]: df_out
Out[98]:
x wghts
0 -21.4 13.636364
1 -8.7 2.642857