我们说我有以下数据框:
State Type Denominator payment1 payment2 payment3
State1 A 40 1000 8000 1000
State1 A 50 5000 4000 5000
State1 B 70 6000 5000 600
State1 B 30 4000 7000 5000
State1 C 50 1000 5000 8000
State2 A 60 5000 7000 4000
State2 A 40 600 5000 5000
State2 B 50 400 6000 7000
State2 C 80 8000 4000 1000
State2 C 40 4000 8000 5000
State3 A 60 5000 4000 600
State3 A 50 7000 5000 400
State3 B 70 5000 7000 5000
State3 B 20 6000 4000 7000
State3 C 50 2000 7000 1000
我想计算按州和类型分组的3个付款列的加权平均值,使用分母作为权重。
在R上,我会成功地使用它:
health2=health%>%group_by(State, Type)%>%summarise(payment1mean=weighted.mean(payment1, Denominator),payment2mean=weighted.mean(payment2, Denominator),payment3mean=weighted.mean(payment3, Denominator))
我如何才能在Python上做同样的事情?
我尝试了类似下面的内容,但没有成功。
paycolumns=['payment1','payment2','payment3']
datagrouped=data.groupby(['State','Type'],as_index=False)[paycolumns].apply(lambda x: np.average(x, weights=data.Denominator)