I'm working with this CSV in pandas. The first ten rows:
yearmonth MV Quantiles pead Quantiles ret_f0f1 rf
0 198301 0.2 0.8 -0.071429 0.0069
1 198301 0.2 0.2 0.010638 0.0069
2 198301 0.2 0.2 0.180328 0.0069
3 198301 0.4 0.2 0.042553 0.0069
4 198301 0.4 0.8 0.000000 0.0069
5 198301 0.2 0.8 0.244889 0.0069
6 198301 0.6 0.8 0.071429 0.0069
7 198301 0.6 1.0 -0.025974 0.0069
8 198301 0.2 0.8 0.097222 0.0069
9 198301 1.0 1.0 0.191489 0.0069
I want to group by ['yearmonth', 'MV Quantiles', 'pead Quantiles'] and create a column 'excess_return' that holds the result of subtracting column 'rf' from column 'ret_f0f1':
yearmonth MV Quantiles ... rf excess_return
0 198301 0.2 ... 0.0069 -0.071429
1 198301 0.2 ... 0.0069 0.010638
2 198301 0.2 ... 0.0069 0.180328
3 198301 0.2 ... 0.0069 0.042553
4 198301 0.2 ... 0.0069 0.000000
5 198301 0.2 ... 0.0069 0.244889
6 198301 0.2 ... 0.0069 0.071429
7 198301 0.2 ... 0.0069 -0.025974
8 198301 0.2 ... 0.0069 0.097222
9 198301 0.2 ... 0.0069 0.191489
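I tried transform with a lambda, but I can't reference both columns inside it to do the subtraction. Is that because x is a Series rather than a DataFrame? Another idea was to use x.diff.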
df2.groupby(['yearmonth','MV Quantiles',"pead Quantiles"])['ret_f0f1','rf'].transform(lambda x: x['ret_f0f1']-x['rf'])
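For comparison, here is a minimal sketch of the subtraction done without transform. It assumes the excess return is simply ret_f0f1 minus rf on each row, in which case no grouping is needed for the subtraction itself and the groupby only matters for a later per-group aggregate. The small frame and the names `keys`, `group_means` and `group_mean_excess` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like the rows shown above
df2 = pd.DataFrame({
    "yearmonth": [198301, 198301, 198301, 198301],
    "MV Quantiles": [0.2, 0.2, 0.4, 0.4],
    "pead Quantiles": [0.8, 0.2, 0.2, 0.8],
    "ret_f0f1": [-0.071429, 0.010638, 0.042553, 0.000000],
    "rf": [0.0069, 0.0069, 0.0069, 0.0069],
})

keys = ["yearmonth", "MV Quantiles", "pead Quantiles"]

# Row-wise subtraction: pandas aligns the two columns on the index
df2["excess_return"] = df2["ret_f0f1"] - df2["rf"]

# Per-group mean of the excess return, one row per group
group_means = df2.groupby(keys)["excess_return"].mean().reset_index()

# Same per-group mean, broadcast back to every original row
df2["group_mean_excess"] = df2.groupby(keys)["excess_return"].transform("mean")
```

Selecting a single column before calling transform avoids the problem in the attempt above, since the function then only ever needs one Series.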
For now, I'm using:
(df2[df2['pead Quantiles']==1.0].groupby(['yearmonth','MV Quantiles','pead Quantiles'])['ret_f0f1'].mean().reset_index()['ret_f0f1']
 - df2[df2['pead Quantiles']==0.2].groupby(['yearmonth','MV Quantiles','pead Quantiles'])['ret_f0f1'].mean().reset_index()['ret_f0f1'])
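If the goal of the expression above is the mean return of the top pead quantile (1.0) minus that of the bottom one (0.2) within each yearmonth and MV quantile, one possible sketch is to unstack the group means so that the two quantiles become columns. The subtraction then lines up by label, not by position after reset_index, so a group that is missing one of the two quantiles gives NaN instead of shifting the rows. The sample data and the names `means` and `spread` are hypothetical.

```python
import pandas as pd

# Hypothetical sample: the MV quantile 0.4 group has no pead quantile 1.0 rows
df2 = pd.DataFrame({
    "yearmonth": [198301, 198301, 198301, 198301],
    "MV Quantiles": [0.2, 0.2, 0.2, 0.4],
    "pead Quantiles": [0.2, 0.2, 1.0, 0.2],
    "ret_f0f1": [0.01, 0.03, 0.05, 0.04],
})

keys = ["yearmonth", "MV Quantiles", "pead Quantiles"]

# Mean return per group, with the pead quantiles moved into columns
means = df2.groupby(keys)["ret_f0f1"].mean().unstack("pead Quantiles")

# Top minus bottom quantile, indexed by (yearmonth, MV Quantiles)
spread = means[1.0] - means[0.2]
```

Here `spread` is a Series indexed by (yearmonth, MV Quantiles); calling `spread.reset_index(name="spread")` would turn it back into a flat frame.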