我有以下数据集,其中值表示返回。
date SP500 MEUR
1/3/1973 0.0049 0.003078
1/4/1973 -0.001194 0.003365
1/5/1973 0.004882 0.004439
1/8/1973 -0.000198 0.000196
1/9/1973 0 -0.000295
1/10/1973 -0.001983 0.000884
1/11/1973 0.007949 -0.000785
1/12/1973 -0.007394 0.003634
我想计算上面每列的3天,10天,15天正向返回(这里只显示2个),然后将这些列中的每一列作为SP500.3day SP5500添加到上表中。 10天
有没有办法不做两个嵌套循环?输出应如下所示:
date SP500 MEUR SP500_3day MEUR_3day
1/3/1973 0.0049 0.003078 0.003483442 0.00801647
1/4/1973 -0.001194 0.003365 0.004683033 0.004339502
1/5/1973 0.004882 0.004439 -0.002180607 0.000784855
1/8/1973 -0.000198 0.000196 0.005950237 -0.000196723
1/9/1973 0 -0.000295 -0.001487759 0.003732663
1/10/1973 -0.001983 0.000884
1/11/1973 0.007949 -0.000785
1/12/1973 -0.007394 0.003634
例如,
1/3/1973: SP500.3day = ((1+ -0.001194)(1+0.004882)(1+-0.000198))-1
答案 0 :(得分:1)
这里的循环确实没什么不好(或慢)。它主要是迭代你想要避免的行。所以下面的代码利用了pandas矢量化操作,应该足够快。请注意,假设所有值都是退货,而不是价格。
np.random.seed(123)
df = pd.DataFrame({ 'sp500':(abs(np.random.randn(12)/10)),
'meur' :(abs(np.random.randn(12)/10)), },
index=pd.date_range('1-3-1973', periods=12, freq='B') )
for i in [3,10]:
for c in ['meur','sp500']:
df[c+'_'+str(i)] = df[c].rolling(i).apply( lambda x: (x+1).prod() - 1 ).shift(-i)
meur sp500 meur_3 sp500_3 meur_10 sp500_10
1973-01-03 0.149139 0.108563 0.159387 0.301195 1.659468 1.717294
1973-01-04 0.063890 0.099735 0.330156 0.251649 1.793686 1.494265
1973-01-05 0.044398 0.028298 0.552121 0.418219 NaN NaN
1973-01-08 0.043435 0.150629 0.636865 0.531661 NaN NaN
1973-01-09 0.220593 0.057860 0.392830 0.509988 NaN NaN
1973-01-10 0.218679 0.165144 0.227176 0.460029 NaN NaN
1973-01-11 0.100405 0.242668 0.281450 0.276749 NaN NaN
1973-01-12 0.038619 0.042891 0.349266 0.307352 NaN NaN
1973-01-15 0.073737 0.126594 0.404363 0.171437 NaN NaN
1973-01-16 0.149073 0.086674 NaN NaN NaN NaN
1973-01-17 0.093583 0.067889 NaN NaN NaN NaN
1973-01-18 0.117583 0.009471 NaN NaN NaN NaN