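Homework problem: Consider the following investment strategy: buy whenever the price goes above the 50-day moving average, then sell 3 trading days later. How much profit (as a percentage) do we make on average? We say the price goes "above" the 50-day moving average on trading day x if (1) the price on trading day x-1 is below the moving average on trading day x-1, and (2) the price on trading day x is above the moving average on trading day x.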
rol=stock.rolling(50).mean()
profitMade=((stock.shift(-3)-stock)/stock)
stock>rol
profitMade[(stock<stock.shift(-1))&(stock>rol)]
profitMade.pct_change()
profitMade[profitMade.pct_change()].mean()
The last line returns `nan` rather than the expected value.
Sample data:
Date
2002-05-23 1.196429
2002-05-24 1.210000
2002-05-28 1.157143
2002-05-29 1.103571
2002-05-30 1.071429
2002-05-31 1.076429
2002-06-03 1.128571
2002-06-04 1.117857
2002-06-05 1.147143
2002-06-06 1.182143
2002-06-07 1.118571
2002-06-10 1.156429
2002-06-11 1.153571
2002-06-12 1.092857
2002-06-13 1.082857
2002-06-14 0.986429
2002-06-17 0.922143
2002-06-18 0.910714
2002-06-19 0.951429
2002-06-20 0.957143
2002-06-21 0.979286
2002-06-24 0.978571
2002-06-25 0.964286
2002-06-26 0.988571
2002-06-27 0.943571
2002-06-28 0.999286
2002-07-01 1.027857
2002-07-02 1.172857
2002-07-03 1.214286
2002-07-05 1.276429
Answer 0 (score: 0)
Take a look at the values of `rol`; they are all NaN:
rol = stock.rolling(50).mean()
rol
Out:
value
Date
2002-05-23 NaN
2002-05-24 NaN
2002-05-28 NaN
2002-05-29 NaN
2002-05-30 NaN
2002-05-31 NaN
2002-06-03 NaN
2002-06-04 NaN
2002-06-05 NaN
2002-06-06 NaN
2002-06-07 NaN
2002-06-10 NaN
2002-06-11 NaN
2002-06-12 NaN
2002-06-13 NaN
2002-06-14 NaN
2002-06-17 NaN
2002-06-18 NaN
2002-06-19 NaN
2002-06-20 NaN
2002-06-21 NaN
2002-06-24 NaN
2002-06-25 NaN
2002-06-26 NaN
2002-06-27 NaN
2002-06-28 NaN
2002-07-01 NaN
2002-07-02 NaN
2002-07-03 NaN
2002-07-05 NaN
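When rolling, a window of size 50 is used to capture values. By default, edge windows that capture fewer values than required are filled with NaN. In your case the window is much larger than the DataFrame, so every value is set to NaN.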
To demonstrate the idea, look at a smaller window size:
rol = stock.rolling(20).mean()
print(rol)
Out:
value
Date
2002-05-23 NaN
2002-05-24 NaN
2002-05-28 NaN
2002-05-29 NaN
2002-05-30 NaN
2002-05-31 NaN
2002-06-03 NaN
2002-06-04 NaN
2002-06-05 NaN
2002-06-06 NaN
2002-06-07 NaN
2002-06-10 NaN
2002-06-11 NaN
2002-06-12 NaN
2002-06-13 NaN
2002-06-14 NaN
2002-06-17 NaN
2002-06-18 NaN
2002-06-19 NaN
2002-06-20 1.086143
2002-06-21 1.075286
2002-06-24 1.063714
2002-06-25 1.054071
2002-06-26 1.048321
2002-06-27 1.041929
2002-06-28 1.038071
2002-07-01 1.033036
2002-07-02 1.035786
2002-07-03 1.039143
2002-07-05 1.043857
- the first non-NaN value is the twentieth one.
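The same effect can be reproduced without the stock data. The sketch below is only an illustration: `prices` is a made-up series of 30 evenly spaced values (chosen to match the length of the sample), not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 30 values, the same length as the sample in the question.
prices = pd.Series(np.linspace(1.0, 2.0, 30))

valid_counts = {}
for window in (50, 20):
    rolled = prices.rolling(window).mean()
    # Count how many positions received an actual mean instead of NaN.
    valid_counts[window] = int(rolled.notna().sum())
    print(f"window={window}: {valid_counts[window]} non-NaN values out of {len(prices)}")

# Position of the first non-NaN mean for the 20-row window (0-based).
first_valid = prices.rolling(20).mean().first_valid_index()
print(first_valid)  # 19, i.e. the twentieth row
```

With a window of 50 there is never a full window in 30 rows, so nothing is computed; with a window of 20 the last 11 rows get a value.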
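To avoid this behavior, you can pass a value for the `min_periods` parameter of `rolling`, so that when a window holds fewer elements than the window size, the mean is computed over the elements that are available:
rol = stock.rolling(50, min_periods=1).mean()
print(rol)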
Out:
value
Date
2002-05-23 1.196429
2002-05-24 1.203215
2002-05-28 1.187857
2002-05-29 1.166786
2002-05-30 1.147714
2002-05-31 1.135834
2002-06-03 1.134796
2002-06-04 1.132679
2002-06-05 1.134286
2002-06-06 1.139072
2002-06-07 1.137208
2002-06-10 1.138810
2002-06-11 1.139945
2002-06-12 1.136582
2002-06-13 1.133000
2002-06-14 1.123839
2002-06-17 1.111975
2002-06-18 1.100794
2002-06-19 1.092932
2002-06-20 1.086143
2002-06-21 1.081054
2002-06-24 1.076396
2002-06-25 1.071522
2002-06-26 1.068065
2002-06-27 1.063086
2002-06-28 1.060632
2002-07-01 1.059418
2002-07-02 1.063469
2002-07-03 1.068670
2002-07-05 1.075595
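From the documentation for `min_periods`:
min_periods : int, default None
Minimum number of observations in window required to have a value
(otherwise result is NA). For a window that is specified by an offset,
this will default to 1.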
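A small sketch of what `min_periods` changes, using made-up numbers (the series `prices` and the window of 3 are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical values, small enough to check by hand.
prices = pd.Series([1.0, 2.0, 3.0, 4.0])

# For an integer window, min_periods defaults to the window size (3 here),
# so the first two positions have too few observations and become NaN.
strict = prices.rolling(3).mean()

# With min_periods=1, a mean is produced as soon as one observation exists.
relaxed = prices.rolling(3, min_periods=1).mean()

print(strict.tolist())   # [nan, nan, 2.0, 3.0]
print(relaxed.tolist())  # [1.0, 1.5, 2.0, 3.0]
```

Note that with `min_periods=1` the early values are averages over fewer rows than the window, so on a short history they are not true 50-day averages.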
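In the line below you "lose" the last three values, which are set to NaN:
profitMade = ((stock.shift(-3) - stock)/stock)
profitMade
Out:
...
2002-07-01 1.276429
2002-07-02 NaN
2002-07-03 NaN
2002-07-05 NaN
- so I guess you should drop them (I may be wrong, since I'm not familiar with this particular task). Then reindex `stock` and `rol`, because the subsequent operations need tables of the same size:
profitMade = profitMade.dropna()
stock = stock.loc[profitMade.index]
rol = rol.loc[profitMade.index]
OK, now there are three tables of equal size. The following line returns a table full of NaN: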
profitMade[(stock<stock.shift(-1))&(stock>rol)]
Out:
value
Date
2002-05-23 NaN
2002-05-24 NaN
2002-05-28 NaN
2002-05-29 NaN
2002-05-30 NaN
2002-05-31 NaN
2002-06-03 NaN
2002-06-04 NaN
2002-06-05 0.008095
2002-06-06 NaN
2002-06-07 NaN
2002-06-10 NaN
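I changed it to:
profitMade[(stock['value'] < stock['value'].shift(-1)) & (stock['value'] > rol['value'])]
Out:
value
Date
2002-06-05 0.008095
- which works on a specific column and drops the NaN rows.
Also, I don't understand what you are doing here:
profitMade[profitMade.pct_change()].mean()
- `profitMade.pct_change()` returns a table of float values (percentage changes), but `profitMade[...]` expects boolean objects. You should clarify and edit the question.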
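To illustrate the difference between a float result and a boolean mask, here is a minimal sketch. The numbers in `profit` are made up, and the condition `change > 0` is only a placeholder assumption; the question does not say which condition was actually intended.

```python
import pandas as pd

# Hypothetical profit values, not taken from the question.
profit = pd.Series([0.02, 0.04, 0.01, 0.03])

# pct_change() yields floats (the first entry is NaN), not True/False.
change = profit.pct_change()
print(change.dtype)  # float64

# A comparison turns the floats into a boolean mask usable for filtering.
# The condition itself is an assumption made for this example.
mask = change > 0
selected_mean = profit[mask].mean()
print(selected_mean)
```

Whatever condition is wanted, it has to be expressed as a comparison so that the object inside the brackets is boolean.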
Full code:
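The steps above can be assembled into one runnable sketch. It assumes `stock` is a one-column DataFrame whose column is named `value` (as the outputs above suggest) and rebuilds it from the sample data in the question; `SAMPLE`, `buy`, `selected` and `average_profit_pct` are names introduced here for illustration.

```python
import io
import pandas as pd

# Sample data copied from the question.
SAMPLE = """\
2002-05-23 1.196429
2002-05-24 1.210000
2002-05-28 1.157143
2002-05-29 1.103571
2002-05-30 1.071429
2002-05-31 1.076429
2002-06-03 1.128571
2002-06-04 1.117857
2002-06-05 1.147143
2002-06-06 1.182143
2002-06-07 1.118571
2002-06-10 1.156429
2002-06-11 1.153571
2002-06-12 1.092857
2002-06-13 1.082857
2002-06-14 0.986429
2002-06-17 0.922143
2002-06-18 0.910714
2002-06-19 0.951429
2002-06-20 0.957143
2002-06-21 0.979286
2002-06-24 0.978571
2002-06-25 0.964286
2002-06-26 0.988571
2002-06-27 0.943571
2002-06-28 0.999286
2002-07-01 1.027857
2002-07-02 1.172857
2002-07-03 1.214286
2002-07-05 1.276429
"""

# Assumption: a single price column called "value", indexed by date.
stock = pd.read_csv(io.StringIO(SAMPLE), sep=" ", header=None,
                    names=["Date", "value"], index_col="Date", parse_dates=True)

# Moving average; min_periods=1 avoids an all-NaN result on a short history.
rol = stock.rolling(50, min_periods=1).mean()

# Relative profit from buying today and selling 3 trading days later.
profitMade = (stock.shift(-3) - stock) / stock

# The last three rows have no sale price; drop them and align the other tables.
profitMade = profitMade.dropna()
stock = stock.loc[profitMade.index]
rol = rol.loc[profitMade.index]

# Column-wise boolean mask, using the same condition as in the answer.
buy = (stock["value"] < stock["value"].shift(-1)) & (stock["value"] > rol["value"])
selected = profitMade[buy]
average_profit_pct = selected["value"].mean() * 100

print(selected)
print(average_profit_pct)
```

On this sample only 2002-06-05 satisfies the condition, so the average is just that single trade's profit.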