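I am trying to populate a column (signal) in a DataFrame based on comparing another column (diff) against two variables. The column to fill has three possible values, 1, -1 and 0, meaning buy, sell and hold (cover). Here is the code and output so far.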
import numpy as np
import Quandl
tlm = Quandl.get("GOOG/NYSE_TLM", trim_start="2014-12-01", trim_end="2015-01-01")
tlm['diff'] = (tlm.Open - tlm.Close.shift(1))/tlm.Close.shift(1) # lags data
lowerbound = -0.08
upperbound = 0.08
tlm['signal'] = np.where(tlm['diff'] >= upperbound, 1.0, 0.0)
tlm['signal'] = np.where(tlm['diff'] <= lowerbound, -1.0, 0.0)
print(tlm.head(20)) # is dataframe
            Open  High   Low  Close     Volume       diff  signal
Date
2014-12-01  4.91  4.93  4.53   4.53   12999427        NaN       0
2014-12-02  4.62  4.82  4.47   4.64    8015450   0.019868       0
2014-12-03  4.51  4.83  4.48   4.63    9175510  -0.028017       0
2014-12-04  4.59  4.62  4.04   4.05   16065766  -0.008639       0
2014-12-05  4.05  4.09  3.86   3.94    8783581   0.000000       0
2014-12-08  3.88  4.04  3.46   3.74   17497626  -0.015228       0
2014-12-09  4.09  4.36  4.04   4.22   12559347   0.093583       0
2014-12-10  4.20  4.20  3.67   3.79   12403674  -0.004739       0
2014-12-11  3.74  3.95  3.67   3.69    9396960  -0.013193       0
2014-12-12  5.05  5.24  4.17   4.29   75949020   0.368564       0
2014-12-15  5.33  5.35  4.99   5.12   38834129   0.242424       0
2014-12-16  7.47  7.60  7.46   7.58  282795097   0.458984       0
2014-12-17  7.59  7.66  7.55   7.64   73152687   0.001319       0
2014-12-18  7.68  7.82  7.66   7.78   55387941   0.005236       0
2014-12-19  7.77  7.89  7.77   7.85   31330786  -0.001285       0
2014-12-22  7.82  7.85  7.78   7.79   22758351  -0.003822       0
2014-12-23  7.79  7.88  7.79   7.84   19068732   0.000000       0
2014-12-24  7.83  7.86  7.82   7.84    9174813  -0.001276       0
2014-12-26  7.84  7.86  7.82   7.85    9717732   0.000000       0
2014-12-29  7.84  7.86  7.81   7.83   12035787  -0.001274       0
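The problem with the code above is that the line just before the print overwrites the result of the line before it. That earlier line works fine on its own, and you would see 1 in the appropriate rows of the signal column. So I turned to a for loop to handle the conditions, but I get a ValueError inside the loop. I partly understand that the boolean comparison problem has to do with NumPy arrays, but if I can't compare in a condition, how do I generate the three outcomes (1, -1, 0)?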
for index, row in tlm.iterrows():
    if tlm['diff'] >= upperbound: # value error here
        tlm['signal'] = 1.0
    if tlm['diff'] <= lowerbound:
        tlm['signal'] = -1.0
    else:
        tlm['signal'] = 0.0
I'm new to coding and to pandas. Thanks in advance!
Answer 0 (score: 0)
You can use np.select:
conditions = [tlm['diff'] >= upperbound,
              tlm['diff'] <= lowerbound]
choices = [1, -1]
tlm['signal'] = np.select(conditions, choices, default=0)
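To try this without the Quandl download, here is a minimal sketch on made-up data. The DataFrame `df` and its `diff` values are hypothetical stand-ins for `tlm`, chosen to cover a NaN row and both boundary values; the bounds are the ones from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical values standing in for the question's tlm['diff'] column.
df = pd.DataFrame({"diff": [np.nan, 0.02, 0.0936, -0.12, 0.08, -0.08]})
lowerbound = -0.08
upperbound = 0.08

# One boolean mask per outcome; np.select checks them in order.
conditions = [df["diff"] >= upperbound,
              df["diff"] <= lowerbound]
choices = [1, -1]

# Rows matching neither condition (including NaN) get the default, 0.
df["signal"] = np.select(conditions, choices, default=0)
print(df)
```

np.select takes the first condition that is True for each row, so the order of `conditions` only matters if the masks can overlap, which they cannot here. Comparisons against NaN are False, so the first row falls through to the default.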
Or equivalently, but less readably:
tlm['signal'] = np.where(tlm['diff'] >= upperbound, 1.0,
                         np.where(tlm['diff'] <= lowerbound, -1.0, 0.0))
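As for the ValueError in the loop from the question: `tlm['diff'] >= upperbound` compares the whole column and returns one boolean per row, and `if` cannot reduce that to a single True/False. The sketch below reproduces this on a small made-up Series (the values and the helper `classify` are hypothetical, for illustration only) and shows a row-by-row version that does work, although the vectorized forms above are usually preferable.

```python
import numpy as np
import pandas as pd

# Hypothetical values standing in for the question's tlm['diff'] column.
diff = pd.Series([np.nan, 0.02, 0.0936, -0.12])
upperbound, lowerbound = 0.08, -0.08

# Comparing the whole column yields a boolean Series, and using that
# Series in an `if` raises ValueError (its truth value is ambiguous).
try:
    if diff >= upperbound:
        pass
    raised = False
except ValueError:
    raised = True

# Row-by-row alternative: each `value` is a scalar, so if/elif works.
def classify(value):
    if value >= upperbound:
        return 1
    elif value <= lowerbound:
        return -1
    return 0  # covers in-between values and NaN

signal = diff.apply(classify)

# The nested np.where form gives the same result on this data.
nested = np.where(diff >= upperbound, 1,
                  np.where(diff <= lowerbound, -1, 0))
print(raised, signal.tolist(), nested.tolist())
```

Note that the original loop also assigned to `tlm['signal']` as a whole, which sets every row at once; a per-row function like the one above avoids that as well.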