I want to test a day-trading logic using machine learning algorithms.
I wrote some simple code, but it does not work, so I need your help.
Here it is:
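Logic

Concept: buy the S&P 500 index at the day's open and exit at the same day's close.
Independent variables: dayOpen/dayOpen(1), dayOpen/dayClose(1), dayOpen/dayHigh(1), dayOpen/dayLow(1), ......, dayOpen/dayLow(2)
----> dayOpen(1) = the previous day's open price
Dependent variable: if Close > Open, assign 1, otherwise 0
Trading logic: use kNN or SVM with the 8 features to predict today's index direction. If the prediction is up, buy the index; otherwise hold cash.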
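To make the variable definitions above concrete, here is a minimal sketch on a small synthetic price table. The `build_features` helper and the random OHLC numbers are assumptions made up for illustration, not real index data. It only shows which columns the definitions produce and that `shift(k)` leaves the first `k` rows empty.

```python
import numpy as np
import pandas as pd


def build_features(ohlc: pd.DataFrame, lags=(1, 2)) -> pd.DataFrame:
    """Return the 8 ratio features and the 0/1 label described above.

    Hypothetical helper: expects columns Open, High, Low, Close.
    """
    out = pd.DataFrame(index=ohlc.index)
    for lag in lags:
        for col in ("Open", "High", "Low", "Close"):
            # today's open divided by the value from `lag` days earlier
            out[f"{col}({lag})"] = ohlc["Open"] / ohlc[col].shift(lag)
    # label: 1 if the day closes above its open, else 0
    out["Result"] = (ohlc["Close"] > ohlc["Open"]).astype(int)
    return out


# Synthetic prices, for illustration only (not real index data).
rng = np.random.default_rng(0)
close = 100 * np.cumprod(1 + rng.normal(0, 0.01, size=10))
open_ = close * (1 + rng.normal(0, 0.005, size=10))
ohlc = pd.DataFrame({
    "Open": open_,
    "High": np.maximum(open_, close) * 1.01,
    "Low": np.minimum(open_, close) * 0.99,
    "Close": close,
})

features = build_features(ohlc)
print(features.columns.tolist())
print(features.isna().sum())  # lag-2 columns have two empty rows at the top
```

Naming each column from the loop variables avoids assigning two different lags to the same column name by accident.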
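I tried to train and test on a rolling window of the most recent 60 observations.
My code is shown below, but I get an error message. Can anyone help me?
Thanks.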
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import pandas_datareader.data as web
from sklearn import neighbors, svm

def price(stock, start):
    price = web.DataReader(name=stock, data_source='yahoo', start=start)
    return price


clf1 = neighbors.KNeighborsClassifier(n_neighbors=3)
clf2 = svm.SVC()

data = price('^KS11','2000-01-01')
data['Open(1)'] = data.Open/data.Open.shift(1)
data['High(1)'] = data.Open/data.High.shift(1)
data['Low(1)'] = data.Open/data.Low.shift(1)
data['Close(1)'] = data.Open/data.Close.shift(1)
data['Open(2)'] = data.Open/data.Open.shift(2)
data['High(2)'] = data.Open/data.High.shift(2)
data['Low(1)'] = data.Open/data.Low.shift(2)
data['Close(2)'] = data.Open/data.Close.shift(2)
data['Result'] = np.where(data.Close/data.Open>1,1,0)
data['Close/Open'] = data.Close/data.Open
data['Prediction'] = pd.Series()

for i in range(61, len(data.index)):
    x = data.iloc[i-61:i,6:14]
    y = data.Result[i-61:i]
    clf2.fit(x,y)
    data.Prediction[i] = clf2.predict(x)[-1]

data = data.dropna()
data['Profit'] = np.where(data.Result == 1,data['Close/Open'],1).cumprod()
print(data.iloc[:,6:])
data.Profit.plot()
plt.show()

ValueError                                Traceback (most recent call last)
<ipython-input-8-8c10864c79e2> in <module>()
     29     x = data.iloc[i-61:i,6:14]
     30     y = data.Result[i-61:i]
---> 31     clf2.fit(x,y)
     32     data.Prediction[i] = clf2.predict(x)[-1]
     33 

C:\Users\yiugn_\Anaconda3\lib\site-packages\sklearn\svm\base.py in fit(self, X, y, sample_weight)
    148         self._sparse = sparse and not callable(self.kernel)
    149 
--> 150         X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
    151         y = self._validate_targets(y)
    152 

C:\Users\yiugn_\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    396                              % (array.ndim, estimator_name))
    397         if force_all_finite:
--> 398             _assert_all_finite(array)
    399 
    400     shape_repr = _shape_repr(array.shape)

C:\Users\yiugn_\Anaconda3\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X)
     52             and not np.isfinite(X).all()):
     53         raise ValueError("Input contains NaN, infinity"
---> 54                          " or a value too large for %r." % X.dtype)
     55 
     56 

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
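
The message says the feature matrix passed to `fit` contains NaN. One likely source, offered as a guess rather than a confirmed diagnosis: `shift(1)` and `shift(2)` leave the first rows empty, and the first training window starts at row 0, so it includes them. The sketch below uses random stand-in features (not market data) and a hypothetical `walk_forward` helper. It assumes the fix is to drop incomplete rows before fitting, select feature columns by name rather than by position, and predict from the current row instead of a row the model was trained on.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier


def walk_forward(features, labels, window=60, make_clf=None):
    """Fit on the previous `window` rows, then predict the current row.

    Hypothetical helper; `features` must already be free of NaN.
    """
    if make_clf is None:
        make_clf = lambda: KNeighborsClassifier(n_neighbors=3)
    preds = pd.Series(np.nan, index=features.index)
    for i in range(window, len(features)):
        clf = make_clf()
        clf.fit(features.iloc[i - window:i], labels.iloc[i - window:i])
        # predict from row i, which was not part of the training window
        preds.iloc[i] = clf.predict(features.iloc[[i]])[0]
    return preds


# Random stand-in data, for illustration only.
rng = np.random.default_rng(0)
cols = [f"f{k}" for k in range(8)]
frame = pd.DataFrame(rng.normal(1.0, 0.01, size=(100, 8)), columns=cols)
frame.iloc[:2] = np.nan                  # mimic the rows emptied by shift(2)
frame["Result"] = rng.integers(0, 2, size=100)

clean = frame.dropna(subset=cols)        # remove incomplete rows before fitting
preds = walk_forward(clean[cols], clean["Result"], window=60)
print(preds.dropna().head())
```

With real prices the same pattern would apply after the ratio columns are computed; a different classifier can be passed through `make_clf`.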