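My goal is to predict trading signals with an LSTM model, where 0 means buy, 1 means hold, and 2 means sell. The dataset is a Taiwanese stock called "0050". I have 3,700 days of "0050" closing prices and trading signals. A sample of the dataset: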
      signal  trend(close)   # close is normalized
0          1     -0.183731
1          1     -0.189234
3          1     -0.204875
4          2     -0.206758
5          1     -0.205889
...      ...           ...
3697       0      0.533720
3698       0      0.535893
3699       1      0.527203
I want to use the past 30 days to predict the next day, and the output must be 0 (buy), 1 (hold), or 2 (sell).
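To make the 30-days-in, 1-day-out windowing concrete, here is a minimal sketch on synthetic data. The arrays `trend` and `signal` are made-up stand-ins, not my real "0050" data, and the variable names are only for illustration.

```python
import numpy as np

past_days, future_days = 30, 1

# Synthetic stand-ins (assumptions, not real data):
trend = np.linspace(-0.2, 0.5, 100)                          # normalized price
signal = np.random.default_rng(0).integers(0, 3, size=100)   # labels 0/1/2

X, y = [], []
# Each window needs a label future_days after its last row.
for i in range(len(trend) - past_days - future_days + 1):
    X.append(trend[i:i + past_days])                   # 30 consecutive days
    y.append(signal[i + past_days - 1 + future_days])  # label of the following day

X = np.array(X)[..., np.newaxis]   # (samples, time steps, features)
y = np.array(y)
print(X.shape, y.shape)
```

Each sample is a `(30, 1)` slice, which is the 3-D `(samples, time steps, features)` layout that a Keras LSTM layer expects.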
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
import keras
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelBinarizer
#================================================
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
dataset= pd.read_csv('0050.csv',usecols=[3])
dataset_norm = dataset.apply(lambda x: (x - np.mean(x)) / (np.max(x) - np.min(x)))
count = int(np.ceil(len(dataset) * 0.1))
signals = pd.DataFrame(index=dataset.index)
signals['signal'] = 1
signals['trend'] = dataset_norm['open']
signals['RollingMax'] = (signals.trend.shift(1).rolling(count).max())
signals['RollingMin'] = (signals.trend.shift(1).rolling(count).min())
signals.loc[signals['RollingMax'] < signals.trend, 'signal'] = 0
signals.loc[signals['RollingMin'] > signals.trend, 'signal'] = 2
pastDay= 30
futureDay= 1
X_train, Y_train = [], []
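# Build sliding windows: 30 days of 'trend' as input, the next day's 'signal' as label.
# Note that .loc slicing is inclusive on both ends, so i:i+pastDay-1 covers pastDay rows.
for i in range(len(signals)-futureDay-pastDay):
    X_train.append(np.array(signals.loc[i:i+pastDay-1,['trend']]))
    Y_train.append(np.array(signals.loc[i+pastDay-1+futureDay,['signal']]))
X_train,Y_train= np.array(X_train),np.array(Y_train)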
lb = LabelBinarizer()
Y_train = lb.fit_transform(Y_train)
print("X_train shape: {}".format(X_train.shape))
print("Y_train shape: {}".format(Y_train.shape))
regressor = Sequential()
regressor.add(LSTM(units = 256, return_sequences = True, input_shape = (X_train.shape[1], X_train.shape[2])))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 256, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 256, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 256, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 256, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 256))
regressor.add(Dropout(0.2))
regressor.add(Dense(3,activation='softmax'))
# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics=['accuracy'])
regressor.fit(X_train, Y_train, epochs = 5, batch_size = 128)
print(list(regressor.predict(X_train)))
When I inspect the predictions on X_train, I always get output like this:
[array([0.33128914, 0.40950283, 0.25920808], dtype=float32),
array([0.33128476, 0.4095166 , 0.25919858], dtype=float32),
array([0.33127722, 0.4095325 , 0.25919032], dtype=float32),
array([0.33126664, 0.4095487 , 0.25918463], dtype=float32)......]
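We can see that the probability of holding is always the highest, which means the model never tells me to buy or sell, but that is not the output I want. I want it to sometimes buy and sometimes sell. How should I change my code to achieve that?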
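For reference, this is a minimal sketch of how I turn the softmax probabilities into 0/1/2 labels and count how often each class appears. The `probs` array is just the four rows printed above, and `one_hot` is a small hypothetical example of what `Y_train` looks like after `LabelBinarizer`, not my real labels. If the real labels turn out to be mostly 1 (hold), that may be part of why the model always favors hold, but I have not confirmed this.

```python
import numpy as np

# The four prediction rows shown above.
probs = np.array([
    [0.33128914, 0.40950283, 0.25920808],
    [0.33128476, 0.4095166, 0.25919858],
    [0.33127722, 0.4095325, 0.25919032],
    [0.33126664, 0.4095487, 0.25918463],
], dtype=np.float32)

# Column index with the highest probability is the predicted class.
pred_labels = probs.argmax(axis=1)
pred_counts = np.bincount(pred_labels, minlength=3)
print(pred_labels)   # [1 1 1 1]
print(pred_counts)   # [0 4 0]

# Hypothetical one-hot labels (assumption, for illustration only).
one_hot = np.array([
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
])
label_counts = one_hot.sum(axis=0)   # samples per class: buy, hold, sell
print(label_counts)  # [1 3 1]
```

With my real data, `regressor.predict(X_train).argmax(axis=1)` and `Y_train.sum(axis=0)` would give the same two summaries, assuming all three classes occur so that `LabelBinarizer` produces three columns in the order 0, 1, 2.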