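Question:
I am trying to predict Google's future stock price with an LSTM model in Keras. The model trains successfully and the predictions on the test set look good, but the post-test (future) predictions are poor: they form a steadily declining curve, which does not match the actual future data.
Some notes
I train the model on two inputs (the two previous values) and expect a single output from it.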
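To make the "two inputs, one output" setup concrete, here is a minimal sketch on a toy series. The helper name `make_windows` and the toy numbers are illustrative assumptions and are not part of the code in this question; the real code does the same thing on the scaled training set.

```python
def make_windows(series, timesteps=2):
    """Split a 1-D series into (window, next value) training pairs."""
    X, y = [], []
    for i in range(timesteps, len(series)):
        X.append(series[i - timesteps:i])  # the `timesteps` values before position i
        y.append(series[i])                # the value the model should predict
    return X, y


# Toy series, hypothetical values for illustration only
X, y = make_windows([10, 11, 12, 13, 14])
print(X)  # [[10, 11], [11, 12], [12, 13]]
print(y)  # [12, 13, 14]
```

Each row of `X` holds two consecutive values and the matching entry of `y` is the value that follows them.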
import numpy as np
import pandas as pd

# Feature Scaling
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)
# Creating a data structure with 2 timesteps and 1 output
X_train = []
y_train = []
for i in range(2, 999):
    X_train.append(training_set_scaled[i-2:i, 0])  # the two previous values
    y_train.append(training_set_scaled[i, 0])      # the value to predict
X_train, y_train = np.array(X_train), np.array(y_train)
# Reshaping
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
# Part 2 - Building the RNN
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
# Initialising the RNN
regressor = Sequential()
# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
# Adding the output layer
regressor.add(Dense(units = 1))
# Compiling the RNN
regressor.compile(optimizer = 'rmsprop', loss = 'mean_squared_error')
# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 500, batch_size = 50)
Testing the model's predictions:
dataset_test = pd.read_csv('/media/vinothubuntu/Ubuntu Storage/Downloads/Test - Test.csv')
real_stock_price = dataset_test.iloc[:, 2:3].values
# Getting the predicted stock price of 2017
dataset_total = pd.concat((dataset_train['data'], dataset_test['data']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) -0:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
test_var = []
for i in range(0, 28):
    X_test.append(inputs[i:i+2, 0])
    test_var.append(inputs[i, 0])
X_test_pred = np.array(X_test)
X_test_pred = np.reshape(X_test_pred, (X_test_pred.shape[0], X_test_pred.shape[1], 1))
predicted_stock_price = regressor.predict(X_test_pred)
This part works well; the test predictions give good results.
Post-test / future prediction:
for x in range(0, 30):
    X_test_length = X_test[len(X_test)-1]  # get the last window in the X_test list
    future = []
    Prev_4 = X_test_length[1:2]  # keep only the most recent value of that window (one value, since each window has 2 timesteps)
    Last_pred = predicted_stock_price.flat[-1]  # get the last value from the predictions
    merger = np.append(Prev_4, Last_pred)  # new window: [most recent value, last prediction]
    X_test.append(merger)  # append the new window to X_test
    future.append(merger)  # append the new window to the future list
    one_time_pred = np.array(future)
    one_time_pred = np.reshape(one_time_pred, (one_time_pred.shape[0], one_time_pred.shape[1], 1))
    future_prediction = regressor.predict(one_time_pred)  # predict the future: gives one new prediction
    predicted_stock_price = np.append(predicted_stock_price, future_prediction, axis=0)  # append the new prediction to predicted_stock_price
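This is where the actual problem is. I take the last value from the test predictions, predict a single output, and loop, feeding each newly predicted value back in. [If you think this is not a good idea, please suggest a better approach.]
My output:
Expected result: something close to the actual future data, and certainly not a declining curve.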
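For reference, the feed-the-prediction-back loop described above can be written as a small model-agnostic function. This is only a sketch of that recursive scheme, not a fix: `rolling_forecast`, `predict_one`, and `mean_model` are hypothetical names, and `mean_model` is a stand-in for the trained network so the example runs without Keras. With the real model, `predict_one` would presumably reshape the window to `(1, timesteps, 1)` and call `regressor.predict`.

```python
def rolling_forecast(predict_one, seed_window, steps):
    """Recursive multi-step forecast: each prediction becomes the newest input."""
    window = list(seed_window)
    forecasts = []
    for _ in range(steps):
        next_value = predict_one(window)    # one-step-ahead prediction
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # drop the oldest value, append the prediction
    return forecasts


def mean_model(window):
    """Stand-in for the trained model (assumption): predicts the window average."""
    return sum(window) / len(window)


print(rolling_forecast(mean_model, [1.0, 3.0], steps=3))  # [2.0, 2.5, 2.25]
```

After the first few steps the window contains only earlier predictions and no observed data, so any error in one step is carried into every later step.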