时间序列预测中的数据形状不匹配

时间:2020-07-28 18:03:49

标签: python tensorflow machine-learning keras lstm

我使用LSTM模型预测股票价格,我的每一步都完全按照教程进行,但是与教程不同,我的代码遇到错误。

这是我正在使用的代码:

df = pd.read_csv(f'D:\\algo\\all\\EURUSD_15M.csv')
df = df.loc[:, ~df.columns.str.contains('^Unnamed',)]
training_set = df.iloc[:-int(len(df)/10), 4:5].values

sc = MinMaxScaler(feature_range= (0, 1))
training_set_scaled = sc.fit_transform(training_set)
x_train , y_train = [], []

for i in range(60, len(training_set)):
    x_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

model  = Sequential()

model.add(LSTM(units=50, return_sequences=True, input_shape = (x_train.shape[1],1)))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(Dense(units=1))

model.compile(optimizer='adam', loss='mean_squared_error', metrics=['Accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=64)

因此,应该发生的事情是花费60个周期的价格并预测第61个周期。

但是我最终会遇到以下错误:

ValueError: A target array with shape (379319, 1) was passed for an output of shape (None, 60, 1) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.

我在做什么错了?

1 个答案:

答案 0 :(得分:0)

作为数据集尝试使用 TimeseriesGenerator(tf.keras.preprocessing.sequence.TimeseriesGenerator)而非您的自定义列表-> https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/sequence/TimeseriesGenerator

例如。

train_gen = TimeseriesGenerator(Xtrain, Xtrain, n_steps, batch_size=24*7)
valid_gen = TimeseriesGenerator(Xvalid, Xvalid, n_steps, batch_size=24*7)
test_gen = TimeseriesGenerator(Xtest, Xtest, n_steps, batch_size=24*7)