I have multiple time series datasets which, once formatted, look like this:
+-----------------+-------------------+
| X               | Y                 |
+-----------------+-------------------+
| [0.1, 0.3, 0.4] | [0.5, 0.8, 0.9]   |
+-----------------+-------------------+
| [0.3, 0.4, 0.5] | [0.8, 0.9, 0.91]  |
+-----------------+-------------------+
| [0.4, 0.5, 0.8] | [0.9, 0.91, 0.93] |
+-----------------+-------------------+
| ...             | ...               |
+-----------------+-------------------+
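For each dataset, X contains the vector of values the prediction is made from (the goal is to predict the next 3 values), and Y contains the vector of the next 3 values to be predicted.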
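To make the layout concrete, here is a minimal sketch of how rows like the ones in the table could be produced from a single series with a sliding window. The function name `make_windows`, its parameters `n_in` and `n_out`, and the underlying series itself are assumptions for illustration (the series is inferred from the overlapping rows shown above), not something taken from my actual pipeline.

```python
import numpy as np

def make_windows(series, n_in=3, n_out=3):
    """Pair each run of n_in values with the n_out values that follow it."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_in - n_out + 1
    if n_rows <= 0:
        # Series too short to form even one (X, Y) pair
        return np.empty((0, n_in)), np.empty((0, n_out))
    X = np.array([series[i:i + n_in] for i in range(n_rows)])
    Y = np.array([series[i + n_in:i + n_in + n_out] for i in range(n_rows)])
    return X, Y

# Assumed series, reconstructed from the overlapping rows of the table
series = [0.1, 0.3, 0.4, 0.5, 0.8, 0.9, 0.91, 0.93]
X, Y = make_windows(series)
print(X)
print(Y)
```

With this assumed series, the three rows of `X` and `Y` are the same as the three rows of the table.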
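I have several datasets like the one above, plus many datasets to predict on that contain only the X part.
My question is: how should I fit an LSTM network so that it is trained on the information from all of the training datasets? Can I concatenate their X values on one side and their Y values on the other?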
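What I mean by concatenating is sketched below, assuming each dataset has already been turned into its own X and Y arrays of shape (rows, 3). The arrays `x_a`, `y_a`, `x_b`, `y_b` are hypothetical stand-ins with made-up values for the second dataset. Stacking is done row-wise, so no single row mixes values from two different series.

```python
import numpy as np

# Hypothetical dataset A (already windowed): 2 rows of 3 inputs / 3 targets
x_a = np.array([[0.1, 0.3, 0.4], [0.3, 0.4, 0.5]])
y_a = np.array([[0.5, 0.8, 0.9], [0.8, 0.9, 0.91]])

# Hypothetical dataset B (already windowed): 1 row, values made up
x_b = np.array([[0.2, 0.25, 0.3]])
y_b = np.array([[0.35, 0.4, 0.45]])

# Stack along axis 0 so every row stays a complete (X, Y) pair
listadx = np.concatenate([x_a, x_b], axis=0)
listady = np.concatenate([y_a, y_b], axis=0)

print(listadx.shape, listady.shape)
```

Row `i` of `listadx` still corresponds to row `i` of `listady`, which is what the train/test split and `model.fit` rely on.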
I have the following code, but I am not sure whether it is implemented correctly.
import numpy
from keras.models import Sequential
from keras.layers import LSTM, Dense

# listadx and listady are lists containing the concatenation of the X and Y vectors of all the datasets, respectively.
# convert them to 2-D arrays so the [rows, columns] slicing below works
listadx = numpy.array(listadx)
listady = numpy.array(listady)
# split into train and test sets
train_size = int(len(listadx) * 0.67)
test_size = len(listadx) - train_size
trainx, testx = listadx[:train_size, :], listadx[train_size:, :]
trainy, testy = listady[:train_size, :], listady[train_size:, :]
# reshape input to be [samples, time steps, features]
trainx = numpy.reshape(trainx, (trainx.shape[0], 1, trainx.shape[1]))
testx = numpy.reshape(testx, (testx.shape[0], 1, testx.shape[1]))
# create and fit the LSTM network
outputs = trainy.shape[1]  # one output unit per value to predict (3 here)
model = Sequential()
model.add(LSTM(4, input_shape=(trainx.shape[1], trainx.shape[2])))
model.add(Dense(outputs))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainx, trainy, epochs=100, batch_size=1, verbose=2)
# make predictions
trainPredict = model.predict(trainx)
testPredict = model.predict(testx)
trainPredict gives me back a vector with 3 values. Are these the values of Y?
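One way to check this would be to compare the predictions with the targets row by row. Keras `model.predict` returns one row per input sample and one column per unit of the final `Dense` layer, so `trainPredict` should have the same shape as `trainy`. The sketch below uses placeholder numbers in place of real model output (they are invented for illustration, not results from my network) and computes the RMSE per forecast step and overall.

```python
import numpy as np

# Targets: two rows taken from the table above
trainy = np.array([[0.5, 0.8, 0.9], [0.8, 0.9, 0.91]])

# Placeholder predictions, NOT real model output
trainPredict = np.array([[0.52, 0.79, 0.88], [0.81, 0.93, 0.90]])

# Predictions and targets must line up row by row, column by column
assert trainPredict.shape == trainy.shape

# RMSE for each of the 3 forecast steps, then over all values
rmse_per_step = np.sqrt(np.mean((trainPredict - trainy) ** 2, axis=0))
rmse_total = np.sqrt(np.mean((trainPredict - trainy) ** 2))

print(rmse_per_step)
print(rmse_total)
```

If the shapes match and the errors are small, each row of `trainPredict` can be read as the predicted Y for the corresponding row of X.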