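Here is my problem (I suspect it is a common one):
I have four time series (x1, x2, x3, x4). Based on the last d time steps of history,
[(x_1(t-d), x_2(t-d), x_3(t-d), x_4(t-d)), ..., (x_1(t-1), x_2(t-1), x_3(t-1), x_4(t-1))]
I want to predict [x_1(t), x_1(t+1)].
So, after loading the data, which is complete and contains no NaN values, I first rescale it with scikit-learn's MinMaxScaler(feature_range=(0, 1)).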
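For reference, here is a minimal NumPy sketch of the per-column arithmetic behind a min-max rescaling to [0, 1]. The helper name `min_max_scale` and the toy values are hypothetical and are not the poster's data; the real pipeline uses scikit-learn's MinMaxScaler.

```python
import numpy as np

def min_max_scale(data):
    """Scale each column of a 2-D array to the range [0, 1]."""
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    # Note: a constant column would give a zero denominator here.
    return (data - col_min) / (col_max - col_min)

# Hypothetical toy data: 5 time steps, 4 series.
raw = np.array([[1.0, 10.0, 0.0, 5.0],
                [2.0, 20.0, 1.0, 4.0],
                [3.0, 30.0, 4.0, 3.0],
                [4.0, 40.0, 9.0, 2.0],
                [5.0, 50.0, 16.0, 1.0]])
scaled = min_max_scale(raw)
print(scaled.min(axis=0), scaled.max(axis=0))
```

Each column is scaled independently, so every series ends up spanning the same range regardless of its original units.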
I then split it into training and test sets and use the Keras TimeseriesGenerator, with batch_size = 72:
train_gen = TimeseriesGenerator(data_train, target_train,
                                start_index=1,
                                length=n_lags, sampling_rate=1,
                                batch_size=batch_size)
test_gen = TimeseriesGenerator(data_test, target_test,
                               start_index=1,
                               length=n_lags, sampling_rate=1,
                               batch_size=batch_size)
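The shapes of the training batches (inputs, targets) are:
Train X, y shapes per batch: (72, 10, 4) (72, 2)
and the same holds for the test set:
Test X, y shapes per batch: (72, 10, 4) (72, 2)
For example, here are the first three input windows of the first batch (train_gen[0][0][:3]):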
array([[[0.28665611, 0.63705857, 0.32643516, 0.45493102],
[0.26487018, 0.6301432 , 0.30965767, 0.45791034],
[0.25228031, 0.61725465, 0.3161332 , 0.45023995],
[0.24793654, 0.58854431, 0.32644507, 0.43765143],
[0.25025404, 0.55537186, 0.33264606, 0.42989095],
[0.25923045, 0.53953228, 0.32621582, 0.43297785],
[0.27078601, 0.53333689, 0.31391997, 0.4531239 ],
[0.28204362, 0.55253638, 0.30399583, 0.48110336],
[0.2905511 , 0.59113979, 0.29693304, 0.50782682],
[0.29877746, 0.65041821, 0.28764287, 0.53247815]],
[[0.26487018, 0.6301432 , 0.30965767, 0.45791034],
[0.25228031, 0.61725465, 0.3161332 , 0.45023995],
[0.24793654, 0.58854431, 0.32644507, 0.43765143],
[0.25025404, 0.55537186, 0.33264606, 0.42989095],
[0.25923045, 0.53953228, 0.32621582, 0.43297785],
[0.27078601, 0.53333689, 0.31391997, 0.4531239 ],
[0.28204362, 0.55253638, 0.30399583, 0.48110336],
[0.2905511 , 0.59113979, 0.29693304, 0.50782682],
[0.29877746, 0.65041821, 0.28764287, 0.53247815],
[0.30240836, 0.71207879, 0.34604287, 0.54785854]],
[[0.25228031, 0.61725465, 0.3161332 , 0.45023995],
[0.24793654, 0.58854431, 0.32644507, 0.43765143],
[0.25025404, 0.55537186, 0.33264606, 0.42989095],
[0.25923045, 0.53953228, 0.32621582, 0.43297785],
[0.27078601, 0.53333689, 0.31391997, 0.4531239 ],
[0.28204362, 0.55253638, 0.30399583, 0.48110336],
[0.2905511 , 0.59113979, 0.29693304, 0.50782682],
[0.29877746, 0.65041821, 0.28764287, 0.53247815],
[0.30240836, 0.71207879, 0.34604287, 0.54785854],
[0.30113961, 0.7603975 , 0.4250553 , 0.55976262]]])
and here is the corresponding target array (train_gen[0][1][:3]):
array([[0.30240836, 0.30113961],
[0.30113961, 0.30203943],
[0.30203943, 0.31435152]])
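The pairing visible in these arrays (each window is the previous one shifted by one step, and the first target value equals the x1 entry that ends the next window) can be reproduced in plain NumPy. This is only a sketch under the assumption that the generator pairs the window data[row - length:row] with targets[row], and that start_index skips that many leading rows; `make_windows` and the toy data are hypothetical names and values.

```python
import numpy as np

def make_windows(data, targets, length, start_index=0):
    """Pair each window data[row - length:row] with targets[row]."""
    first_row = start_index + length
    rows = np.arange(first_row, len(data))
    X = np.array([data[r - length:r] for r in rows])
    y = np.array([targets[r] for r in rows])
    return X, y

# Hypothetical toy data: 30 time steps, 4 series, 10 lags.
n_steps, n_series, n_lags = 30, 4, 10
full = np.arange(n_steps * n_series, dtype=float).reshape(n_steps, n_series)
x1 = full[:, 0]

# Targets [x1(t), x1(t+1)]; the last step has no successor, so it is left out.
targets = np.column_stack((x1[:-1], x1[1:]))
data = full[:-1]  # keep inputs and targets the same length

X, y = make_windows(data, targets, length=n_lags, start_index=1)
print(X.shape, y.shape)
```

With these toy sizes the single unbatched result has one window per usable row; batching would simply slice X and y into chunks of batch_size.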
Now, using the Keras library, my model is very simple:
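from keras.layers import Input, LSTM, Dense
from keras.models import Model
# The definition of `inputs` is not shown in the post; this shape is assumed from the batch shapes above.
inputs = Input(shape=(n_lags, 4))
h = LSTM(50)(inputs)
output = Dense(2)(h)
model = Model(inputs, output)
model.compile(loss='mae', optimizer='adam')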
The problem appears as soon as I start training:
history = model.fit_generator(generator=train_gen,
                              epochs=50,
                              validation_data=test_gen,
                              shuffle=False)
Epoch 1/50
40/40 [==============================] - 5s 120ms/step - loss: nan - val_loss: nan
Epoch 2/50
40/40 [==============================] - 1s 37ms/step - loss: nan - val_loss: nan
Epoch 3/50
40/40 [==============================] - 2s 39ms/step - loss: nan - val_loss: nan
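Note the "nan" reported in every epoch (incidentally, it shows up at the end of the epoch).
Can anyone give me some hints on how to track down the problem? I should mention that when the output (i.e. the target) is only x1(t), training works fine, and the training and test losses converge smoothly.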
Answer 0 (score: 0)
In fact, I found the root of the problem. Before using TimeseriesGenerator, I have to build the {x1(t), x1(t+1)} targets myself. For data_train, for example, I used the following code:
# `shift` is presumably scipy.ndimage.shift (the import is not shown in the post)
target_train = np.transpose(np.stack((data_train[:, 0],
                                      shift(data_train[:, 0], -1, cval=np.nan))))
However, the last entry then looks like [0.18358087, nan], so the loss computation is corrupted by that last entry.
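To see why a single NaN target is enough, here is a small NumPy sketch of a mean absolute error. The `mae` helper and the prediction values are hypothetical; only the [0.18358087, nan] row comes from the post. Any arithmetic involving NaN yields NaN, so the mean over the batch is NaN too.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error over all entries."""
    return np.mean(np.abs(y_true - y_pred))

# Hypothetical predictions for a batch of two samples.
y_pred = np.array([[0.30, 0.30],
                   [0.19, 0.20]])

y_clean = np.array([[0.31, 0.30],
                    [0.18, 0.19]])
y_bad = np.array([[0.31, 0.30],
                  [0.18358087, np.nan]])  # last row as in the post

loss_clean = mae(y_clean, y_pred)
loss_bad = mae(y_bad, y_pred)
print(loss_clean, loss_bad)  # the second value is nan
```

A NaN loss would also be expected to produce NaN gradients, which is consistent with every later epoch reporting nan as well.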
The solution is to remove that entry.
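One way to do this, sketched below with NumPy slicing rather than the shift call above, is to build the targets only for the steps that actually have a successor and trim the inputs to the same length. The helper name `make_two_step_targets` and the random stand-in for data_train are hypothetical.

```python
import numpy as np

def make_two_step_targets(x1):
    """Pair x1(t) with x1(t+1); the final step has no successor and is dropped."""
    return np.column_stack((x1[:-1], x1[1:]))

# Hypothetical stand-in for the scaled training data: 100 steps, 4 series.
data_train = np.random.default_rng(0).random((100, 4))

target_train = make_two_step_targets(data_train[:, 0])
data_train = data_train[:len(target_train)]  # keep inputs aligned with targets

# Quick sanity check before handing the arrays to the generator.
assert not np.isnan(target_train).any()
print(data_train.shape, target_train.shape)
```

Running the same NaN check on the test targets before training catches the problem early, instead of after a full epoch.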