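I'm trying to train an LSTM with Keras (TensorFlow backend) on toy data, and I get this error:
ValueError: Error when checking target: expected dense_39 to have 2 dimensions, but got array with shape (996, 1, 1)
The error occurs immediately on calling model.fit; nothing seems to run. It looks to me as though Keras is checking dimensions but ignoring the fact that it should take one batch of targets for each batch of inputs. The error shows the full shape of the target array, which suggests to me that Keras never splits it into batches, at least not while checking dimensions. For the life of me, I can't figure out why that would be, or what else might help.
My network definition, with the expected layer output shapes in comments: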
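# Imports assumed (standalone Keras 2.x); they were not shown in the original post
from keras.layers import Input, LSTM, Dense, TimeDistributed
from keras.models import Model
from keras.optimizers import Nadam

batch_shape = (8, 5, 1)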
x_in = Input(batch_shape=batch_shape, name='input') # (8, 5, 1)
seq1 = LSTM(8, return_sequences=True, stateful=True)(x_in) # (8, 5, 8)
dense1 = TimeDistributed(Dense(8))(seq1) # (8, 5, 8)
seq2 = LSTM(8, return_sequences=False, stateful=True)(dense1) # (8, 8)
dense2 = Dense(8)(seq2) # (8, 8)
out = Dense(1)(dense2) # (8, 1)
model = Model(inputs=x_in, outputs=out)
optimizer = Nadam()
model.compile(optimizer=optimizer, loss='mean_squared_error')
model.summary()
The model summary, with shapes as expected:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (8, 5, 1)                 0
_________________________________________________________________
lstm_28 (LSTM)               (8, 5, 8)                 320
_________________________________________________________________
time_distributed_18 (TimeDis (8, 5, 8)                 72
_________________________________________________________________
lstm_29 (LSTM)               (8, 8)                    544
_________________________________________________________________
dense_38 (Dense)             (8, 8)                    72
_________________________________________________________________
dense_39 (Dense)             (8, 1)                    9
=================================================================
Total params: 1,017
Trainable params: 1,017
Non-trainable params: 0
_________________________________________________________________
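My toy data: the target is just a line decreasing from 100 to 0, and the input is just an array of zeros. I want to do one-step-ahead prediction, so I build rolling windows of the input and the target with the rolling_window() method defined below: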
import numpy as np

target = np.linspace(100, 0, num=1000)
target_rolling = rolling_window(target[4:], 1)[:, :, None]
target_rolling.shape # (996, 1, 1) <-- this seems to be the array that's causing the error
x_train = np.zeros((1000,))
x_train_rolling = rolling_window(x_train, 5)[:, :, None]
x_train_rolling.shape # (996, 5, 1)
The rolling_window() method:
def rolling_window(arr, window):
    # Build a strided view of overlapping windows along the last axis
    shape = arr.shape[:-1] + (arr.shape[-1] - window + 1, window)
    strides = arr.strides + (arr.strides[-1],)
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
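For readers who want to reproduce the shapes, here is a minimal sketch of the same windowing using `numpy.lib.stride_tricks.sliding_window_view`, which is available in NumPy 1.20 and later. The helper name `rolling_window_safe` is hypothetical and not part of the original post; this is an alternative to the `as_strided` version above, not a correction of it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20


def rolling_window_safe(arr, window):
    # Hypothetical helper: windows along the last axis, returned as a
    # read-only view of shape arr.shape[:-1] + (n - window + 1, window).
    return sliding_window_view(arr, window, axis=-1)


x = np.zeros((1000,))
y = np.linspace(100, 0, num=1000)

# Add a trailing feature axis, as in the question.
x_windows = rolling_window_safe(x, 5)[:, :, None]
y_windows = rolling_window_safe(y[4:], 1)[:, :, None]

print(x_windows.shape)  # (996, 5, 1)
print(y_windows.shape)  # (996, 1, 1)
```

Unlike a hand-written `as_strided` call, `sliding_window_view` validates the window size and returns a non-writeable view, which makes accidental out-of-bounds access or in-place modification less likely.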
My training loop:
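from keras.callbacks import LambdaCallback  # import assumed; not shown in the original post

# A lambda cannot repeat the parameter name "_", so name both arguments
reset_state = LambdaCallback(on_epoch_end=lambda epoch, logs: model.reset_states())
callbacks = [reset_state]
history = model.fit(x_train_rolling, target_rolling,  # target array defined above
                    batch_size=8,
                    epochs=100,
                    validation_split=0.,
                    callbacks=callbacks)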
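Separately from the shape error: 996 samples is not a multiple of the batch size of 8, and as far as I know Keras 2.x rejects such input for a stateful model with a fixed batch shape. A minimal sketch of one workaround, assuming it is acceptable to drop the trailing partial batch; the stand-in arrays and the names `n_usable`, `x_trimmed`, and `y_trimmed` are hypothetical:

```python
import numpy as np

batch_size = 8

# Stand-in arrays with the same number of samples as in the question.
x_all = np.zeros((996, 5, 1))
y_all = np.linspace(100, 0, num=1000)[4:, None]

# Largest sample count that is a whole number of batches.
n_usable = (len(x_all) // batch_size) * batch_size

x_trimmed = x_all[:n_usable]
y_trimmed = y_all[:n_usable]

print(n_usable)         # 992
print(x_trimmed.shape)  # (992, 5, 1)
print(y_trimmed.shape)  # (992, 1)
```

Padding the data up to the next multiple of the batch size would be another option; trimming is shown only because it is the shortest.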
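What I've tried:
return_sequences=True followed by a Flatten layer. Same error.
return_sequences=True without a Flatten layer. This produces a different error, because Keras then expects the target to have the same shape as the output, which at that point is (batch_size, 5, 1) rather than (batch_size, 1, 1).
Note that I looked at several related questions, and although I was hopeful about a couple of them, none seemed to answer mine directly.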
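Answer 0 (score: 1)
Posting the solution I wrote in the comments: the target has an extra dimension. In reshape, -1 lets NumPy infer that dimension's size from the array's total size and the other dimensions given. Since only two dimensions are given, (-1, 1) turns the (996, 1, 1) array into (996, 1).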
target_rolling = target_rolling.reshape(-1,1)
Before the reshape, target_rolling.shape is (996, 1, 1); afterwards it is (996, 1).
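Why this works: the final Dense(1) layer outputs a 2-D tensor of shape (batch, 1), so Keras expects a 2-D target array, and the check counts the dimensions of the whole array, sample axis included. The sketch below uses NumPy only and shows the reshape alongside two equivalent alternatives; the names `fixed_reshape`, `fixed_squeeze`, and `fixed_direct` are hypothetical.

```python
import numpy as np

target = np.linspace(100, 0, num=1000)

# Same shape as target_rolling in the question: (996, 1, 1).
target_rolling = target[4:].reshape(-1, 1, 1)

# Option 1 (the fix above): -1 infers the sample count.
fixed_reshape = target_rolling.reshape(-1, 1)

# Option 2: drop only the trailing length-1 axis.
fixed_squeeze = np.squeeze(target_rolling, axis=-1)

# Option 3: build the 2-D target directly, with no window axis.
fixed_direct = target[4:, None]

print(fixed_reshape.shape)  # (996, 1)
print(fixed_squeeze.shape)  # (996, 1)
print(fixed_direct.shape)   # (996, 1)
```

All three produce the same values; passing `axis=-1` to `np.squeeze` avoids accidentally removing the sample axis if the array ever has only one sample.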