LSTM and CNN: ValueError: Error when checking target: expected time_distributed_1 to have 3 dimensions, but got array with shape (400, 256)

Date: 2017-08-09 12:06:43

Tags: python deep-learning keras

I want to apply a CNN+LSTM model to my data, using only a small subset for now: my training data has shape (400, 50) and my test data (200, 50). With the CNN alone the model runs without errors, but as soon as I add the LSTM layers I get the error below:

model = Sequential()
model.add(Conv1D(filters=8,
                 kernel_size=16,
                 padding='valid',
                 activation='relu',
                 strides=1, input_shape=(50,1)))
model.add(MaxPooling1D(pool_size=2,strides=None, padding='valid', input_shape=(50,1))) # strides=None means strides=pool_size
model.add(Conv1D(filters=8,
                 kernel_size=8,
                 padding='valid',
                 activation='relu',
                 strides=1))
model.add(MaxPooling1D(pool_size=2,strides=None, padding='valid',input_shape=(50,1)))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2)) # 32 LSTM units
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(TimeDistributed(Dense(256, activation='softmax')))

# 4. Compile model
print('########################### Compilation of the model ######################################')
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
print(model.summary())
print('###########################Fitting the model ######################################')
# 5. Fit model on training data
x_train = x_train.reshape((400,50,1))
print(x_train.shape) # (400,50,1)
x_test = x_test.reshape((200,50,1))
print(x_test.shape) # (200,50,1)
model.fit(x_train, y_train, batch_size=100, epochs=100,verbose=0)
print(model.summary()) 
# 6. Evaluate model on test data
score = model.evaluate(x_test, y_test, verbose=0)
print(score)

Here is the error:

Traceback (most recent call last):
  File "CNN_LSTM_Based_Attack.py", line 156, in <module>
    model.fit(x_train, y_train, batch_size=100, epochs=100,verbose=0)
  File "/home/doc/.local/lib/python2.7/site-packages/keras/models.py", line 853, in fit
    initial_epoch=initial_epoch)
  File "/home/doc/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1424, in fit
    batch_size=batch_size)
  File "/home/doc/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1304, in _standardize_user_data
    exception_prefix='target')
  File "/home/doc/.local/lib/python2.7/site-packages/keras/engine/training.py", line 127, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected time_distributed_1 to have 3 dimensions, but got array with shape (400, 256)

You can find the full summary of the model below. (I am new to LSTMs; this is my first time using them.)

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_1 (Conv1D)            (None, 35, 8)             136
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 17, 8)             0
_________________________________________________________________
dropout_1 (Dropout)          (None, 17, 8)             0
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 10, 8)             520
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 5, 8)              0
_________________________________________________________________
dropout_2 (Dropout)          (None, 5, 8)              0
_________________________________________________________________
lstm_1 (LSTM)                (None, 5, 32)             5248
_________________________________________________________________
lstm_2 (LSTM)                (None, 5, 32)             8320
_________________________________________________________________
lstm_3 (LSTM)                (None, 5, 32)             8320
_________________________________________________________________
lstm_4 (LSTM)                (None, 5, 32)             8320
_________________________________________________________________
lstm_5 (LSTM)                (None, 5, 32)             8320
_________________________________________________________________
time_distributed_1 (TimeDist (None, 5, 256)            8448
=================================================================
Total params: 47,632
Trainable params: 47,632
Non-trainable params: 0
_________________________________________________________________

When I replace these lines:

model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2)) # 32 LSTM units
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(LSTM(32, return_sequences=True,
              activation='tanh', recurrent_activation='hard_sigmoid',
              dropout=0.2,recurrent_dropout=0.2))
model.add(TimeDistributed(Dense(256, activation='softmax')))

with just this one line:

model.add(LSTM(26, activation='tanh'))

then it works fine.

I would be grateful for any help.

1 Answer:

Answer 0 (score: 4)

An LSTM layer expects input of shape (samples, timesteps, features). When stacking LSTMs, every layer except possibly the last needs return_sequences=True, which makes it output shape (samples, timesteps, units) so the layers can be chained together. If you only want to predict one step (i.e. the next value in the sequence/time series), you should set return_sequences=False on the last LSTM layer; if you don't, the model will try to predict as many timesteps as it receives in the input. (You can also predict a different number, e.g. the next 10 values given the past 50 observations, but that is a bit trickier in Keras.)
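A minimal sketch of that fix applied to the model above (dropout and activation arguments trimmed to Keras defaults for brevity; the layer sizes are the ones from the question). With return_sequences=False on the last LSTM, a plain Dense(256) produces a 2-D output matching a target of shape (samples, 256):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
model.add(Conv1D(filters=8, kernel_size=16, padding='valid',
                 activation='relu', input_shape=(50, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(filters=8, kernel_size=8, padding='valid', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(32, return_sequences=True))  # intermediate layer: keeps the time axis
model.add(LSTM(32))                         # last layer: return_sequences=False, output (None, 32)
model.add(Dense(256, activation='softmax')) # no TimeDistributed needed
print(model.output_shape)  # (None, 256) — matches y of shape (400, 256)
```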

In your case the Conv/MaxPool layers output 5 "timesteps", and you have return_sequences=True on the last LSTM layer, so your "y" would have to have shape (samples, 5, 256). Instead, set return_sequences=False on the last LSTM layer and drop the TimeDistributed wrapper, since you are only predicting one step ahead.
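Where those 5 timesteps come from can be checked with plain shape arithmetic: a 'valid' Conv1D produces (L - kernel_size) // strides + 1 steps, and MaxPooling1D with pool_size=2 (and strides defaulting to pool_size) roughly halves the length. Tracing the input length 50 through the stack reproduces the shapes in the model summary:

```python
def conv_len(L, kernel_size, strides=1):
    # Output length of a 'valid' 1-D convolution.
    return (L - kernel_size) // strides + 1

def pool_len(L, pool_size=2):
    # Output length of MaxPooling1D with strides=pool_size.
    return (L - pool_size) // pool_size + 1

L = 50
L = conv_len(L, 16)  # Conv1D(kernel_size=16) -> 35
L = pool_len(L)      # MaxPooling1D(2)        -> 17
L = conv_len(L, 8)   # Conv1D(kernel_size=8)  -> 10
L = pool_len(L)      # MaxPooling1D(2)        -> 5
print(L)  # 5 timesteps entering the LSTM stack
```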