I want to build an end-to-end trainable model with the following characteristics:
(It is more or less like Figure 2 in this paper: https://arxiv.org/pdf/1611.07890.pdf)
My question is: after the reshape, how do I feed the rows of the feature matrix into an LSTM using Keras or TensorFlow?
Here is my current code using the VGG16 network (also linked from the Keras issues):
# VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Reshape

model = Sequential()
# block 1
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 2
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 3
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 4
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 5
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 6 (classifier head)
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
# reshape the 4096-dim feature vector into a 64 x 64 matrix
model.add(Reshape((64, 64)))
# How do I feed each row of this matrix to an LSTM?
# This was my first attempt, but it doesn't look correct:
# model.add(LSTM(256, input_shape=(64, 1)))  # 256 hidden units, sequence length = 64, feature dim = 1
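One way to interpret "feed each row to the LSTM" is to treat the reshaped 64 x 64 matrix as a sequence of 64 timesteps with 64 features each; then the LSTM can simply be stacked after the Reshape layer, with no extra input_shape needed. A minimal sketch of this idea (the LSTM size 256 and the Dense head are illustrative assumptions, and the Input stands in for the Dense(4096) output of the network above):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Reshape, LSTM, Dense

model = Sequential()
model.add(Input(shape=(4096,)))              # stand-in for the 4096-dim Dense output
model.add(Reshape((64, 64)))                 # 64 timesteps x 64 features per step
model.add(LSTM(256))                         # consumes the 64-step sequence row by row
model.add(Dense(10, activation='softmax'))   # illustrative classification head
```

Because Reshape emits a 3D tensor of shape (batch, 64, 64), the LSTM infers its input shape directly from the previous layer.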
Answer 0 (score: 0)
Consider building the CNN part of your model with Conv2D and MaxPooling2D layers only up to the Flatten layer, because the vectorized output of the Flatten layer is what you will feed into the LSTM part of the architecture.
So build your CNN model like this:
model_cnn = Sequential()
model_cnn.add(Conv2D...)
model_cnn.add(MaxPooling2D...)
...
model_cnn.add(Flatten())
Now, here is the interesting part: the current version of Keras is incompatible with some TensorFlow structures, which prevents you from stacking the whole network in a single Sequential object.
So it's time to use the Keras Model (functional) API, with a small trick, to complete your neural network:
input_lay = Input(shape=(None, ?, ?, ?)) #dimensions of your data
time_distribute = TimeDistributed(Lambda(lambda x: model_cnn(x)))(input_lay) # keras.layers.Lambda is essential to make our trick work :)
lstm_lay = LSTM(?)(time_distribute)
output_lay = Dense(?, activation='?')(lstm_lay)
Finally, it's time to put our two separate models together:
model = Model(inputs=[input_lay], outputs=[output_lay])
model.compile(...)
Note: since the vectorized output of the VGG Flatten layer will be the input to the LSTM model, you can replace my model_cnn example with VGG without its top layers.
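For reference, a minimal end-to-end sketch of that note using the built-in VGG16 with include_top=False (the sequence length of 8 frames, the LSTM size 256, and the 10-class softmax head are illustrative assumptions; weights=None avoids downloading pretrained weights). In recent Keras versions, TimeDistributed can wrap a Model directly, so the Lambda wrapper may not even be necessary:

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, TimeDistributed, Flatten, LSTM, Dense

# CNN feature extractor: VGG16 without its top (classifier) layers
cnn = VGG16(include_top=False, weights=None, input_shape=(224, 224, 3))
model_cnn = Sequential([cnn, Flatten()])

# Apply the CNN to every frame of an 8-frame sequence, then run the LSTM
input_lay = Input(shape=(8, 224, 224, 3))          # (frames, height, width, channels)
time_distribute = TimeDistributed(model_cnn)(input_lay)
lstm_lay = LSTM(256)(time_distribute)
output_lay = Dense(10, activation='softmax')(lstm_lay)

model = Model(inputs=input_lay, outputs=output_lay)
```

Each frame passes through VGG16 and is flattened to a 7 * 7 * 512 = 25088-dim vector, so the LSTM sees a sequence of 8 such vectors per sample.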