Question

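I am using a pretrained ResNet-50 model and want to feed the output of its penultimate layer into an LSTM network. Here is my sample code containing only the CNN (ResNet-50):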

N = NUMBER_OF_CLASSES
#img_size = (224,224,3)....same as that of ImageNet    
base_model = ResNet50(include_top=False, weights='imagenet',pooling=None)
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(1024, activation='relu')(x)
model = Model(inputs=base_model.input, outputs=predictions)

Next, I want to feed it into an LSTM network, as follows...

final_model = Sequential()
final_model.add((model))
final_model.add(LSTM(64, return_sequences=True, stateful=True))
final_model.add(Dense(N, activation='softmax'))

But I am confused about how to reshape the output into LSTM input. My original input to the CNN is (224 * 224 * 3). Also, should I use TimeDistributed?

Any help is appreciated.

Answer 1

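Adding an LSTM after a CNN does not make much sense here, because an LSTM is mainly used for temporal/sequential information, whereas your data appears to be purely spatial. If you still want to use it, simply use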

x = Reshape((1024,1))(x)

This converts it into a sequence of 1024 timesteps with 1 feature each.
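
A minimal sketch of what that reshape does to the shapes, assuming `tensorflow.keras`. The standalone `Input` below is only a stand-in for the 1024-dimensional output of the question's `Dense` layer, and the names `features`, `sequence`, and `reshape_demo` are illustrative:

```python
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Reshape

# Stand-in for the (batch, 1024) feature vector produced by the CNN head.
features = Input(shape=(1024,))
# Treat each of the 1024 values as one timestep carrying a single feature.
sequence = Reshape((1024, 1))(features)
# LSTM expects (batch, timesteps, features); by default it returns only
# its final hidden state, one 64-dimensional vector per sample.
encoded = LSTM(64)(sequence)

reshape_demo = Model(inputs=features, outputs=encoded)
print(reshape_demo.output_shape)
```

Note that the LSTM then steps through 1024 timesteps whose order carries no temporal meaning, which is the reason for the caveat above.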

If you are dealing with spatio-temporal data, wrap the ResNet layers in TimeDistributed; you can then use ConvLSTM2D.
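
A rough sketch of that spatio-temporal variant, assuming `tensorflow.keras` and input clips shaped (frames, height, width, channels). `N_FRAMES`, `N_CLASSES`, `IMG_SHAPE`, the filter count, and the pooling/softmax head are all illustrative assumptions, not part of the question:

```python
from tensorflow.keras import Model
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import (ConvLSTM2D, Dense,
                                     GlobalAveragePooling2D, Input,
                                     TimeDistributed)

N_FRAMES, N_CLASSES = 4, 10   # hypothetical values
IMG_SHAPE = (64, 64, 3)       # kept small here; the question uses (224, 224, 3)

# weights=None avoids the download in this sketch; pass weights='imagenet'
# to get the pretrained network. pooling=None keeps the spatial feature maps.
cnn = ResNet50(include_top=False, weights=None, pooling=None,
               input_shape=IMG_SHAPE)

clips = Input(shape=(N_FRAMES,) + IMG_SHAPE)
# Apply the same CNN to every frame: (batch, frames, h, w, 2048).
feature_maps = TimeDistributed(cnn)(clips)
# ConvLSTM2D consumes the 5D sequence of feature maps and, by default,
# returns the final state as a 4D tensor: (batch, h, w, 32).
x = ConvLSTM2D(32, kernel_size=3, padding='same')(feature_maps)
x = GlobalAveragePooling2D()(x)
outputs = Dense(N_CLASSES, activation='softmax')(x)

convlstm_model = Model(inputs=clips, outputs=outputs)
```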

Answer 2

An example of using a pretrained network with an LSTM:

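from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Flatten, Input, LSTM, TimeDistributed

# `config` is the answerer's own settings object (frame count, image size, channels).
inputs = Input(shape=(config.N_FRAMES_IN_SEQUENCE, config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
cnn = VGG16(include_top=False, weights='imagenet', input_shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
x = TimeDistributed(cnn)(inputs)            # run the CNN on every frame
x = TimeDistributed(Flatten())(x)           # one feature vector per frame
x = LSTM(256)(x)                            # summarize the frame sequence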
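
The same pattern could be applied to the question's ResNet-50 roughly as follows. This is a sketch under assumptions: the input is a clip of `N_FRAMES` images, one label is predicted per clip, and `N_FRAMES`, `N_CLASSES`, `IMG_SHAPE`, and `video_model` are made-up names and values:

```python
from tensorflow.keras import Model
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Input, LSTM, TimeDistributed

N_FRAMES, N_CLASSES = 4, 10   # hypothetical values
IMG_SHAPE = (64, 64, 3)       # kept small here; the question uses (224, 224, 3)

# pooling='avg' applies global average pooling inside the base model, so each
# frame becomes a single 2048-dimensional vector and no Flatten is needed.
# weights=None avoids the download in this sketch; use weights='imagenet'
# for the pretrained network.
cnn = ResNet50(include_top=False, weights=None, pooling='avg',
               input_shape=IMG_SHAPE)

clips = Input(shape=(N_FRAMES,) + IMG_SHAPE)
frame_features = TimeDistributed(cnn)(clips)   # (batch, frames, 2048)
x = LSTM(64)(frame_features)                   # (batch, 64)
outputs = Dense(N_CLASSES, activation='softmax')(x)

video_model = Model(inputs=clips, outputs=outputs)
```

Unlike the question's code, this leaves `return_sequences` and `stateful` at their defaults: `return_sequences=True` would produce one prediction per frame instead of one per clip, and `stateful=True` requires a fixed batch size to be declared on the input.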

Combining a CNN with an LSTM using TensorFlow Keras
