Question

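I want to add attention to a trained image-classification CNN model. For example, there are 30 classes, and with the Keras CNN I obtain a predicted class for each image. However, I also want to visualize the important features/locations behind each prediction, so I want to add soft attention after the FC layer. I tried reading "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" to get similar results, but I could not understand how the authors implemented it, because my problem is not an image-captioning or text seq2seq problem.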

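I have an image-classification CNN and would like to extract its features and feed them into an LSTM in order to visualize soft attention. However, I get stuck every time.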

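Steps I have taken:

Load the CNN model (I have already trained the CNN for prediction)
Extract features from a single image (however, the LSTM will examine the same image with patches removed)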

I am stuck after the following steps:

Create an LSTM with soft attention
Get a single output

I am using Keras with the TensorFlow backend. The CNN features are extracted with ResNet50. The images are 224x224, and the output shape of the FC layer is 2048 units.

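#Extract CNN features:

from keras.models import Model, load_model

# weight_file, custom_mae and train_gen are defined elsewhere in my code.
base_model = load_model(weight_file, custom_objects={'custom_mae': custom_mae})
# Despite the variable name, this is the global-average-pooling layer,
# so its output is one pooled feature vector per image, not a spatial feature map.
last_conv_layer = base_model.get_layer("global_average_pooling2d_3")
# Keras 2 expects the keyword arguments `inputs` and `outputs`.
cnn_model = Model(inputs=base_model.input, outputs=last_conv_layer.output)
cnn_model.trainable = False
bottleneck_features_train_v2 = cnn_model.predict(train_gen.images)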


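#Create LSTM:

from keras.layers import (Input, TimeDistributed, LSTM, Dropout, Dense,
                          Flatten, Activation, RepeatVector, Permute)

# A sequence of length 1: the same 224x224 RGB image as a single time step.
seq_input = Input(shape=(1, 224, 224, 3))
encoded_frame = TimeDistributed(cnn_model)(seq_input)
encoded_vid = LSTM(2048)(encoded_frame)
lstm = Dropout(0.5)(encoded_vid)

#Add soft attention

attention = Dense(1, activation='tanh')(lstm)
attention = Flatten()(attention)
attention = Activation('softmax')(attention)
# NOTE: `units` is not defined anywhere in the snippet as posted.
attention = RepeatVector(units)(attention)
attention = Permute([2, 1])(attention)


#output 101 classes
predictions = Dense(101, activation='softmax', name='pred_age')(attention)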
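
One thing that can be checked without running Keras is what the attention block above computes. Assuming `lstm` has shape (batch, 2048), `Dense(1)` produces a single score per image, and a softmax over a single score is always 1.0, so there is nothing for the attention to weight. A minimal plain-Python sketch of that observation (the `softmax` helper and the sample scores are illustrative, not part of Keras or of my model):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One score per image, as Dense(1) on a (batch, 2048) tensor would give.
single = softmax([math.tanh(0.37)])

# Several scores per image, e.g. one per image region (made-up values).
several = softmax([math.tanh(x) for x in (0.37, -1.2, 2.0)])

print(single)   # [1.0]
print(several)  # three weights that sum to 1
```

This suggests the softmax needs to run over several positions (time steps or image regions) before the weights can say anything about where the model is looking.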

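I want to feed the image features from the last FC layer into an LSTM, add soft attention so that the attention weights are trained, obtain a single class as output, and visualize the soft attention to see where the system is looking when it makes a prediction (similar to the soft-attention visualizations in the paper).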
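
For reference, my understanding is that the soft attention in the paper is computed over a grid of spatial feature vectors (one per image region) rather than over a single pooled vector; for a standard ResNet50 with 224x224 input that grid would be 7x7 with 2048 channels. Below is a sketch of that computation in plain Python with toy sizes. The names `soft_attention` and `w`, the 2x2 grid, and all values are illustrative assumptions, and scoring each region with a single vector is a simplification of the paper's scoring network, which also conditions on the LSTM state.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(regions, w):
    """Return (alphas, context) for a list of region feature vectors.

    regions: L vectors of length D (a flattened H x W feature grid).
    w: hypothetical scoring vector of length D (learned in a real model).
    """
    # One scalar score per region.
    scores = [math.tanh(sum(f * wi for f, wi in zip(region, w)))
              for region in regions]
    # Normalize the scores across regions so the weights sum to 1.
    alphas = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    context = [sum(a * region[d] for a, region in zip(alphas, regions))
               for d in range(len(w))]
    return alphas, context

# Toy 2x2 grid of 3-dimensional features (made-up values).
grid_h, grid_w = 2, 2
regions = [[0.1, 0.0, 0.2], [0.9, 0.8, 0.7], [0.0, 0.1, 0.0], [0.3, 0.2, 0.4]]
w = [1.0, 1.0, 1.0]

alphas, context = soft_attention(regions, w)

# Reshape the weights back to the grid; upsampled to the image size,
# this is the map that would be overlaid on the image as a heatmap.
attention_map = [alphas[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]
for row in attention_map:
    print([round(a, 3) for a in row])
```

In this form the context vector, not the attention weights themselves, would go to the classifier, and the reshaped weights are what gets visualized.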

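Since I am new to attention mechanisms, I did a lot of research but could not find a solution to my problem. I would like to know whether I am doing this right.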

Adding attention to classification in Keras
