Special function on feature maps from a convolutional layer

Date: 2019-01-25 14:02:18

Tags: python tensorflow keras deep-learning conv-neural-network

In short:

How can I pass feature maps from a convolutional layer defined in Keras to a special function (a region proposer), and then pass its output on to other Keras layers (e.g. a softmax classifier)?

In long:

I am trying to implement something similar to Fast R-CNN (or Faster R-CNN) in Keras. The reason is that I am trying to implement a custom architecture, shown in the image below:

from "TextMaps" by Tom Gogar

Here is the code for the diagram above (excluding the candidates input):

from keras.layers import Input, Dense, Conv2D, ZeroPadding2D, MaxPooling2D, BatchNormalization, concatenate
from keras.activations import relu, sigmoid, linear
from keras.initializers import RandomUniform, Constant, TruncatedNormal, RandomNormal, Zeros

#  Network 1, Layer 1
screenshot = Input(shape=(1280, 1280, 3),
                   dtype='float32',
                   name='screenshot')
conv1 = Conv2D(filters=96,
               kernel_size=11,
               strides=(4, 4),
               activation=relu,
               padding='same')(screenshot)
pooling1 = MaxPooling2D(pool_size=(3, 3),
                        strides=(2, 2),
                        padding='same')(conv1)
normalized1 = BatchNormalization()(pooling1)  # https://stats.stackexchange.com/questions/145768/importance-of-local-response-normalization-in-cnn

# Network 1, Layer 2

conv2 = Conv2D(filters=256,
               kernel_size=5,
               activation=relu,
               padding='same')(normalized1)
normalized2 = BatchNormalization()(conv2)
conv3 = Conv2D(filters=384,
               kernel_size=3,
               activation=relu,
               padding='same',
               kernel_initializer=RandomNormal(stddev=0.01),
               bias_initializer=Constant(value=0.1))(normalized2)

# Network 2, Layer 1

textmaps = Input(shape=(160, 160, 128),
                 dtype='float32',
                 name='textmaps')
txt_conv1 = Conv2D(filters=48,
                   kernel_size=1,
                   activation=relu,
                   padding='same',
                   kernel_initializer=RandomNormal(stddev=0.01),
                   bias_initializer=Constant(value=0.1))(textmaps)

# (Network 1 + Network 2), Layer 1

merged = concatenate([conv3, txt_conv1], axis=-1)
merged_padding = ZeroPadding2D(padding=2, data_format=None)(merged)
merged_conv = Conv2D(filters=96,
                     kernel_size=5,
                     activation=relu, padding='same',
                     kernel_initializer=RandomNormal(stddev=0.01),
                     bias_initializer=Constant(value=0.1))(merged_padding)
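As a sanity check on the shapes above: with 'same' padding, conv1 (stride 4) takes 1280 down to 320 and pooling1 (stride 2) takes it to 160, so conv3 comes out 160x160 and matches the textmaps input; ZeroPadding2D(padding=2) then makes the merged tensor 164x164. A quick arithmetic sketch (out_size is a hypothetical helper, not part of Keras):

```python
def out_size(size, stride):
    # 'same' padding: output spatial size = ceil(input / stride)
    return -(-size // stride)

s = out_size(1280, 4)  # conv1, stride 4 -> 320
s = out_size(s, 2)     # pooling1, stride 2 -> 160
print(s)               # 160, matches the 160x160 textmaps Input
print(s + 2 * 2)       # 164 after ZeroPadding2D(padding=2)
```

This is where the (164, 164, 96) shape that appears later in the question comes from.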

As mentioned above, the last step of the network I want to build is ROI pooling, as done in Fast R-CNN:

from the main publication of Fast R-CNN on arXiv

Now, there is code for an ROI pooling layer in Keras, but I need to pass region proposals to that layer. As you probably know, region proposals are usually produced by an algorithm called selective search, which is already implemented in Python.
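One common alternative, and the one Fast R-CNN itself uses, is to run selective search on the original image and then project the proposals onto the feature map by dividing coordinates by the network's total stride; with the architecture above the stride is 8 (conv1 stride 4 times pooling1 stride 2, so 1280 maps to 160). A minimal sketch under that assumption (project_rois is a hypothetical helper, and the (x, y, w, h) rect convention is what selectivesearch returns):

```python
def project_rois(rois, stride=8):
    # Project image-space ROIs (x, y, w, h) onto feature-map coordinates
    # by the network's total downsampling stride, as Fast R-CNN does.
    return [(x // stride, y // stride, max(1, w // stride), max(1, h // stride))
            for (x, y, w, h) in rois]

print(project_rois([(64, 128, 256, 80)]))  # [(8, 16, 32, 10)]
```

This sidesteps running selective search on the activations entirely: proposals come from the RGB image, where the algorithm works as intended.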


The problem:

Selective search can easily take an ordinary image and give us region proposals like these:

from the selective search GitHub page

Now the problem is that I should pass in the feature maps from the layer merged_conv, not an image, as in the code above:

merged_conv = Conv2D(filters=96,
                     kernel_size=5,
                     activation=relu, padding='same',
                     kernel_initializer=RandomNormal(stddev=0.01),
                     bias_initializer=Constant(value=0.1))(merged_padding)

The layer above is just a symbolic tensor of a given shape, so obviously it cannot be used with selectivesearch directly:

>>> import selectivesearch
>>> selectivesearch.selective_search(merged_conv, scale=500, sigma=0.9, min_size=10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/somepath/selectivesearch.py", line 262, in selective_search
    assert im_orig.shape[2] == 3, "3ch image is expected"
AssertionError: 3ch image is expected
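One way to get past the 3-channel assertion, if you do want to run selective search directly on the activations, is to collapse the 96 channels into a 3-channel pseudo-image first. A minimal sketch, assuming simple channel-group averaging is an acceptable reduction (to_pseudo_rgb is a hypothetical helper, not part of selectivesearch):

```python
import numpy as np

def to_pseudo_rgb(feature_map):
    # Average the C channels of an (H, W, C) activation into 3 equal
    # groups, then min-max scale to uint8 so the result satisfies
    # selectivesearch's "3ch image is expected" assertion.
    groups = np.array_split(np.arange(feature_map.shape[-1]), 3)
    img = np.stack([feature_map[..., g].mean(axis=-1) for g in groups], axis=-1)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)
    return (img * 255).astype(np.uint8)

fm = np.random.rand(164, 164, 96).astype(np.float32)
print(to_pseudo_rgb(fm).shape)  # (164, 164, 3)
```

Whether proposals computed on such a pseudo-image are meaningful is a separate question; this only addresses the shape check.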

I think I should do something like this:

from keras import Model
import numpy as np
import cv2
import selectivesearch
img = cv2.imread('someimage.jpg')
img = img.reshape(-1, 1280, 1280, 3)  # assumes a 1280x1280 RGB screenshot
textmaps_data = np.ones((1, 160, 160, 128))  # just for example; matches the textmaps Input
model = Model(inputs=[screenshot, textmaps], outputs=merged_conv)
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
feature_maps = np.transpose(model.predict([img, textmaps_data])[0], (2, 0, 1))  # (96, 164, 164)
feature_map_1 = feature_maps[0]  # a single 164x164 activation map
img_lbl, regions = selectivesearch.selective_search(feature_map_1, scale=500, sigma=0.9, min_size=10)

But what if I then want to add, say, a softmax classifier that consumes the regions variable? (By the way, I know selective search has problems with anything other than 3-channel input, but that is beside the point.)

Questions:

Region proposal (via selective search) is an integral part of the network. How can I modify it so that it takes the feature maps (activations) from the convolutional layer merged_conv?

Should I perhaps create my own Keras layer?

1 Answer:

Answer 0 (score: 1)

As far as I know, selective search accepts an input and returns n patches of different sizes (H, W). So in your case, where the feature map has dims (164, 164, 96), you can take a single (164, 164) slice as the input to selective search, and it will give you n patches of sizes (H1, W1), (H2, W2), .... You can then attach all the channels back onto each patch as-is, so that they become (H1, W1, 96), (H2, W2, 96), ....
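The "attach all the channels back" step above can be sketched with plain slicing, assuming the (x, y, w, h) rect convention that selectivesearch uses (crop_regions is a hypothetical helper):

```python
import numpy as np

def crop_regions(feature_maps, rects):
    # Cut full-depth patches out of an (H, W, C) feature map given
    # selectivesearch-style rects (x, y, w, h); each patch keeps all C channels.
    return [feature_maps[y:y + h, x:x + w, :] for (x, y, w, h) in rects]

fm = np.zeros((164, 164, 96), dtype=np.float32)
patches = crop_regions(fm, [(0, 0, 16, 16), (10, 20, 32, 8)])
print([p.shape for p in patches])  # [(16, 16, 96), (8, 32, 96)]
```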

Note: there is a downside to doing this, though. The strategy selective search uses is to split the image into a grid and then re-join patches based on an object heat map. You will not be able to do that on feature maps in the same way. But you could use a random search method on top of them instead, which can be useful.
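However the proposals are obtained, the variable-size patches still have to be brought to a fixed shape before a dense/softmax head can consume them. A minimal numpy sketch of fixed-grid max pooling in the spirit of Fast R-CNN's RoI pooling (roi_max_pool and the 7x7 output grid are illustrative choices, not from the question):

```python
import numpy as np

def roi_max_pool(patch, out_h=7, out_w=7):
    # Max-pool a variable-size (h, w, c) patch into a fixed
    # (out_h, out_w, c) grid by splitting it into roughly equal cells.
    h, w, c = patch.shape
    ys = np.linspace(0, h, out_h + 1).astype(int)
    xs = np.linspace(0, w, out_w + 1).astype(int)
    out = np.zeros((out_h, out_w, c), dtype=patch.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cell = patch[ys[i]:max(ys[i] + 1, ys[i + 1]),
                         xs[j]:max(xs[j] + 1, xs[j + 1]), :]
            out[i, j] = cell.max(axis=(0, 1))
    return out

patch = np.random.rand(23, 15, 96).astype(np.float32)
print(roi_max_pool(patch).shape)  # (7, 7, 96)
```

Each pooled patch then has the same shape regardless of the proposal's size, so a shared classifier head can be stacked on top.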