Question

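When building a simple CNN with Keras, like the code below, and applying it to a text-based problem such as document classification, I understand that it is as if we extract 4-grams (kernel_size of 4) from the text and use them as features.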

    model = Sequential()
    model.add(embedding_layer)
    model.add(Conv1D(filters=100, kernel_size=4, padding='same', activation='relu'))
    model.add(MaxPooling1D(pool_size=4))      
    model.add(Dense(4, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

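In this case, the kernel size in the Conv1D layer acts like a sliding window of size 4 that walks over the sequence of tokens in the text to emit 4-grams.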
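As a rough illustration of that sliding-window view, here is a minimal pure-Python sketch. The helper name `ngrams` is made up for this example, and it does no padding, so it yields fewer windows than a layer with padding='same' would visit.

```python
def ngrams(tokens, n):
    """Return every window of n consecutive tokens, left to right (no padding)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["a", "b", "c", "d", "e", "f"]
four_grams = ngrams(tokens, 4)
print(four_grams)
```

The convolution does not emit the tokens themselves, of course; each filter computes a weighted sum over the embeddings inside each such window.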

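I would like to know whether there is a way to create non-consecutive sliding windows in the convolution, i.e., windows that would produce the "skip-gram" equivalent. So, for example, given the following 1D vector: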

[a, b, c, d, e, f]

a conv1d with kernel_size=3 and skip=1 would scan the following sequences:

[(a,c,d),(b,d,e),(c,e,f),(d,f,padding),(e,padding,padding)] union [(a,b,d),(b,c,e),(c,d,f),(d,e,padding),(e,f,padding),(f,padding,padding)]

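I say 'union' simply because I suppose that, from an implementation point of view, it may be easier to generate either part 1 or part 2 by giving the revised conv1d layer another parameter. If that is the case, I can work around it by concatenating multiple layers. But the minimum really is an extended conv1d layer that takes additional parameters so that it performs either the first or the second part of the scan.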
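To make the two parts of the scan concrete, here is a small pure-Python sketch. It assumes each part can be described by a 0/1 mask over a window of 4 positions, where 0 marks the skipped position; the helper `masked_windows` and the padding token are illustrative only. It emits one window per start position, so part 1 comes out with one more trailing all-padding window than the list above shows.

```python
def masked_windows(tokens, mask, pad="padding"):
    """Slide a window of len(mask) over tokens, keeping only positions where mask is 1."""
    padded = list(tokens) + [pad] * (len(mask) - 1)  # right-pad so every token starts a window
    return [
        tuple(padded[start + offset] for offset, keep in enumerate(mask) if keep)
        for start in range(len(tokens))
    ]


tokens = ["a", "b", "c", "d", "e", "f"]
part1 = masked_windows(tokens, [1, 0, 1, 1])  # skip the second position
part2 = masked_windows(tokens, [1, 1, 0, 1])  # skip the third position
print(part1)
print(part2)
```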

The idea is not new, as this paper has already experimented with it: http://www.aclweb.org/anthology/D/D16/D16-1085.pdf

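But please excuse my lack of in-depth knowledge of Keras; I do not know how to implement it. Any suggestions would be welcome.

Many thanks in advance.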

Answer 1

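You can create a custom convolutional layer in which certain elements of the weight matrix are zero.

You can take the regular Conv1D layer as the base class.

But before doing this, note that you can create a "dilated" convolution by passing dilation_rate=n when creating a regular convolutional layer. This skips n-1 grams between each gram taken in the window, so the window has fixed, regular gaps.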
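The following pure-Python sketch shows which positions a dilated window touches. The helpers are hypothetical and only mimic the index arithmetic (without padding); they do not call Keras.

```python
def dilated_offsets(kernel_size, dilation_rate):
    """Offsets, relative to the window start, that a dilated kernel reads."""
    return [k * dilation_rate for k in range(kernel_size)]


def dilated_windows(tokens, kernel_size, dilation_rate):
    """Enumerate the token tuples seen by a dilated window (no padding)."""
    offsets = dilated_offsets(kernel_size, dilation_rate)
    span = offsets[-1] + 1  # total stretch of tokens covered by one window
    return [
        tuple(tokens[start + o] for o in offsets)
        for start in range(len(tokens) - span + 1)
    ]


tokens = ["a", "b", "c", "d", "e", "f"]
windows = dilated_windows(tokens, kernel_size=3, dilation_rate=2)
print(windows)
```

Because every gap has the same size, dilation alone cannot express an irregular pattern such as (a, c, d), which is why a masked kernel is still needed for the skip-grams in the question.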

Creating a custom layer for that:

    import keras.backend as K
    from keras.layers import Conv1D


    # a 1D convolution that skips some entries
    class SkipConv1D(Conv1D):

        # in the init, let's just add a parameter to tell which grams to skip
        def __init__(self, validGrams, **kwargs):

            # for this example, I'm assuming validGrams is a list
            # it should contain zeros and ones, where 0's go on the skip positions
            # example: [1,1,0,1] will skip the third gram in the window of 4 grams
            assert len(validGrams) == kwargs.get('kernel_size')

            # the chosen shape matches the dimensions of the kernel:
            # the first dimension is the kernel size, the others are input and output channels
            self.validGrams = K.reshape(K.constant(validGrams), (len(validGrams), 1, 1))

            # initialize the regular conv layer:
            super(SkipConv1D, self).__init__(**kwargs)

            # here, the filters, size, etc. go inside kwargs, so you should pass them by name
            # but you may make them explicit in this __init__ definition
            # if you find that more comfortable to use

        # in the build method, let's replace the original kernel:
        def build(self, input_shape):

            # build as the original layer:
            super(SkipConv1D, self).build(input_shape)

            # replace the kernel with its masked version
            self.originalKernel = self.kernel
            self.kernel = self.validGrams * self.originalKernel
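To see why zeroing kernel rows amounts to skipping grams, here is a NumPy-only sketch that does not use Keras. The function `conv1d_valid` is a hand-written stand-in for an unpadded 1D convolution without bias or activation, and the shapes are arbitrary choices for the demonstration.

```python
import numpy as np


def conv1d_valid(x, kernel):
    """x: (steps, in_channels); kernel: (size, in_channels, filters) -> (steps - size + 1, filters)."""
    size = kernel.shape[0]
    return np.stack([
        np.tensordot(x[i:i + size], kernel, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0] - size + 1)
    ])


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 2))          # 6 tokens, embedding size 2
kernel = rng.normal(size=(4, 2, 3))  # window of 4, 3 filters

# Masked kernel: same idea as validGrams=[1,1,0,1] in the layer above.
mask = np.array([1, 1, 0, 1], dtype=float).reshape(4, 1, 1)
masked_out = conv1d_valid(x, mask * kernel)

# Reference: physically drop the third position from both window and kernel.
keep = [0, 1, 3]
dropped_out = np.stack([
    np.tensordot(x[i:i + 4][keep], kernel[keep], axes=([0, 1], [0, 1]))
    for i in range(x.shape[0] - 4 + 1)
])

print(np.allclose(masked_out, dropped_out))
```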

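Some things that are not taken care of in this answer:

The method get_weights() will still return the original kernel, not the kernel with the skip mask applied. (It's possible to fix this, but it takes extra work; if you need it, please tell me.)

There are unused weights in this layer. This is a simple implementation. The focus here was to keep it as similar as possible to the existing Conv layer, with all its features. It's also possible to use only the strictly necessary weights, but that would greatly increase the complexity and require rewriting a lot of the original Keras code to recreate all the original possibilities.

If your kernel_size is large, defining the validGrams variable by hand will be tedious. You may want to create a version that takes a few skipped indices and converts them into the kind of list used above.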
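Two of the notes above can be sketched with small helpers. Both names are hypothetical and not part of Keras: `valid_grams_from_skips` builds the 0/1 list from skipped indices, and `effective_kernel` applies the mask to the list returned by get_weights(), assuming the kernel is its first entry with shape (kernel_size, input channels, filters).

```python
import numpy as np


def valid_grams_from_skips(kernel_size, skipped):
    """Build a validGrams list: 0 at each skipped index, 1 everywhere else."""
    skipped = set(skipped)
    out_of_range = [i for i in skipped if not 0 <= i < kernel_size]
    if out_of_range:
        raise ValueError("skipped indices out of range: %s" % sorted(out_of_range))
    return [0 if i in skipped else 1 for i in range(kernel_size)]


def effective_kernel(weights, valid_grams):
    """Return the masked kernel given the list from get_weights() (kernel first)."""
    kernel = np.asarray(weights[0])
    mask = np.asarray(valid_grams, dtype=kernel.dtype).reshape(-1, 1, 1)
    return mask * kernel


valid_grams = valid_grams_from_skips(4, [2])
print(valid_grams)

# Stand-in for layer.get_weights(): a kernel of ones plus a bias vector.
fake_weights = [np.ones((4, 2, 3)), np.zeros(3)]
print(effective_kernel(fake_weights, valid_grams)[:, 0, 0])
```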

Different channels skipping different grams:

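It's also possible to do this inside a single layer if, instead of a validGrams with shape (length,), you use one with shape (length,outputFilters).

In this case, at the point where we create the validGrams matrix, we should reshape it like this (this needs import numpy as np):

    validGrams = np.asarray(validGrams)
    shp = (validGrams.shape[0],1,validGrams.shape[1])
    validGrams = validGrams.reshape(shp)
    self.validGrams = K.constant(validGrams)

You can also simply use many parallel SkipConv1D layers with different parameters and then concatenate their results.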
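The per-filter mask and the parallel-layer alternative should yield the same features when the weights match. The NumPy sketch below checks that under simplifying assumptions: no padding, bias, or activation, and a hand-written `conv1d_valid` standing in for the Keras layer.

```python
import numpy as np


def conv1d_valid(x, kernel):
    """x: (steps, in_channels); kernel: (size, in_channels, filters) -> (steps - size + 1, filters)."""
    size = kernel.shape[0]
    return np.stack([
        np.tensordot(x[i:i + size], kernel, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0] - size + 1)
    ])


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))          # 8 tokens, embedding size 5
kernel = rng.normal(size=(4, 5, 6))  # window of 4, 6 filters
mask_a, mask_b = [1, 0, 1, 1], [1, 1, 0, 1]

# One layer, per-filter masks: shape (length, outputFilters) reshaped to (length, 1, outputFilters).
valid_grams = np.array([mask_a] * 3 + [mask_b] * 3, dtype=float).T
shp = (valid_grams.shape[0], 1, valid_grams.shape[1])
per_filter = conv1d_valid(x, valid_grams.reshape(shp) * kernel)

# Two parallel branches with 3 filters each, concatenated on the channel axis.
branch_a = conv1d_valid(x, np.array(mask_a, dtype=float).reshape(4, 1, 1) * kernel[:, :, :3])
branch_b = conv1d_valid(x, np.array(mask_b, dtype=float).reshape(4, 1, 1) * kernel[:, :, 3:])
parallel = np.concatenate([branch_a, branch_b], axis=-1)

print(np.allclose(per_filter, parallel))
```

The single-layer version keeps everything in one kernel, while the parallel version is easier to read and lets each branch have its own number of filters.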
