I want to add attention to a CNN. The attention is (N, 1), where N is the batch size. I want to change it to (1, N) and then apply softmax. In PyTorch this can be done with "transpose", but when I use "Permute" in Keras I get this error:
Input 0 is incompatible with layer flatten_2: expected min_ndim=3, found ndim=2
My code is here:
class AttentionModel:
    def __init__(self):
        self.L = 500
        self.D = 128
        self.K = 1

        inputs = Input(shape=(28, 28, 1))
        result1 = self.feature_extractor_part1(inputs)
        result2 = self.feature_extractor_part2(result1)  # (N,500)
        attention = self.attention(result2)  # (N,1)
        attention = Permute(dims=(2, 1))(attention)  # (1,N) !!PROBLEM!!
        attention = Flatten()(attention)
        attention = Activation('softmax')(attention)  # (1,N)
        M = Dot()(attention, result2)  # (K,L)
        final_result = self.classifer(M)
        self.model = Model(inputs=inputs, outputs=final_result)
        self.model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    def feature_extractor_part1(self, inputs):
        conv1 = Conv2D(20, kernel_size=5, activation='relu')(inputs)
        pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
        conv2 = Conv2D(50, kernel_size=5, activation='relu')(pool1)
        pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
        return pool2

    def feature_extractor_part2(self, inputs):
        flat = Flatten()(inputs)
        dense = Dense(self.L, activation='relu')(flat)
        return dense

    def attention(self, inputs):
        flat1 = Dense(self.D, activation='tanh')(inputs)
        flat2 = Dense(self.K)(flat1)
        return flat2

    def classifer(self, inputs):
        result = Dense(1, activation='sigmoid')(inputs)
        return result
Answer 0 (score: 0)
Use the Keras backend transpose function and wrap it in a Lambda layer, like this:
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model
seq = Input(shape=(1,))
mypermute = lambda x: K.transpose(x)
b = Lambda(mypermute)(seq)
model = Model(inputs=seq, outputs=b)
print(model.summary())
Output:
> Layer (type)                 Output Shape              Param #
> =================================================================
> input_1 (InputLayer)         (None, 1)                 0
> _________________________________________________________________
> lambda_1 (Lambda)            (1, None)                 0
> =================================================================
> Total params: 0
> Trainable params: 0
> Non-trainable params: 0
> _________________________________________________________________
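For reference, here is a minimal sketch of how this Lambda-based transpose could be wired into the attention branch from the question. It is an illustration under assumptions, not part of the original answer: it presumes attention has shape (N, 1) and result2 has shape (N, 500), with the N instances sitting on the batch axis, and it uses K.dot inside a second Lambda for the weighted sum.

from keras import backend as K
from keras.layers import Lambda, Activation

# Hypothetical wiring (not from the original answer):
# attention: (N, 1) scores, result2: (N, 500) features, N = instances on the batch axis
attention_t = Lambda(lambda x: K.transpose(x))(attention)        # (1, N)
attention_w = Activation('softmax')(attention_t)                 # softmax over the N instances
M = Lambda(lambda t: K.dot(t[0], t[1]))([attention_w, result2])  # (1, 500) weighted sum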
Answer 1 (score: 0)
The problem with your attention=Permute(dims=(2,1))(attention) line is that it simply ignores the batch dim, so it will output something like (batch_size, N), which is clearly wrong. If you change it to attention=Permute(dims=(0,2,1))(attention), it will work; given your requirements, the output shape will be (batch_size, 1, N).
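As a hedged illustration of the (batch_size, 1, N) layout this answer describes, here is a small standalone sketch. It assumes the N instances are carried on an in-sample axis (shape (batch, N, 1)) rather than as the batch itself; note that in the Keras versions I am familiar with, Permute's dims exclude the batch axis and start at 1, so the equivalent call is written as Permute((2, 1)).

from keras.layers import Input, Permute, Activation
from keras.models import Model

# Standalone sketch under the assumptions stated above (not the asker's exact setup):
# scores: (batch, N, 1) attention scores with the N instances on axis 1
scores = Input(shape=(None, 1))            # (batch, N, 1), N may vary
swapped = Permute((2, 1))(scores)          # (batch, 1, N)
weights = Activation('softmax')(swapped)   # softmax over the N instances
demo = Model(inputs=scores, outputs=weights)
demo.summary()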