I want to add attention to a CNN. The attention is (N, 1), where N is the batch size. I want to change it to (1, N) and then apply softmax. In PyTorch this can be done with "transpose", but when I use "Permute" in Keras I get this error:
Input 0 is incompatible with layer flatten_2: expected min_ndim=3, found ndim=2
My code is here:
class AttentionModel:
    def __init__(self):
        self.L = 500
        self.D = 128
        self.K = 1

        inputs = Input(shape=(28, 28, 1))
        result1 = self.feature_extractor_part1(inputs)
        result2 = self.feature_extractor_part2(result1)  # (N,500)
        attention = self.attention(result2)  # (N,1)
        attention = Permute(dims=(2, 1))(attention)  # (1,N) !!PROBLEM!!
        attention = Flatten()(attention)
        attention = Activation('softmax')(attention)  # (1,N)
        M = Dot()(attention, result2)  # (K,L)
        final_result = self.classifer(M)
        self.model = Model(inputs=inputs, outputs=final_result)
        self.model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    def feature_extractor_part1(self, inputs):
        conv1 = Conv2D(20, kernel_size=5, activation='relu')(inputs)
        pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
        conv2 = Conv2D(50, kernel_size=5, activation='relu')(pool1)
        pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
        return pool2

    def feature_extractor_part2(self, inputs):
        flat = Flatten()(inputs)
        dense = Dense(self.L, activation='relu')(flat)
        return dense

    def attention(self, inputs):
        flat1 = Dense(self.D, activation='tanh')(inputs)
        flat2 = Dense(self.K)(flat1)
        return flat2

    def classifer(self, inputs):
        result = Dense(1, activation='sigmoid')(inputs)
        return result
Answer 0 (score: 0)
Use the Keras backend transpose function and wrap it in a Lambda layer, like this:
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model
seq = Input(shape=(1,))
mypermute = lambda x: K.transpose(x)
b = Lambda(mypermute)(seq)
model = Model(inputs=seq, outputs=b)
print(model.summary())
Output:
> Layer (type)                 Output Shape              Param #
> =================================================================
> input_1 (InputLayer)         (None, 1)                 0
> _________________________________________________________________
> lambda_1 (Lambda)            (1, None)                 0
> =================================================================
> Total params: 0
> Trainable params: 0
> Non-trainable params: 0
> _________________________________________________________________
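For reference, here is a minimal sketch of how this Lambda-based transpose could be wired into the attention branch from the question. It is an illustration under assumptions, not part of the original answer: it presumes attention has shape (N, 1) and result2 has shape (N, 500), with the N instances sitting on the batch axis, and it uses K.dot inside a second Lambda for the weighted sum.

from keras import backend as K
from keras.layers import Lambda, Activation

# Hypothetical wiring (not from the original answer):
# attention: (N, 1) scores, result2: (N, 500) features, N = instances on the batch axis
attention_t = Lambda(lambda x: K.transpose(x))(attention)        # (1, N)
attention_w = Activation('softmax')(attention_t)                 # softmax over the N instances
M = Lambda(lambda t: K.dot(t[0], t[1]))([attention_w, result2])  # (1, 500) weighted sum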
Answer 1 (score: 0)
The problem with your attention=Permute(dims=(2,1))(attention) line is that it simply ignores the batch dim, so it will output something like (batch_size, N), which is clearly wrong. If you change it to attention=Permute(dims=(0,2,1))(attention), it will work; given your requirements, the output shape will be (batch_size, 1, N).
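As a hedged illustration of the (batch_size, 1, N) layout this answer describes, here is a small standalone sketch. It assumes the N instances are carried on an in-sample axis (shape (batch, N, 1)) rather than as the batch itself; note that in the Keras versions I am familiar with, Permute's dims exclude the batch axis and start at 1, so the equivalent call is written as Permute((2, 1)).

from keras.layers import Input, Permute, Activation
from keras.models import Model

# Standalone sketch under the assumptions stated above (not the asker's exact setup):
# scores: (batch, N, 1) attention scores with the N instances on axis 1
scores = Input(shape=(None, 1))            # (batch, N, 1), N may vary
swapped = Permute((2, 1))(scores)          # (batch, 1, N)
weights = Activation('softmax')(swapped)   # softmax over the N instances
demo = Model(inputs=scores, outputs=weights)
demo.summary()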