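I am trying to use transfer learning in Keras. I have already trained a model for a different task, and now I want to reuse it for a similar task whose input and output shapes differ.
I loaded the trained model with load_model. My original model is: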
model = Sequential()
model.add(Conv2D(32, (5,5), input_shape=(28,28,1), padding='same', activation='relu'))
model.add(Conv2D(32, (5,5), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPool2D(padding='same', strides=2))
model.add(Conv2D(128, (5, 5), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPool2D(padding='same', strides=2))
model.add(Conv2D(64, (4,4), padding='same', activation='relu'))
model.add(Conv2D(64, (4,4), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPool2D(padding='same', strides=2))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(26, activation='softmax'))
rmsdrp = optimizers.RMSprop(lr=0.001, epsilon=1e-08)
model.compile(
    loss="categorical_crossentropy",
    optimizer=rmsdrp,
    metrics=['accuracy'],
)
Then, for the output, I did the following:
model.pop()
model.add(Dense(3*168,activation='softmax'))
model.add(Reshape((3,168)))
This works. For the input, I did this:
model.layers[0] = Input(shape=(137,236))
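But when I print the model summary, it still shows the model's previous input shape. What am I doing wrong, and how should I change the input shape instead? Here is the final model summary: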
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 28, 28, 32)        832
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 32)        25632
_________________________________________________________________
batch_normalization_1 (Batch (None, 28, 28, 32)        128
_________________________________________________________________
dropout_1 (Dropout)          (None, 28, 28, 32)        0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 14, 14, 128)       102528
_________________________________________________________________
batch_normalization_2 (Batch (None, 14, 14, 128)       512
_________________________________________________________________
dropout_2 (Dropout)          (None, 14, 14, 128)       0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 128)         0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 7, 7, 64)          131136
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 7, 7, 64)          65600
_________________________________________________________________
batch_normalization_3 (Batch (None, 7, 7, 64)          256
_________________________________________________________________
dropout_3 (Dropout)          (None, 7, 7, 64)          0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 64)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 1024)              0
_________________________________________________________________
dense_1 (Dense)              (None, 256)               262400
_________________________________________________________________
dropout_4 (Dropout)          (None, 256)               0
_________________________________________________________________
dense_2 (Dense)              (None, 504)               129528
_________________________________________________________________
reshape_1 (Reshape)          (None, 3, 168)            0
=================================================================
Total params: 595,706
Trainable params: 595,258
Non-trainable params: 448
_________________________________________________________________
Answer 0 (score: 1)
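The problem seems to be the use of Input(shape=(137,236)), which is normally used with functional models rather than Sequential ones. Assigning it to model.layers[0] does not rebuild the model's graph, so the summary keeps showing the old input shape. You can change the input by essentially rebuilding the model: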
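base_layers = model.layers[:14]  # convolutional base: every layer before Flatten in the summary above
inputs = Input(shape=(137, 236, 1))  # Conv2D needs a channel axis; 1 channel matches the first conv layer
x = inputs
for layer in base_layers:
    x = layer(x)  # conv weights do not depend on image size, so these layers can be reused
# Flatten now yields 18*30*64 = 34560 features instead of 1024, so the dense head must be new
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(3*168, activation='softmax')(x)
x = Reshape((3, 168))(x)
model = Model(inputs=inputs, outputs=x)
To make sure the pretrained weights are not updated, add a for loop that sets the reused layers to non-trainable, then compile:
for layer in base_layers:
    layer.trainable = False
model.compile(*args, **kwargs)  # compile after freezing so the setting takes effect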
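A quick way to see why the Flatten/Dense layers trained on 28x28 inputs do not fit 137x236 inputs is to work out the feature-map sizes by hand. The sketch below assumes that each MaxPool2D with padding='same' and strides=2 yields ceil(size / 2) per spatial axis, which agrees with the 28, 14, 7, 4 progression in the question's summary. The helper names pooled_size and flatten_features are made up for illustration and are not part of Keras.

```python
import math

def pooled_size(size, n_pools=3, stride=2):
    """Spatial size after n_pools 'same'-padded poolings with the given stride."""
    for _ in range(n_pools):
        size = math.ceil(size / stride)
    return size

def flatten_features(height, width, channels=64, n_pools=3):
    """Number of features Flatten produces after the convolutional base."""
    return pooled_size(height, n_pools) * pooled_size(width, n_pools) * channels

print(flatten_features(28, 28))    # 1024, the Flatten size in the summary
print(flatten_features(137, 236))  # 34560, so Dense(256) needs a new weight matrix
```

The convolutional layers themselves only fix the number of input channels, not the image height or width, which is why they can be carried over while the dense layers after Flatten cannot.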