Question

Background

I have been searching the internet for a way to do fine-tuning the way I want. This is what I want:

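Train the top layers for a few epochs. I want to do this by saving the bottleneck features, as in the section "Using the bottleneck features of a pre-trained network" of this blog post.
Then somehow attach this trained model to any pre-trained model, such as ResNet50, VGG16 or InceptionV3, and fine-tune from there.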

The problem with my approach

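I want to train top_model separately and then attach it to base_model for fine-tuning. I hope this approach will reduce training time, since I do not have access to a GPU.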

Previously I thought this could be done like this:

from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
model.add(top_model)

But now I guess keras.applications.ResNet50 and the others are not implemented with the Sequential API. So anyone who needs to fine-tune now has to add the top model this way (source):

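from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# create the base pre-trained model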
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

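So with this approach it is not possible to skip the repeated forward passes through the base and train only the top layers. This leads to long training times, because all the bottleneck features are recomputed on every epoch.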
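To put a rough number on that cost, here is a small back-of-the-envelope sketch. The function name `base_forward_passes` and the dataset size and epoch count are made up for illustration; they are not from the blog post or from my actual data.

```python
def base_forward_passes(num_images, num_epochs, cached):
    """Count per-image forward passes through the convolutional base."""
    # With cached bottleneck features each image goes through the base once.
    # Without caching it goes through the base again on every epoch.
    return num_images if cached else num_images * num_epochs


# Hypothetical numbers, chosen only for illustration.
print(base_forward_passes(2000, 50, cached=False))  # 100000
print(base_forward_passes(2000, 50, cached=True))   # 2000
```

The top layers see the same number of training steps in both cases; only the work done by the base changes.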

I have searched online for other approaches, but I could not find one that fits my purpose.

Question

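So, is it possible to do this the way I want? Or do I have to train the whole network with the base layers frozen, so that only the top layers are updated?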

Answer 1

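The Keras functional API lets you define and separately train models whose layers may overlap. I don't have access to a Keras environment right now, but off the top of my head you could do something like: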

from keras.layers import Input, Dense
from keras.models import Model

# bottom model
bottom_input = Input(..)
bottom_function = Dense(..)
bottom_output = bottom_function(bottom_input)
bottom_model = Model(inputs=[bottom_input], outputs=[bottom_output])

# top model
top_function = Dense(..)
top_output = top_function(bottom_output)

# full model
full_model = Model(inputs=[bottom_input], outputs=[top_output])

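# compile and fit the bottom model
bottom_model.compile(..)
bottom_model.fit(..)
# freeze the bottom model's layers
for layer in bottom_model.layers:
    layer.trainable = False
# compile the full model after freezing so the change takes effect, then fit the top
full_model.compile(..)
full_model.fit(..)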

I hope this is clear.
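Applied to the bottleneck-feature workflow from the question, the same idea might look roughly like the sketch below. Treat it as an untested-on-your-data outline, not a definitive implementation: the helper name `build_finetune_model`, its `weights` and `epochs` parameters, the choice of VGG16, the optimizers and the `tensorflow.keras` import path are all my assumptions. The point is that a trained top model can be called on `base_model.output` like a layer, so it does not matter that the application models are not Sequential.

```python
import numpy as np
from tensorflow.keras import Input, Model, Sequential
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten


def build_finetune_model(x_train, y_train, weights="imagenet", epochs=5):
    """Hypothetical helper: train a top model on cached features, then attach it."""
    # Convolutional base without its classifier head.
    base_model = VGG16(weights=weights, include_top=False,
                       input_shape=x_train.shape[1:])

    # 1. A single forward pass through the base: cache the bottleneck features.
    features = base_model.predict(x_train, verbose=0)

    # 2. Train the small top model on the cached features only.
    top_model = Sequential([
        Input(shape=features.shape[1:]),
        Flatten(),
        Dense(256, activation="relu"),
        Dropout(0.5),
        Dense(1, activation="sigmoid"),
    ])
    top_model.compile(optimizer="rmsprop", loss="binary_crossentropy")
    top_model.fit(features, y_train, epochs=epochs, verbose=0)

    # 3. Call the trained top model on the base output (functional API).
    #    The layers are shared, so the trained weights come along.
    full_model = Model(inputs=base_model.input,
                       outputs=top_model(base_model.output))

    # 4. Freeze the base, then compile so the frozen state takes effect.
    for layer in base_model.layers:
        layer.trainable = False
    full_model.compile(optimizer="sgd", loss="binary_crossentropy")
    return base_model, top_model, full_model
```

To fine-tune afterwards, set `trainable = True` on the last few base layers, compile `full_model` again (typically with a low learning rate) and call `full_model.fit` on the raw images.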

The dilemma of fine-tuning in Keras