Question

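I want to train two models on the CIFAR-10 dataset in Keras: the first from scratch (model 1), and the second by fine-tuning a pretrained model (model 2). I am using the following code to do that: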

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
import numpy as np
import os
from keras.models import load_model

#model 1
input_shape = (32, 32, 3)
model1 = Sequential()
model1.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model1.add(Conv2D(64, (3, 3), activation='relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Dropout(0.25))
model1.add(Flatten())
model1.add(Dense(128, activation='relu'))
model1.add(Dropout(0.5))
model1.add(Dense(10, activation='softmax'))
#... training

#model 2
kmodel = load_model('cifar10\\cifar10.h5')
model2=Sequential()
for i in range (len(kmodel.layers)):
    model2.add(kmodel.layers[i])

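I would like to know:

In model 1:

How can I add softmax layers (model1.add(Dense(10, activation='softmax'))) after some of the intermediate layers, so that each new softmax layer is connected only to the layer before it and not to the layers that follow?

In model 2:

How can I add softmax layers to intermediate layers (i.e. layers 2, 4, and 7) with the same connectivity condition as above? (Of course, I should freeze all kmodel layers and train only the new softmax layers.)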

Answer 1

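The limitation here is Keras' Sequential() model, which only lets you stack layers linearly.

To get around this, we can specify the model in a more direct (but uglier) way with the Keras functional API. Your code would look something like this: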

from keras.layers import Input, Conv2D, Dense

input_shape = (32, 32, 3)
# The functional API needs an explicit input tensor
inputs = Input(shape=input_shape)

x = Conv2D(32, kernel_size=(3, 3), activation='relu')(inputs)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
...
predictions = Dense(10, activation='softmax')(x)

You can then simply specify your predictions at an intermediate layer as

from keras.layers import Input, Conv2D, Dense

input_shape = (32, 32, 3)
inputs = Input(shape=input_shape)

x = Conv2D(32, kernel_size=(3, 3), activation='relu')(inputs)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
...

# do the splitting at some random point of your choice
predictions_intermediary = Dense(10, activation='softmax')(x)
# regular next layer
x = Dense(128, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
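To make the branching idea concrete, here is a minimal end-to-end sketch that wraps both prediction tensors in a single multi-output model. The layers filling in the elided part (pooling and flattening), the layer names `aux_softmax` and `main_softmax`, and the optimizer are illustrative assumptions, not something the question prescribes.

```python
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model

inputs = Input(shape=(32, 32, 3))
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(inputs)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Flatten()(x)

# Side branch: reads from x but nothing downstream consumes its output.
predictions_intermediary = Dense(10, activation='softmax', name='aux_softmax')(x)

# Main path continues from the same tensor x.
x = Dense(128, activation='relu')(x)
predictions = Dense(10, activation='softmax', name='main_softmax')(x)

# One model, two outputs; each output gets its own loss term.
model = Model(inputs=inputs, outputs=[predictions_intermediary, predictions])
model.compile(optimizer='adam',
              loss=['categorical_crossentropy', 'categorical_crossentropy'])
```

When fitting such a model, the targets are passed once per output, for example `model.fit(x_train, [y_train, y_train], ...)`, where `x_train` and `y_train` stand for your own image and one-hot label arrays.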

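Sadly, I am not familiar enough with Keras to tell you how this carries over to the pretrained model, but I assume you can define the pretrained model in a similar way and then specify which layers are trainable, as in the previous example.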
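One possible way to do this for a loaded model is sketched below, with several assumptions stated up front: `kmodel` is built inline as a hypothetical stand-in for the file loaded with `load_model`, the tapped layer indices in `tap_indices` are chosen only to fit this small stand-in, and the head names are made up for illustration. The idea is to freeze every pretrained layer, read the `output` tensor of the chosen intermediate layers, and attach a new softmax layer to each.

```python
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model

# Hypothetical stand-in for: kmodel = load_model('cifar10\\cifar10.h5')
inp = Input(shape=(32, 32, 3))
h = Conv2D(32, (3, 3), activation='relu')(inp)
h = Conv2D(64, (3, 3), activation='relu')(h)
h = MaxPooling2D(pool_size=(2, 2))(h)
h = Flatten()(h)
h = Dense(128, activation='relu')(h)
out = Dense(10, activation='softmax')(h)
kmodel = Model(inputs=inp, outputs=out)

# Freeze every pretrained layer so only the new heads are trained.
for layer in kmodel.layers:
    layer.trainable = False

# Indices of the intermediate layers to tap (illustrative choice).
tap_indices = [2, 5]

heads = []
for i in tap_indices:
    features = kmodel.layers[i].output
    # Convolutional feature maps must be flattened before a Dense layer.
    if len(features.shape) > 2:
        features = Flatten()(features)
    head = Dense(10, activation='softmax', name='softmax_after_%d' % i)(features)
    heads.append(head)

# The new model shares the frozen layers and exposes only the new heads.
model2 = Model(inputs=kmodel.inputs, outputs=heads)
model2.compile(optimizer='adam',
               loss=['categorical_crossentropy'] * len(heads))
```

With the real file, `tap_indices` would be replaced by the layer indices you care about (the question mentions 2, 4, and 7); it is worth checking `kmodel.summary()` first, since the indexing depends on how the model was saved.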

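Note that the "sharing vs. splitting" question is moot here: creating a separate layer/operation automatically creates a separate weight matrix, so you do not have to worry about weights being shared (sharing would not work anyway if the input dimensions of the next layer differ from those of the softmax layer).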
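A small sketch of that point, using a hypothetical 64-dimensional feature tensor: two Dense layer objects applied to the same input each create their own kernel, and the kernels do not even need to have the same shape.

```python
from keras.layers import Input, Dense

# Hypothetical 64-dimensional feature tensor feeding both branches.
features = Input(shape=(64,))

aux_head = Dense(10, activation='softmax')   # intermediate softmax branch
next_layer = Dense(128, activation='relu')   # regular next layer

aux_out = aux_head(features)
hidden = next_layer(features)

# Each layer object owns an independent weight matrix.
print(aux_head.kernel.shape)
print(next_layer.kernel.shape)
```

Weights would only be shared if the same layer object were called on two different tensors, which is not what happens here.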

Training softmax layers added after intermediate layers of a (pretrained/non-pretrained) neural network in Keras
