I have two independently trained models that predict unrelated regression values from the same image. Both models use a pretrained VGG16 model as a base, with task-specific top layers added.
Each model performs well when tested on its own, but when I concatenate the two pretrained models, the predictions differ from the standalone runs.
I declare the individual models like this:
# VGG
vggModel = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(310, 765, 3))
vggModel.trainable = True
trainableFlag = False
for layer in vggModel.layers:
    if layer.name == 'block5_conv1':
        trainableFlag = True
    layer.trainable = trainableFlag
# Model A
model_a = tf.keras.Sequential(name='model_a')
model_a.add(vggModel)
model_a.add(tf.keras.layers.Flatten())
model_a.add(tf.keras.layers.Dropout(0.1))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(1, activation='linear'))
model_a.load_weights(model_a_wts)
# Model B
model_b = tf.keras.Sequential(name='model_b')
model_b.add(vggModel)
model_b.add(tf.keras.layers.Flatten())
model_b.add(tf.keras.layers.Dropout(0.1))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(1, activation='linear'))
model_b.load_weights(model_b_wts)
I then concatenate the models as follows:
common_input = tf.keras.Input(shape=(310, 765, 3))
a_out = model_a(common_input)
b_out = model_b(common_input)
concatOut = tf.keras.layers.Concatenate()([a_out, b_out])
branched_model = tf.keras.Model(common_input, concatOut, name='Branched')
How can the predictions end up different in this setup?
Answer 0 (score: 0)
Per @hkyi's comment, the answer is:
The two models are not truly independent, because they share vggModel (which is trainable). You therefore need to clone vggModel before adding it to model_b; otherwise the weights loaded into model_b overwrite the vggModel weights that model_a is using.
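The aliasing can be illustrated without TensorFlow. The toy `Base` class below is a hypothetical stand-in for the shared `vggModel` instance; the point is only that both "models" hold a reference to the same mutable object:

```python
# Toy sketch (not Keras): two "models" sharing one base object,
# mimicking model_a.add(vggModel) and model_b.add(vggModel).
class Base:
    def __init__(self, w):
        self.w = w              # stands in for the VGG16 weights

    def predict(self, x):
        return self.w * x

base = Base(w=2)                # the shared "vggModel"
model_a = base                  # model_a.add(vggModel)
model_b = base                  # model_b.add(vggModel) -- same object!

pred_before = model_a.predict(10)   # 20

# "load_weights" on model_b mutates the shared base in place...
model_b.w = 5

# ...so model_a's predictions silently change too.
pred_after = model_a.predict(10)    # 50
```

This is exactly what happens when `model_b.load_weights(model_b_wts)` runs after `model_a.load_weights(model_a_wts)`: the second call rewrites the one shared set of VGG16 weights.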
Use tf.keras.models.clone_model(vggModel) instead:
# VGG
vggModel = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(310, 765, 3))
vggModel.trainable = True
trainableFlag = False
for layer in vggModel.layers:
    if layer.name == 'block5_conv1':
        trainableFlag = True
    layer.trainable = trainableFlag
# Model A
model_a = tf.keras.Sequential(name='model_a')
model_a.add(vggModel)
model_a.add(tf.keras.layers.Flatten())
model_a.add(tf.keras.layers.Dropout(0.1))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_a.add(tf.keras.layers.Dense(1, activation='linear'))
model_a.load_weights(model_a_wts)
# Model B
model_b = tf.keras.Sequential(name='model_b')
model_b.add(tf.keras.models.clone_model(vggModel))  # <-- HERE IS THE CHANGE REQUIRED
model_b.add(tf.keras.layers.Flatten())
model_b.add(tf.keras.layers.Dropout(0.1))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l1(0.01)))
model_b.add(tf.keras.layers.Dense(1, activation='linear'))
model_b.load_weights(model_b_wts)
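The effect of the fix can be sketched with the same toy classes. Here `copy.deepcopy` plays the role of `clone_model` (a rough analogy: `clone_model` rebuilds the architecture with fresh weights, which the subsequent `load_weights` call then fills in):

```python
import copy

# Toy sketch (not Keras): cloning the base before adding it to the second
# model gives each model its own copy, as clone_model(vggModel) does.
class Base:
    def __init__(self, w):
        self.w = w              # stands in for the VGG16 weights

    def predict(self, x):
        return self.w * x

base = Base(w=2)
model_a = base                      # model_a.add(vggModel)
model_b = copy.deepcopy(base)       # model_b gets an independent copy

# "load_weights" into model_b no longer disturbs model_a.
model_b.w = 5
```

With separate copies, each `load_weights` call touches only its own model, so the branched model reproduces the standalone predictions.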