How to merge a trained model with an untrained model?

Asked: 2020-05-21 17:34:48

Tags: python tensorflow keras tensorflow2.0 tf.keras

I'm struggling with the following:

  • I have an already-trained model with 125,089,410 trainable parameters;
  • the model outputs two tensors, each of shape (None, 96);
  • I want to build a new model by freezing the layers of the previous model and producing a single (None, 96) output tensor (a minimal stand-in for such a two-output model is sketched below).

Important: I have no intention of adding more layers inside the original model I trained.
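For context, here is a minimal sketch of what such a two-output model could look like. The input shape, hidden size, and MAX_LEN = 96 are assumptions for illustration only, not the real 125M-parameter architecture:

import tensorflow as tf

MAX_LEN = 96  # assumed value, so both heads emit (None, 96) tensors

# Hypothetical stand-in: two output heads sharing one trunk,
# matching the output signature described above.
inp = tf.keras.layers.Input(shape=(MAX_LEN,))
hidden = tf.keras.layers.Dense(256, activation='relu')(inp)
head_1 = tf.keras.layers.Dense(MAX_LEN)(hidden)
head_2 = tf.keras.layers.Dense(MAX_LEN)(hidden)
prev_model = tf.keras.models.Model(inputs=inp, outputs=[head_1, head_2])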

Here's what I've been trying:

def get_output_model(prev_model):

    # Freezing prev model
    for l in prev_model.layers:
        l.trainable = False

    # Compile so Keras doesn't complain about the trainable-parameter count
    prev_model.compile(loss='binary_crossentropy')

    # Sanity check (summary() prints itself and returns None)
    prev_model.summary()

    # Loss function
    def loss_fn(y_true, y_pred):
        pass # This doesn't matter here

    # Building model
    out_model_in = prev_model.output
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model_in)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('softmax')(out_model)
    model = tf.keras.models.Model(inputs=[out_model_in], outputs=[out_model])
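    # f1_m: custom F1-score metric, assumed to be defined elsewhere (not shown)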
    model.compile(loss=loss_fn, optimizer='nadam', metrics=['accuracy', f1_m])

But it gives me:

ValueError: Layer dense_49 expects 1 inputs, but it received 2 input tensors. Inputs received: [<tf.Tensor 'activation_57/Identity:0' shape=(None, 96) dtype=float32>, <tf.Tensor 'activation_58/Identity:0' shape=(None, 96) dtype=float32>]

I understand this error is expected, but I don't know how to work around it.
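The cause is visible by inspecting the outputs directly (using a two-output model such as the stand-in above):

# With two output heads, `output` is a Python list of two tensors,
# which cannot be fed into a single-input layer such as Dense.
print(type(prev_model.output))  # <class 'list'>
print(len(prev_model.output))   # 2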

1 Answer:

Answer 0 (score: 0):

OK, I found an approach that works for me:

def get_output_model(prev_model):

    # Freezing prev model
    for l in prev_model.layers:
        l.trainable = False

    # Compile so Keras doesn't complain about the trainable-parameter count
    prev_model.compile(loss='binary_crossentropy')

    # Sanity check (summary() prints itself and returns None)
    prev_model.summary()

    # Loss function
    def loss_fn(y_true, y_pred):
        pass # This doesn't matter here

    # Build the new model: merge the two (None, 96) outputs into a single
    # tensor with an element-wise Add, then stack trainable layers on top
    x1 = prev_model.output[0]
    x2 = prev_model.output[1]
    out_model = tf.keras.layers.Add()([x1, x2])
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(2 * MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('relu')(out_model)
    out_model = tf.keras.layers.BatchNormalization()(out_model)
    out_model = tf.keras.layers.Dense(MAX_LEN, activation='linear')(out_model)
    out_model = tf.keras.layers.Activation('softmax')(out_model)
    model = tf.keras.models.Model(inputs=[prev_model.input], outputs=[out_model])
    model.compile(loss=loss_fn, optimizer='nadam', metrics=['accuracy', f1_m])
    return model
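If the element-wise sum mixes the two outputs more than you want, a Concatenate merge is a drop-in alternative (a sketch, not what this answer used):

# Alternative merge: keep both (None, 96) outputs side by side as
# (None, 192) instead of summing them element-wise.
x1, x2 = prev_model.output
merged = tf.keras.layers.Concatenate(axis=-1)([x1, x2])
# ...then feed `merged` into the same Dense/BatchNormalization stack.

Either way, only the layers added after the merge are trainable, because the previous model's layers were frozen.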