Question

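Hi, I'm having some serious problems saving and loading a TensorFlow model that combines a Hugging Face transformer with some custom layers to do classification. I am using the latest Hugging Face transformers TensorFlow Keras version. The idea is to extract features with DistilBERT and then run the features through a CNN for classification and extraction. Everything works as far as getting the correct classifications.

The problem is saving the model once it is trained and then loading it again.

I am using tensorflow keras and TensorFlow version 2.2.

The following code designs the model, trains it, evaluates it, and then saves and loads it: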


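    import tensorflow as tf
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import RMSprop
    from tensorflow_addons.losses import SigmoidFocalCrossEntropy
    from transformers import DistilBertConfig, TFDistilBertModel

    # DISTIL_BERT, BERT_LENGTH, logger and save_url are defined elsewhere
    bert_config = DistilBertConfig(dropout=0.2, attention_dropout=0.2, output_hidden_states=False)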
    transformer_model = TFDistilBertModel.from_pretrained(DISTIL_BERT, config=bert_config)

    input_ids_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='input_token', dtype='int32')
    input_masks_in = tf.keras.layers.Input(shape=(BERT_LENGTH,), name='masked_token', dtype='int32')

    embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1,
                             recurrent_dropout=0, recurrent_activation="sigmoid",
                             unroll=False, use_bias=True, activation="tanh"))(embedding_layer)
    x = tf.keras.layers.GlobalMaxPool1D()(x)

    outputs = []
    # lots of code here to define the dense layers to generate the outputs
    # .....
    # .....

    model = Model(inputs=[input_ids_in, input_masks_in], outputs=outputs)
    for model_layer in model.layers[:3]:
        logger.info(f"Setting layer {model_layer.name} to not trainable")
        model_layer.trainable = False
    rms_optimizer = RMSprop(learning_rate=0.001)
    model.compile(loss=SigmoidFocalCrossEntropy(), optimizer=rms_optimizer)

    # the code to fit the model (which works)
    # then code to evaluate the model (which also works)

    # finally saving the model. This too works.
    tf.keras.models.save_model(model, save_url, overwrite=True, include_optimizer=True, save_format="tf")

However, when I try to load the saved model with the following

    tf.keras.models.load_model(
            path, custom_objects={"Addons>SigmoidFocalCrossEntropy": SigmoidFocalCrossEntropy})

I get the following load error:


    ValueError: The two structures don't have the same nested structure.

    First structure: type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')

    Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}

    More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs')" is not
    Entire first structure:
    .
    Entire second structure:
    {'input_ids': .}

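I believe the issue is that the TFDistilBertModel layer can be called with a dictionary input from DistilBertTokenizer.encode(), and it happens to be the first layer. So on load, the model compiler expects that dictionary to be the input signature for calling the model. However, the inputs defined for the model are two tensors of shape (None, 128).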
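To make the mismatch in the error message concrete, here is a toy sketch of a nested-structure comparison. It is not TensorFlow's implementation: `Spec` and `same_structure` are hypothetical names that only mimic the idea of comparing a bare tensor spec against a dict of specs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for a TensorSpec (a leaf in the structure)."""
    shape: tuple
    dtype: str


def same_structure(a, b):
    """Return True when a and b nest the same way; leaf values are ignored."""
    if isinstance(a, dict) or isinstance(b, dict):
        if not (isinstance(a, dict) and isinstance(b, dict)):
            return False  # one side is a dict, the other is not
        if set(a) != set(b):
            return False  # different keys
        return all(same_structure(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) or isinstance(b, (list, tuple)):
        if not (isinstance(a, (list, tuple)) and isinstance(b, (list, tuple))):
            return False  # one side is a sequence, the other is a leaf
        if len(a) != len(b):
            return False
        return all(same_structure(x, y) for x, y in zip(a, b))
    return True  # both sides are leaves


# "First structure" in the error: a bare tensor spec.
built_with = Spec(shape=(None, 128), dtype="int32")
# "Second structure" in the error: a dict keyed by 'input_ids'.
traced_with = {"input_ids": Spec(shape=(None, 5), dtype="int32")}

print(same_structure(built_with, traced_with))                 # False
print(same_structure(traced_with, {"input_ids": built_with}))  # True
```

The shapes differ too, but in this toy check, as in the error above, it is the nesting (leaf versus dict) that is reported.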

So how do I tell the load function or the save function to assume the correct signatures?

Answer 1

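I solved the issue.

The problem was that the object transformer_model in the code above is not a layer itself. So if we want to embed it inside another Keras model, we should use the internal Keras layer that the model wraps.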

So changing the line

    embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]

to

    embedding_layer = transformer_model.distilbert([input_ids_in, input_masks_in])[0]

makes everything work. Hope this helps someone else, although it took a long time debugging through the tf.keras code to figure this out. :)

Answer 2

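I happened to run into the same problem yesterday. My solution is very similar to yours. I think the problem comes from how TensorFlow Keras handles custom models, so the idea is to use the custom model's layer inside your own model. The advantage is that you don't have to call the layer explicitly by its attribute name (in my case this is useful for easily building more generic models with different pretrained encoders):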

    sent_encoder = getattr(transformers, self.model_name).from_pretrained(self.shortcut_weights).layers[0]

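I haven't explored all of the HuggingFace models, but the few I tested appear to be custom models with only one custom layer.

Your solution also works like a charm. In fact, both solutions are the same if "distilbert" refers to ".layers[0]".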
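As a rough illustration of why the two approaches can coincide, here is a toy sketch with stand-in classes. Nothing in it comes from the real transformers library: `ToyMainLayer`, `ToyDistilBertModel` and `toy_transformers` are hypothetical, and the sketch simply assumes a wrapper model that owns exactly one inner layer.

```python
import types


class ToyMainLayer:
    """Stand-in for the inner Keras layer that does the real work."""

    def __call__(self, inputs):
        return [f"encoded({inputs})"]


class ToyDistilBertModel:
    """Stand-in for a wrapper model that owns exactly one inner layer."""

    def __init__(self):
        self.distilbert = ToyMainLayer()
        self.layers = [self.distilbert]  # the same object, reachable by index

    @classmethod
    def from_pretrained(cls, shortcut_weights):
        # A real from_pretrained would load weights; this toy ignores the argument.
        return cls()


# Hypothetical namespace standing in for the `transformers` module.
toy_transformers = types.SimpleNamespace(ToyDistilBertModel=ToyDistilBertModel)

model_name = "ToyDistilBertModel"  # chosen at runtime, as with self.model_name
wrapper = getattr(toy_transformers, model_name).from_pretrained("toy-weights")

by_name = wrapper.distilbert   # Answer 1 style: explicit attribute name
by_index = wrapper.layers[0]   # Answer 2 style: no model-specific name needed

print(by_name is by_index)  # True
```

Whether a given real model exposes its main layer as `.layers[0]` is worth checking per model, since only a few were tested here.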

Problem saving and loading a TensorFlow model that uses a Hugging Face transformer model as its first layer
