Transfer learning with a BERT transformer

Date: 2021-02-19 19:23:54

Tags: python bert-language-model huggingface-transformers transfer-learning

My output labels are one-hot encoded in the following format: Positive, Negative, Mixed, Neutral, using 1s and 0s, e.g. [1 0 0 0] represents a Positive text.
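For reference, here is a minimal sketch of how such an encoding could be produced (the label_map name and the class ordering are my own assumptions for illustration, not taken from the original setup):

import numpy as np

# Hypothetical label order, assumed only for this example
label_map = {"Positive": 0, "Negative": 1, "Mixed": 2, "Neutral": 3}

def one_hot(label, num_classes=4):
    vec = np.zeros(num_classes, dtype=int)
    vec[label_map[label]] = 1
    return vec

print(one_hot("Positive"))  # -> [1 0 0 0]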

I have preloaded the model as follows:

from transformers import TFBertForSequenceClassification, BertTokenizer

transformer_name = "bert-base-uncased"
pre_trained_model = TFBertForSequenceClassification.from_pretrained(transformer_name)
tokenizer = BertTokenizer.from_pretrained(transformer_name)
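For context, a hedged sketch of how the tokenizer would typically be used to produce the fixed-length (512) integer inputs expected by the Input layer further down; the padding/truncation settings and the example texts are my assumptions:

# Assumed usage: encode a small batch of texts to fixed-length input_ids
texts = ["This movie was great", "Terrible service"]  # illustrative data only
encodings = tokenizer(
    texts,
    padding="max_length",   # pad every sequence up to 512 tokens
    truncation=True,
    max_length=512,
    return_tensors="tf",
)
print(encodings["input_ids"].shape)  # (2, 512)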

The model summary gives me the following:

[model summary screenshot][1]

From there, I use the call method to create the layers, as follows:

import tensorflow as tf

inputs = tf.keras.Input(shape=(512,), dtype='int32') # 512 is input shape to transformer model
print(inputs)

pre_trained_model.call(inputs)

layer_position = 2 # choose last layer position in the model
count = 0

for layer in pre_trained_model.layers:
    print(layer.output)
    
    count = count + 1
    if count == layer_position:
        last_output = layer.output

print('last output is ', last_output)

last output is  Tensor("dropout_37/Identity:0", shape=(None, 768), dtype=float32)
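For reference, this shape is consistent with how TFBertForSequenceClassification is typically structured: its top-level layers are the BERT main layer, a dropout layer, and a Dense classifier, and the dropout layer is applied to BERT's pooled [CLS] output, which is already 2-dimensional (batch, hidden_size). A hedged sketch of inspecting this (exact layer names vary by transformers version):

# Sketch: list the top-level layers of the loaded model
for i, layer in enumerate(pre_trained_model.layers):
    print(i, layer.name)

# Typically prints something like: 0 bert, 1 dropout_xx, 2 classifier.
# Because count is incremented before the check, layer_position = 2 stops at
# the dropout layer, whose output is the pooled [CLS] vector of shape
# (None, 768), not the full (None, 512, 768) token sequence.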

I then want to feed this last layer into my own custom layers:

transformer_inputs = layers.GlobalAveragePooling1D()(last_output)
...

But I get the following error:

Input 0 of layer global_average_pooling1d is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 768]

I understand the problem here is that global_average_pooling1d requires a 3-dim input, but it only receives 2 dims. Why does the output at this point in the transformer model have only 2 dims, and what solution can I use to work around this?
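One workaround I would expect to help, offered only as a hedged sketch rather than a verified answer: use the base TFBertModel, whose last_hidden_state output keeps the per-token dimension and is therefore 3-dim, which GlobalAveragePooling1D accepts (the model and variable names below are my own):

import tensorflow as tf
from transformers import TFBertModel

base_model = TFBertModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.Input(shape=(512,), dtype="int32", name="input_ids")
outputs = base_model(input_ids)

print(outputs.last_hidden_state.shape)  # (None, 512, 768) -- per-token states, 3-dim
print(outputs.pooler_output.shape)      # (None, 768)      -- pooled [CLS] vector, 2-dim

# GlobalAveragePooling1D needs the 3-dim per-token output:
pooled = tf.keras.layers.GlobalAveragePooling1D()(outputs.last_hidden_state)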

  [1]: https://i.stack.imgur.com/Gqe9n.png

0 answers:

No answers yet.