如何获取BERT预训练模型中的中间层参数?

时间:2020-11-07 11:34:12

标签: python pytorch bert-language-model huggingface-transformers

import torch
import transformers
tokenizer = transformers.AlbertTokenizer.from_pretrained('albert-base-v2', do_lower_case=True)
transformer = transformers.AlbertModel.from_pretrained("albert-base-v2")

我尝试过:

transformer.num_parameters

但是,它为所有图层提供了参数:

<bound method ModuleUtilsMixin.num_parameters of AlbertModel(
  (embeddings): AlbertEmbeddings(
    (word_embeddings): Embedding(30000, 128, padding_idx=0)
    (position_embeddings): Embedding(512, 128)
    (token_type_embeddings): Embedding(2, 128)
    (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0, inplace=False)
  )
  (encoder): AlbertTransformer(
    (embedding_hidden_mapping_in): Linear(in_features=128, out_features=768, bias=True)
    (albert_layer_groups): ModuleList(
      (0): AlbertLayerGroup(
        (albert_layers): ModuleList(
          (0): AlbertLayer(
            (full_layer_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (attention): AlbertAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (attention_dropout): Dropout(p=0, inplace=False)
              (output_dropout): Dropout(p=0, inplace=False)
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            )
            (ffn): Linear(in_features=768, out_features=3072, bias=True)
            (ffn_output): Linear(in_features=3072, out_features=768, bias=True)
            (dropout): Dropout(p=0, inplace=False)
          )
        )
      )
    )
  )
  (pooler): Linear(in_features=768, out_features=768, bias=True)
  (pooler_activation): Tanh()
)>

我需要访问out_features = 768最后一个线性函数的输入

((池):线性(in_features = 768,out_features = 768,bias = True))

0 个答案:

没有答案