Question

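I have some questions about fine-tuning a causal language model with Transformers and PyTorch.

My main goal is to fine-tune XLNet. However, most of the posts I found online target text classification, such as this post. Is there any way to fine-tune the model without using run_language_model.py from the Transformers GitHub repository?

Here is a piece of my code that tries to fine-tune XLNet: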

import torch
from transformers import XLNetLMHeadModel, XLNetTokenizer

model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")
tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased", do_lower_case=True)
LOSS = torch.nn.CrossEntropyLoss()
batch_texts = ["this is sentence 1", "i have another sentence like this", "the final sentence"]
# Tokenize the whole batch at once, padding to the longest sentence.
encodings = tokenizer(batch_texts, add_special_tokens=True, padding=True,
                      return_tensors="pt", return_attention_mask=True)
outputs = model(encodings["input_ids"], attention_mask=encodings["attention_mask"])
# target_ids is undefined here; how to build it is the open question.
loss = LOSS(outputs[0], target_ids)
loss.backward()
# ignoring the rest of the code...

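I am stuck on the last two lines. First, with this kind of LM model it seems I do not have labels the way I would in ordinary supervised learning. Second, since a language model is trained to minimize a loss (cross-entropy here), I need target_ids in order to compute the loss and the perplexity against input_ids.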
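For reference, a common convention for a plain left-to-right causal LM is to reuse the input ids as targets, shifted by one position, and to exclude padded positions from the loss. The sketch below shows only that convention on toy tensors: `shifted_lm_loss` is a hypothetical helper, the random `logits` stand in for a model's output, and right padding with pad id 0 is assumed. Whether this carries over unchanged to XLNet, which is trained with permutation language modeling, is not settled by this sketch.

```python
import torch
import torch.nn.functional as F

def shifted_lm_loss(logits, input_ids, attention_mask):
    """Next-token cross-entropy that ignores padded target positions."""
    # Position t predicts token t+1: drop the last logit and the first label.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:].clone()
    # Mark padded targets with -100, the ignore_index passed to cross_entropy.
    shift_labels[attention_mask[:, 1:] == 0] = -100
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )

# Toy stand-ins for tokenizer and model output (hypothetical values).
torch.manual_seed(0)
vocab_size = 11
input_ids = torch.tensor([[5, 7, 2, 9], [3, 4, 0, 0]])        # 0 is the pad id here
attention_mask = torch.tensor([[1, 1, 1, 1], [1, 1, 0, 0]])
logits = torch.randn(2, 4, vocab_size, requires_grad=True)

loss = shifted_lm_loss(logits, input_ids, attention_mask)
loss.backward()
perplexity = torch.exp(loss.detach())  # exp of the mean per-token loss
```

Because the targets are derived from `input_ids` inside the helper, no separate label tensor has to be prepared by hand in this setup.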

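Here are my follow-up questions:

How should I handle labels during model fitting?
Should I set something like target_ids=encodings["input_ids"].copy() to compute the cross-entropy loss and the perplexity?
If not, how should I set up target_ids?
Going by the perplexity page in the Transformers documentation, how should I adapt its method to input texts of non-fixed length?
I saw another post from the documentation saying that causal language modeling requires padding text. However, the link in 3) shows no sign of padding text. Which one should I follow?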
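On the non-fixed-length point, one way to think about it: perplexity is the exponential of the average negative log-likelihood per token, so texts of different lengths can be combined by summing the per-token losses and dividing by the total number of real (non-padding) tokens. The sketch below assumes the per-token negative log-likelihoods have already been obtained from some model; `corpus_perplexity` is a hypothetical helper, not a Transformers API.

```python
import math

def corpus_perplexity(token_nlls_per_text):
    """Perplexity from per-token negative log-likelihoods (in nats).

    token_nlls_per_text is a list with one inner list per text; the inner
    lists may have different lengths and contain no padding entries.
    """
    total_nll = sum(sum(nlls) for nlls in token_nlls_per_text)
    total_tokens = sum(len(nlls) for nlls in token_nlls_per_text)
    if total_tokens == 0:
        raise ValueError("no tokens to score")
    return math.exp(total_nll / total_tokens)

# Toy check: every token is scored as if drawn uniformly from 4 choices.
uniform_nll = math.log(4.0)
texts = [[uniform_nll] * 3, [uniform_nll] * 7, [uniform_nll]]
ppl = corpus_perplexity(texts)
```

Weighting by token count rather than averaging per-text perplexities keeps long and short texts on the same footing.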

Any suggestions would be greatly appreciated!

Fine-tuning a causal language model using Transformers and PyTorch

0 answers: