BERT binary text classifier gives ValueError: Expected input batch_size to match target batch_size

Asked: 2020-08-27 21:28:07

Tags: python nlp pytorch bert-language-model

I am working on a binary text classification problem, using the BERT sequence classification model in PyTorch. Here is the link to the Colab notebook.

After training the model, I am trying to predict on a sample text. I have checked that the input_ids tensor has shape [1, 128]. I used batch_size = 16 during training.

review_text = "I love completing my todos! Best app ever!!!"

encoded_review = tokenizer.encode_plus(
  review_text,
  max_length=MAX_SEQ_LEN,
  add_special_tokens=True,
  return_token_type_ids=False,
  pad_to_max_length=True,
  return_attention_mask=True,
  return_tensors='pt',
)
input_ids = encoded_review['input_ids'].to(device)
print(input_ids.shape)
attention_mask = encoded_review['attention_mask'].to(device)
print(attention_mask.shape)
output = model(input_ids, attention_mask)
output

This produces the error:

Truncation was not explicitely activated but `max_length` is provided a specific value, please use `truncation=True` to explicitely truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
torch.Size([1, 128])
torch.Size([1, 128])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-74eeba339c4c> in <module>()
     14 attention_mask = encoded_review['attention_mask'].to(device)
     15 print(attention_mask.shape)
---> 16 output = model(input_ids, attention_mask)
     17 output

7 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

<ipython-input-7-71a85d6c1fce> in forward(self, text, label)
      8 
      9     def forward(self, text, label):
---> 10         loss, text_fea = self.encoder(text, labels=label)[:2]
     11 
     12         return loss, text_fea

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states)
   1282             else:
   1283                 loss_fct = CrossEntropyLoss()
-> 1284                 loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
   1285             outputs = (loss,) + outputs
   1286 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
    946     def forward(self, input: Tensor, target: Tensor) -> Tensor:
    947         return F.cross_entropy(input, target, weight=self.weight,
--> 948                                ignore_index=self.ignore_index, reduction=self.reduction)
    949 
    950 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2420     if size_average is not None or reduce is not None:
   2421         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2422     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
   2423 
   2424 

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2214     if input.size(0) != target.size(0):
   2215         raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 2216                          .format(input.size(0), target.size(0)))
   2217     if dim == 2:
   2218         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (1) to match target batch_size (128).

1 Answer:

Answer 0: (score: 0)

Looking at your code on Colab, you are not reshaping the data into the expected shape [B, *dims], where B is the batch size and *dims are the remaining dimensions. So use torch.unsqueeze(input_ids, 0) (read here) to add a batch_size dimension.
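
As a minimal sketch of the suggestion above (the tensor contents and vocabulary size below are made up for illustration, and whether this alone resolves the error depends on what the custom forward() expects):

import torch

# a 1-D tensor of 128 token ids with no batch dimension (example data only)
input_ids_1d = torch.randint(0, 30522, (128,))
print(input_ids_1d.shape)       # torch.Size([128])

# unsqueeze(0) inserts a new size-1 dimension at position 0,
# producing the [B, *dims] shape with B = 1
input_ids_batched = torch.unsqueeze(input_ids_1d, 0)
print(input_ids_batched.shape)  # torch.Size([1, 128])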