Question

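I am currently working on a multi-label classification task for text data. I have a data frame with an ID column, a text column, and several label columns that contain only 1 or 0.

I used an existing solution proposed on the site Kaggle Toxic Comment Classification using Bert, which expresses, as a percentage, the degree to which a text belongs to each label.

Now that I have trained my model, I would like to test it on a few unlabeled text excerpts to get the percentage of belonging to each label:

I have tried this solution: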

def getPrediction(in_sentences):
  label = ['S1, S2, S3']
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label=label) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

pred_sentences = [
  "here is an exemple of sentence"]

pred_sentences = ''.join(pred_sentences)

predictions = getPrediction(pred_sentences)

I got:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-490-770bf0871d3e> in <module>
----> 1 predictions = getPrediction(pred_sentences)

<ipython-input-486-3de7328d60db> in getPrediction(in_sentences)
      2   label = ['S1','S2',
      3    'S3']
----> 4   input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5   input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6   predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

<ipython-input-486-3de7328d60db> in <listcomp>(.0)
      2   label = ['S1,
      3    S2,S3']
----> 4   input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5   input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6   predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

TypeError: __init__() got an unexpected keyword argument 'labels'

Do you know what changes I need to make so that the last part of the algorithm works?

Answer 1

You have a typo: InputExample expects a keyword argument named label, not labels:

[run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
                                                                      ^
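
To make the fix concrete, here is a minimal sketch that reproduces the error without needing BERT installed. The `InputExample` class below is a stand-in, not the real `run_classifier.InputExample`; its signature (`guid`, `text_a`, `text_b`, `label`) is an assumption based on the error message and the correction above, and `build_examples` is a hypothetical helper. The sketch also shows a second thing that may be worth checking in the question's code: `''.join(pred_sentences)` turns the list into a single string, so the list comprehension would iterate over characters rather than sentences.

```python
class InputExample:
    """Stand-in for run_classifier.InputExample (assumed signature)."""

    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.label = label


def build_examples(in_sentences, label):
    # Hypothetical helper: note the keyword is `label` (singular).
    return [
        InputExample(guid="", text_a=x, text_b=None, label=label)
        for x in in_sentences
    ]


pred_sentences = ["here is an exemple of sentence"]

# Passing the list itself yields one example per sentence.
examples = build_examples(pred_sentences, label=[0, 0, 0])

# Passing `labels=` reproduces the TypeError from the question.
try:
    InputExample(guid="", text_a="x", text_b=None, labels=[0, 0, 0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Joining the list first yields a plain string, so the comprehension
# builds one example per character instead of one per sentence.
joined = "".join(pred_sentences)
char_examples = build_examples(joined, label=[0, 0, 0])
```

If this assumption about the signature holds, changing `labels=label` to `label=label` and passing the list of sentences directly (without the `join`) should get past the `TypeError`.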
