Question

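I downloaded the HuggingFace BERT model from the Transformers repository found here and want to train it on custom NER labels using the run_ner.py script, as referenced here in the "Named Entity Recognition" section.

In the code I set the defaults for the model ("bert-base-german-cased"), data_dir ("Data/sentence_data.txt") and labels ("Data/labels.txt").

Now I run this on the command line: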

python run_ner.py --output_dir="Models" --num_train_epochs=3 --logging_steps=100 --do_train --do_eval --do_predict

But all it does is tell me:

Some weights of the model checkpoint at bert-base-german-cased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-german-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

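After that it just stops. The script does not exit; it simply sits there waiting.

Does anyone know what the problem might be? Am I missing a parameter?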
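
One way to see where a seemingly hung script is waiting is Python's standard faulthandler module. This is a minimal sketch, assuming a few lines can be added near the top of a local copy of run_ner.py. The helper names arm_hang_watchdog and disarm_hang_watchdog and the 60-second interval are illustrative choices, not part of the script.

```python
import faulthandler
import sys


def arm_hang_watchdog(seconds=60, stream=None):
    """Print the traceback of every thread each `seconds` until disarmed."""
    # The target must be a real file object with a file descriptor;
    # sys.stderr is used when no stream is given.
    target = stream if stream is not None else sys.stderr
    faulthandler.dump_traceback_later(seconds, repeat=True, file=target)


def disarm_hang_watchdog():
    """Stop the periodic traceback dumps."""
    faulthandler.cancel_dump_traceback_later()
```

Calling arm_hang_watchdog() before the model is loaded would print, once a minute, the line the script is currently stuck on, which helps tell a stalled GPU initialisation apart from slow data loading.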

My sentence_data.txt in CoNLL format looks like this (small snippet):

Strafverfahren O
gegen O
; O
wegen O
Diebstahls O
hat O
das O
Amtsgericht Ort
Leipzig Ort
- O
Strafrichter O

And this is how I defined the labels in labels.txt:

"Date", "Delikt", "Strafe_Tatbestand", "Schadensbetrag", "Geständnis_ja", "Vorstrafe_ja", "Vorstrafe_nein", "Ort",
"Strafe_Gesamtfreiheitsstrafe_Dauer", "Strafe_Gesamtsatz_Dauer", "Strafe_Gesamtsatz_Betrag"
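
A quick consistency check between the two files can be sketched as follows. It assumes one token and one tag per line separated by whitespace, as in the snippet above, and it accepts labels separated by commas or newlines with optional quotes. The function names are hypothetical, and treating "O" as always allowed is an assumption of this sketch.

```python
import re


def read_conll_tags(text):
    """Return the set of tags found in whitespace-separated 'token tag' lines."""
    tags = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # blank lines only separate sentences
            tags.add(parts[-1])
    return tags


def read_label_list(text):
    """Parse labels separated by commas and/or newlines, with optional quotes."""
    pieces = [p.strip().strip('"') for p in re.split(r"[,\n]", text)]
    return {label for label in pieces if label}


def undeclared_tags(data_text, labels_text):
    """Tags used in the data that the label list does not declare."""
    # Assumption: the outside tag "O" never needs to be declared.
    return read_conll_tags(data_text) - read_label_list(labels_text) - {"O"}
```

Passing the contents of the data file and of labels.txt to undeclared_tags returns an empty set when every tag in the data is declared.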

Answer 1

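I figured out the problem. It was caused by the installed CUDA driver being incompatible with the installed version of PyTorch.

For anyone with an Nvidia GPU who runs into the same problem: go to Nvidia Control Panel -> Help -> System Information -> Components. There is an entry called "NVCUDA.DLL" with a driver number in the name column. The corresponding CUDA version can then be selected in the install builder on pytorch.org.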
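
The PyTorch side of that comparison can be read from Python. This is a minimal sketch, assuming PyTorch may or may not be installed in the active environment; the function name cuda_report is made up for illustration. torch.version.cuda reports the CUDA version the installed build was compiled for, and it is None for CPU-only builds.

```python
import importlib.util


def cuda_report():
    """Collect the CUDA-related facts PyTorch reports for this environment."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build targets (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        # False when no usable GPU and driver combination is found
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_for_cuda is newer than what the driver shown in the control panel supports, reinstalling PyTorch with a matching CUDA build from the install builder is the fix described above.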

Also, there is a good README in the Transformers repository that explains all the steps for training a BERT model with CLI commands here.

Training BERT using CLI commands
