NER using a spaCy model

Time: 2019-10-06 05:09:44

Tags: python spacy ner

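I keep getting a message saying that no named entities were found in my corpus. I expected cats, dogs, etc. to be recognized as PERSON. Please let me know how to fix this.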

import numpy as np
import pandas as pd

import spacy
from spacy import displacy

nlp = spacy.load("en_core_web_sm")

corpus=['cats are selfish', 'it is raining cats and dogs', 'dogs do not like birds','i do not like rabbits','i have eaten frogs snakes and alligators']

for sent in corpus:
    sentence_nlp = nlp(sent)
    # print named entities in sentences
    print([(word, word.ent_type_) for word in sentence_nlp if word.ent_type_])
    # visualize named entities
    displacy.render(sentence_nlp, style='ent', jupyter=True)

The output I get is:

[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
  displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
  displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
  displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
  displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
  displacy.render(sentence_nlp, style='ent', jupyter=False)

1 answer:

Answer 0 (score: 1)

  

I expected cats, dogs, etc. to be recognized as PERSON

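Then you were not expecting the right thing :) spaCy's NER models are trained on different datasets depending on the language. For the model you are using, see here: https://spacy.io/models/en#en_core_web_sm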

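The dataset used to train the model you are using is called OntoNotes 5, and, like most people, it does not consider cats and dogs to be a PERSON. If you want "cats" and "dogs" to be recognized as entities, you need to train your own NER model on your own data. For example, you could use a regular-expression rule and a list of the pets you are interested in to tag the animal mentions in some data with a custom entity label, then use that tagged dataset to fine-tune the NER model so that it does what you want.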
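The regex-tagging step could be sketched roughly as follows. This is only an illustration under some assumptions: the `ANIMALS` list, the `ANIMAL` label name, and the `label_animals` helper are all made up for this example, and the output uses the character-offset `(text, {"entities": [(start, end, label)]})` layout that is commonly used for NER annotations. How the tagged data is then fed into training depends on your spaCy version, so that part is left out.

```python
import re

# Hypothetical list of animals of interest (an assumption for this sketch).
ANIMALS = ["cat", "dog", "bird", "rabbit", "frog", "snake", "alligator"]

# Match each animal as a whole word, with an optional plural "s".
ANIMAL_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, ANIMALS)) + r")s?\b",
    re.IGNORECASE,
)


def label_animals(text, label="ANIMAL"):
    """Tag every animal mention in `text` with character offsets.

    Returns (text, {"entities": [(start, end, label), ...]}).
    The label name "ANIMAL" is a custom choice, not a built-in label.
    """
    spans = [(m.start(), m.end(), label) for m in ANIMAL_PATTERN.finditer(text)]
    return text, {"entities": spans}


corpus = [
    "cats are selfish",
    "it is raining cats and dogs",
    "dogs do not like birds",
    "i do not like rabbits",
    "i have eaten frogs snakes and alligators",
]

# Weakly labeled data that could serve as a starting point for fine-tuning.
train_data = [label_animals(sent) for sent in corpus]

for text, annotations in train_data:
    print(text, annotations["entities"])
```

Data tagged this way is only as good as the word list, so it is worth reviewing a sample by hand before using it for training.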