Removing names from noun chunks in spaCy

Time: 2018-11-29 08:07:15

Tags: python-3.x nlp spacy named-entity-recognition

Is it possible to remove person names from noun chunks?

Here is the code:

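import en_core_web_lg
# en_vectors_web_lg ships word vectors only; doc.noun_chunks needs a
# dependency parser, so a full pipeline is loaded instead
nlp = en_core_web_lg.load()
text = "John Smith is lookin for Apple ipod"
doc = nlp(text)
for chunk in doc.noun_chunks:
    print(chunk.text)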

Current output:

John Smith
Apple ipod

I would like output like the following, where the person's name is left out. How can I achieve this?

Apple ipod

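1 answer:

Answer 0: (score: 1)

Refer to spaCy's doc.ents, which holds the named entities recognized in the text. Entities labelled PERSON can be used to filter the noun chunks: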

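import spacy

# load a pipeline that includes both a parser (noun chunks) and NER
nlp = spacy.load("en_core_web_lg")
doc = nlp("John Smith is lookin for Apple ipod")

# character offsets of every entity labelled PERSON
person_spans = {(ent.start_char, ent.end_char)
                for ent in doc.ents if ent.label_ == "PERSON"}

# loop through the noun chunks
for chunk in doc.noun_chunks:
    # skip chunks that coincide exactly with a PERSON entity; offsets are
    # compared because Span equality can also take the span label into account
    if (chunk.start_char, chunk.end_char) not in person_spans:
        print(chunk.text)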

Output:

Apple ipod
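Matching whole chunks against whole entities only drops a chunk when its boundaries line up exactly with a PERSON entity. If a chunk merely contains a name (a title followed by a name, say), a token-level check may be more forgiving. The following is a minimal sketch under that assumption, not a definitive implementation: non_person_chunks and demo are hypothetical helper names rather than spaCy functions, and demo assumes the en_core_web_lg pipeline is installed.

```python
def non_person_chunks(doc):
    """Return the noun chunks that contain no token tagged as PERSON.

    `doc` is expected to be a spaCy Doc produced by a pipeline that
    includes both a parser and a named entity recognizer.
    """
    return [
        chunk
        for chunk in doc.noun_chunks
        # Token.ent_type_ is the entity label of the token ("" if none)
        if not any(token.ent_type_ == "PERSON" for token in chunk)
    ]


def demo():
    # imported here so the helper above can be reused without loading a model
    import spacy

    nlp = spacy.load("en_core_web_lg")  # assumption: this model is installed
    doc = nlp("John Smith is lookin for Apple ipod")
    for chunk in non_person_chunks(doc):
        print(chunk.text)
```

Calling demo() prints the chunks that survive the filter. Which tokens get tagged PERSON depends on the model, so results can vary between pipeline versions.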

Hope this helps.