Question

When using spaCy, I have the following:

import spacy

nlp = spacy.load('en_core_web_lg')

sentence = "a quick John jumps over the lazy dog"

tag_entities = [(x, x.ent_iob_, x.ent_type_) for x in nlp(sentence)]
inputlist = tag_entities

print (inputlist)

[(a, 'O', ''), (quick, 'O', ''), (John, 'B', 'PERSON'), (jumps, 'O', ''), (over, 'O', ''), (the, 'O', ''), (lazy, 'O', ''), (dog, 'O', '')]

This is a list of tuples. I want to extract the PERSON elements. This is what I'm doing:

for i in inputlist:
  if (i)[2] == "PERSON":
    print ((i)[0])

John

Is there a better way? Thanks.

Answer 1

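To keep the first element of every tuple whose third element is PERSON, use a list comprehension with an if clause at the end:

# ent_type avoids shadowing the built-in name `type`
filtered_taglist = [x for x, _, ent_type in tag_entities if ent_type == "PERSON"]

This is equivalent to:

filtered_taglist = []
for x, _, ent_type in tag_entities:
    if ent_type == "PERSON":
        filtered_taglist.append(x)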
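
For anyone who wants to try the filter without loading a spaCy model, here is a minimal sketch. It assumes plain strings in place of the spaCy Token objects from the question, and the helper name tokens_with_type is made up for illustration; it is not part of spaCy.

```python
# Stand-in data: plain strings replace the spaCy Token objects from the question.
tag_entities = [
    ("a", "O", ""),
    ("quick", "O", ""),
    ("John", "B", "PERSON"),
    ("jumps", "O", ""),
    ("over", "O", ""),
    ("the", "O", ""),
    ("lazy", "O", ""),
    ("dog", "O", ""),
]


def tokens_with_type(tagged, wanted):
    """Return the first field of each (token, iob, ent_type) tuple whose ent_type equals `wanted`."""
    return [token for token, _, ent_type in tagged if ent_type == wanted]


people = tokens_with_type(tag_entities, "PERSON")
print(people)  # ['John']
```

Unpacking the tuple in the for clause keeps the condition readable compared with indexing by position (i[2], i[0]).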

Answer 2

You can apply the if while building the list:

tag_entities = [(x, x.ent_iob_, x.ent_type_) for x in nlp(sentence) if x.ent_type_ == 'PERSON']

Or get just the names (the tokens themselves) directly, without building the tuple first:

names = [x for x in nlp(sentence) if x.ent_type_ == 'PERSON']
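
One caveat with filtering token by token: a multi-word name such as "John Smith" comes back as two separate tokens. spaCy exposes whole entity spans through doc.ents, which is probably the simplest route when a model is loaded. The sketch below shows the same idea on plain tuples, using the IOB tag to join tokens; it assumes string tokens instead of spaCy Token objects, and the function name person_names and the sample data are hypothetical.

```python
def person_names(tagged):
    """Join consecutive PERSON tokens (IOB tags 'B' then 'I') into full names."""
    names, current = [], []
    for token, iob, ent_type in tagged:
        if ent_type == "PERSON" and iob == "B":
            # A new entity starts: flush the previous one, if any.
            if current:
                names.append(" ".join(current))
            current = [token]
        elif ent_type == "PERSON" and iob == "I" and current:
            # Continuation of the entity currently being collected.
            current.append(token)
        else:
            # Not part of a PERSON entity: flush whatever was collected.
            if current:
                names.append(" ".join(current))
            current = []
    if current:
        names.append(" ".join(current))
    return names


# Made-up sample in the same (token, iob, ent_type) layout as the question.
sample = [
    ("John", "B", "PERSON"),
    ("Smith", "I", "PERSON"),
    ("met", "O", ""),
    ("Mary", "B", "PERSON"),
]
print(person_names(sample))  # ['John Smith', 'Mary']
```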

Get specific elements from a list of tuples