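I am trying to create a custom entity label called FRUIT using the rule-based Matcher (i.e. by adding on_match rules), following the spaCy guide. I'm using spaCy 2.0.11, so I believe the steps have changed compared to spaCy 1.X.
Example: doc = nlp('Tom wants to eat some apples at the United Nations')
Expected text and entity output: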
Tom PERSON
apples FRUIT
the United Nations ORG
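However, I get the following error: [E084] Error assigning label ID 7429577500961755728 to span: not in StringStore. My code is below. Strangely, when I change nlp.vocab.strings['FRUIT'] to nlp.vocab.strings['EVENT'], it works, but apples is then assigned the entity label EVENT. Has anyone else run into this?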
import spacy
from spacy.matcher import Matcher

# Assumption: the post does not show which model was loaded.
nlp = spacy.load('en_core_web_sm')
doc = nlp('Tom wants to eat some apples at the United Nations')
FRUIT = nlp.vocab.strings['FRUIT']

def add_ent(matcher, doc, i, matches):
    # Get the current match and create tuple of entity label, start and end.
    # Append entity to the doc's entity. (Don't overwrite doc.ents!)
    match_id, start, end = matches[i]
    doc.ents += ((FRUIT, start, end),)

matcher = Matcher(nlp.vocab)
pattern = [{'LOWER': 'apples'}]
# spaCy 2.x signature: add(key, on_match, *patterns)
matcher.add('AddApple', add_ent, pattern)
matches = matcher(doc)
for ent in doc.ents:
    print(ent.text, ent.label_)
Answer 0 (score: 3)
Oh, okay, I think I found the solution. If the label does not already exist, you have to add it to nlp.vocab.strings:
nlp.vocab.strings.add('FRUIT')
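A likely explanation for why 'EVENT' worked and 'FRUIT' did not: looking up a string in the string store appears to only compute its hash, while resolving a hash back to text requires the string to have been registered first. 'EVENT' is presumably already registered because it is one of the labels the loaded model ships with. The sketch below is a toy model of that behaviour, not spaCy's actual implementation; ToyStringStore, its hashing scheme, and the preloaded label list are all hypothetical.

```python
import hashlib


class ToyStringStore:
    """Hypothetical stand-in for a hash-keyed string table (not spaCy's code)."""

    def __init__(self, strings=()):
        self._by_hash = {}
        for s in strings:
            self.add(s)

    @staticmethod
    def _hash(string):
        # Any stable 64-bit hash is enough for the illustration.
        digest = hashlib.sha256(string.encode("utf8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, string):
        # Register the string so its hash can be resolved back to text later.
        key = self._hash(string)
        self._by_hash[key] = string
        return key

    def __getitem__(self, item):
        if isinstance(item, str):
            # String -> hash: computed on the fly, nothing is stored.
            return self._hash(item)
        # Hash -> string: raises KeyError unless the string was added.
        return self._by_hash[item]

    def __contains__(self, item):
        key = self._hash(item) if isinstance(item, str) else item
        return key in self._by_hash


# Assumed set of labels that a pretrained model would already have registered.
store = ToyStringStore(["PERSON", "ORG", "EVENT"])

fruit_id = store["FRUIT"]              # yields an ID, but "FRUIT" is still unknown
fruit_known_before = "FRUIT" in store
event_known = "EVENT" in store

store.add("FRUIT")                     # the fix from the answer, in toy form
fruit_known_after = "FRUIT" in store
resolved = store[fruit_id]
```

In the question's code, this would mean calling nlp.vocab.strings.add('FRUIT') before the line FRUIT = nlp.vocab.strings['FRUIT'], so that the label ID can be resolved when the span is created.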