Question

spaCy's displacy can render highlighted entities in HTML, like this:

import spacy
from spacy import displacy
nlp = spacy.load('en_core_web_sm')  # spaCy 3 removed the 'en' shortcut; load the model by package name
doc1 = nlp(u'This is a google sentence.')
doc2 = nlp(u'This is another sentence.')
html = displacy.render([doc1, doc2], style='ent', page=True)

How can I highlight all the verbs in a given text in the same way?

from __future__ import unicode_literals
import spacy,en_core_web_sm
import textacy
nlp = en_core_web_sm.load()
sentence = 'The cat jumped quickly over the wall.'
doc = textacy.Doc(sentence, lang='en_core_web_sm')
for token in doc:
    if (token.pos_ == 'VERB'):
        print(token.text)

Here the output, jumped, should be highlighted in green. How can I do that?

Something similar to this:

http://www.expresso-app.org/

Answer 1

You can use the displacy interface to highlight custom entities by setting manual=True on render() or serve(). Here is a simple example:

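from spacy import displacy

# 'start' and 'end' are character offsets into 'text'; here they cover "jumped"
sentence = [{'text': 'The cat jumped quickly over the wall.',
    'ents': [{'start': 8, 'end': 14, 'label': 'VERB'}],
    'title': None}]

displacy.render(sentence, style='ent', manual=True)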

In addition, to get the data into the required format, you can run a dependency parse and use PhraseMatcher on the result to obtain the start and end values.
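
For the specific case of verbs, the offsets can also be read straight off the tokens, since each spaCy token exposes its character offset as token.idx and its part of speech as token.pos_. The following is only a rough sketch of that idea, not part of the answer above: verb_spans and render_verbs are hypothetical helper names, the model name en_core_web_sm is assumed to be installed, and the green color is passed through displacy's 'colors' option.

```python
def verb_spans(doc, label="VERB"):
    """Return displacy 'manual' entity dicts for every token tagged as a verb.

    `doc` is any iterable of token-like objects with .text, .idx and .pos_,
    which is what a spaCy Doc yields when iterated.
    """
    return [
        {"start": token.idx, "end": token.idx + len(token.text), "label": label}
        for token in doc
        if token.pos_ == "VERB"
    ]


def render_verbs(text, model="en_core_web_sm"):
    """Tag `text` with spaCy and render its verbs highlighted in green.

    Hypothetical helper: assumes spaCy and the given model are installed.
    """
    import spacy
    from spacy import displacy

    nlp = spacy.load(model)
    doc = nlp(text)
    data = [{"text": doc.text, "ents": verb_spans(doc), "title": None}]
    # 'colors' maps a label to any CSS color; 'ents' limits which labels are shown
    options = {"ents": ["VERB"], "colors": {"VERB": "lightgreen"}}
    return displacy.render(data, style="ent", manual=True, options=options, page=True)
```

Calling render_verbs('The cat jumped quickly over the wall.') outside a notebook should give back an HTML string that can be written to a file. Which tokens end up highlighted depends on what the tagger labels as VERB.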

Answer 2

You can use displacy's render or serve method with the parameters style='ent' and manual=True to highlight custom entities.

from spacy import displacy
import re

text = 'The quick brown fox jumps over the lazy dog'
match_phrases = ['brown fox','lazy dog']
matches = []

for match_phrase in match_phrases:
    # re.escape treats the phrase as literal text rather than as a regex pattern
    for item in re.finditer(re.escape(match_phrase), text):
        match = {}
        match['start'], match['end'] = item.span()
        match['label'] = '\u2713' # The tag/label that you would like to display
        matches.append(match)
sentence = [{
    'text': text,
    'ents': matches
}]

displacy.render(sentence, style='ent', manual=True)
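
The same matching step can be wrapped in a small function that returns the spans sorted by start offset. This is only a sketch: find_phrase_spans and the default label 'MATCH' are made-up names, and the sorting is a defensive assumption on my part, since the entity renderer appears to walk the text from left to right and the phrases may be listed in a different order than they occur.

```python
import re


def find_phrase_spans(text, phrases, label="MATCH"):
    """Return displacy 'manual' entity dicts for each literal phrase occurrence."""
    spans = []
    for phrase in phrases:
        # re.escape keeps characters such as '.' or '(' from acting as regex syntax
        for item in re.finditer(re.escape(phrase), text):
            start, end = item.span()
            spans.append({"start": start, "end": end, "label": label})
    # Order by position in the text, not by the order of the phrase list
    return sorted(spans, key=lambda span: span["start"])
```

The result can be dropped into the 'ents' key of the dictionary passed to displacy.render in the example above.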

spaCy verb highlighting?
