Chunking in spaCy?

Time: 2018-04-21 09:54:53

Tags: python spacy

spaCy ships with noun chunks, but how can I create my own custom chunks based on tags?

import spacy

# noun_chunks relies on the dependency parse, so load a pipeline that includes a parser.
nlp = spacy.load('en_core_web_sm')
doc = nlp(u'The cat and the dog sleep in the basket near the door.')
for np in doc.noun_chunks:
    print(np.text)

How can I create user-defined chunk patterns using tags, such as the following?

Chunk   Description                 Tags
NP      noun phrase                 DT+RB+JJ+NN + PR
PP      prepositional phrase        TO+IN
VP      verb phrase                 RB+MD+VB
ADVP    adverb phrase               RB
ADJP    adjective phrase            CC+RB+JJ
SBAR    subordinating conjunction   IN

1 answer:

Answer 0 (score: 0)

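from __future__ import unicode_literals
import textacy

# Note: textacy.Doc and textacy.extract.pos_regex_matches belong to the textacy
# API as of this answer (2018) and may no longer exist in later releases.
sentence = 'The author is writing a new book.'
# An optional verb, any number of adverbs, then one or more verbs.
pattern = r'<VERB>?<ADV>*<VERB>+'
doc = textacy.Doc(sentence, lang='en_core_web_sm')
matches = textacy.extract.pos_regex_matches(doc, pattern)
for chunk in matches:  # renamed from `list` to avoid shadowing the built-in
    print(chunk.text)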
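
To make it clearer what a tag pattern such as <VERB>?<ADV>*<VERB>+ does, here is a minimal sketch in plain Python that applies the same idea to a list of (word, tag) pairs. It is not textacy's or spaCy's implementation: the helper name tag_regex_chunks is hypothetical, and the tags in the sample sentence are assigned by hand for illustration (a real tagger may label "is" differently, for example as AUX).

```python
import re


def tag_regex_chunks(tagged_tokens, pattern):
    """Return the text of every token run whose tag sequence matches `pattern`.

    `tagged_tokens` is a list of (word, tag) pairs. `pattern` is written with
    <TAG> items and ordinary regex quantifiers, e.g. r'<VERB>?<ADV>*<VERB>+'.
    This helper is hypothetical and only illustrates the technique.
    """
    # Encode the tag sequence as a single string: "<DET><NOUN><VERB>..."
    tag_string = "".join("<%s>" % tag for _, tag in tagged_tokens)
    # Wrap each <TAG> in a non-capturing group so ?, * and + apply to the whole tag.
    regex = re.sub(
        r"<([^>]+)>",
        lambda m: "(?:<%s>)" % re.escape(m.group(1)),
        pattern,
    )
    chunks = []
    for match in re.finditer(regex, tag_string):
        if match.start() == match.end():
            continue  # ignore empty matches from fully optional patterns
        # Counting "<" converts character offsets back into token indices.
        start = tag_string.count("<", 0, match.start())
        end = tag_string.count("<", 0, match.end())
        chunks.append(" ".join(word for word, _ in tagged_tokens[start:end]))
    return chunks


# Hand-assigned tags (an assumption, not tagger output).
tagged = [
    ("The", "DET"), ("author", "NOUN"), ("is", "VERB"), ("writing", "VERB"),
    ("a", "DET"), ("new", "ADJ"), ("book", "NOUN"), (".", "PUNCT"),
]

verb_chunks = tag_regex_chunks(tagged, r"<VERB>?<ADV>*<VERB>+")
noun_chunks = tag_regex_chunks(tagged, r"<DET>?<ADJ>*<NOUN>+")
print(verb_chunks)  # ['is writing']
print(noun_chunks)  # ['The author', 'a new book']
```

The chunk types listed in the question could be expressed the same way, one pattern per chunk label, as long as the patterns use the same tag set as the tagger that produced the (word, tag) pairs.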