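Can I find the subject from a spaCy dependency tree using NLTK in Python?

Asked: 2016-09-05 02:54:03

Tags: python nlp spacy

I want to use spaCy to find the subject of a sentence. The code below works fine and produces a dependency tree: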

import spacy
from nltk import Tree

# 'en' is a legacy shortcut; spaCy 3 and later need a full package name such as 'en_core_web_sm'
en_nlp = spacy.load('en')

doc = en_nlp("The quick brown fox jumps over the lazy dog.")

def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_


# Print one tree per sentence
for sent in doc.sents:
    to_nltk_tree(sent.root).pretty_print()


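Can I find the subject of this sentence from this dependency tree code?

3 answers:

Answer 0 (score: 10)

I'm not sure whether you want to write the code using the NLTK parse tree (see How to identify the subject of a sentence?). However, spaCy also identifies the subject itself: it is the token whose word.dep_ attribute has the label 'nsubj'.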

>>> import spacy
>>> from nltk import Tree
>>> en_nlp = spacy.load('en')
>>> doc = en_nlp("The quick brown fox jumps over the lazy dog.")
>>> sentence = next(doc.sents)
>>> for word in sentence:
...     print("%s:%s" % (word, word.dep_))
...
The:det
quick:amod
brown:amod
fox:nsubj
jumps:ROOT
over:prep
the:det
lazy:amod
dog:pobj

Keep in mind that there can be more complicated cases, where a sentence has more than one subject.

>>> doc2 = en_nlp(u'When we study hard, we usually do well.')
>>> sentence2 = next(doc2.sents)
>>> for word in sentence2:
...     print("%s:%s" % (word, word.dep_))
... 
When:advmod
we:nsubj
study:advcl
hard:advmod
,:punct
we:nsubj
usually:advmod
do:ROOT
well:advmod
.:punct
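
When there are several 'nsubj' tokens, as in the second sentence, it can help to pair each subject with the word it depends on. The following is only a minimal sketch under stated assumptions: `subjects_with_verbs` and `make_tokens` are hypothetical names, the tokens are hand-built stand-ins rather than real spaCy tokens, and the head indices are filled in by hand because the output above shows only the labels. Treating 'nsubjpass' as a subject label is also an assumption. The function reads only `text`, `dep_` and `head`, which spaCy tokens expose as well, so passing a spaCy sentence should behave the same way.

```python
from types import SimpleNamespace


def subjects_with_verbs(tokens):
    """Return (subject text, head text) pairs for every subject-labeled token."""
    subject_labels = {"nsubj", "nsubjpass"}  # assumption: passive subjects count too
    return [(tok.text, tok.head.text) for tok in tokens if tok.dep_ in subject_labels]


def make_tokens(rows):
    """Build stand-in tokens from (text, dep, head_index) rows (hypothetical helper)."""
    toks = [SimpleNamespace(text=text, dep_=dep, head=None) for text, dep, _ in rows]
    for tok, (_, _, head_index) in zip(toks, rows):
        tok.head = toks[head_index]
    return toks


# Labels copied from the output above; head indices are assumed, not from the source.
rows = [
    ("When", "advmod", 2),
    ("we", "nsubj", 2),
    ("study", "advcl", 7),
    ("hard", "advmod", 2),
    (",", "punct", 7),
    ("we", "nsubj", 7),
    ("usually", "advmod", 7),
    ("do", "ROOT", 7),
    ("well", "advmod", 7),
    (".", "punct", 7),
]
tokens = make_tokens(rows)
pairs = subjects_with_verbs(tokens)
print(pairs)
```

The subject whose head carries the 'ROOT' label is the subject of the main clause; the other one belongs to the subordinate clause.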

Answer 1 (score: 0)

Like leavesof3, I prefer to use spaCy for this kind of task. It has better visualization, i.e.


The subject will be the word (or the phrase, if you use noun chunks) whose dependency attribute is "nsubj", i.e. "nominal subject".
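
The noun-chunk variant could look roughly like the sketch below. It relies on `doc.noun_chunks`, `chunk.root` and `chunk.text`, which spaCy provides on a parsed document. The function name `subject_phrases` is hypothetical, and the `doc` object here is a hand-written stand-in (not a real spaCy parse) whose chunk texts are assumed from the "fox:nsubj" and "dog:pobj" labels shown in the first answer.

```python
from types import SimpleNamespace


def subject_phrases(doc):
    """Return the text of every noun chunk whose root token is labeled 'nsubj'."""
    return [chunk.text for chunk in doc.noun_chunks if chunk.root.dep_ == "nsubj"]


# Stand-in for a parsed document; with spaCy you would pass nlp("...") instead.
doc = SimpleNamespace(
    noun_chunks=[
        SimpleNamespace(text="The quick brown fox", root=SimpleNamespace(dep_="nsubj")),
        SimpleNamespace(text="the lazy dog", root=SimpleNamespace(dep_="pobj")),
    ]
)
phrases = subject_phrases(doc)
print(phrases)
```

Filtering on the chunk's root keeps the determiner and adjectives attached to the subject, instead of returning the bare head noun.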

You can access the displaCy (spaCy visualization) demo here.

Answer 2 (score: 0)

Try this:

import spacy
import en_core_web_sm
nlp = spacy.load('en_core_web_sm')
sent = "I need to be able to log into the Equitable siteI tried my username and password from the AXA Equitable site which worked fine yesterday but it won't allow me to log in and when I try to change my password it says my answer is incorrect for the secret question I just need to be able to log into the Equitable site"
nlp_doc=nlp(sent)
subject = [tok for tok in nlp_doc if (tok.dep_ == "nsubj") ]
print(subject)