(ROOT
  (S
    (NP (NNP Barack) (NNP Obama))
    (VP
      (VBZ is)
      (NP
        (NP (DT the) (JJ 44th) (CC and) (JJ current) (NN president))
        (PP (IN of) (NP (DT the) (NNP USA)))))
    (. .)))
How can I get the value of this subtree, given that it is the topmost NP in the tree?
(NP (NNP Barack) (NNP Obama))
Answer 0 (score: 0)
You can traverse the tree and, for each child, check that isinstance(child, Tree)
and child.label() == "NP".
Code:
from nltk import Tree

parse_str = "(ROOT (S (NP (NNP Barack) (NNP Obama)) (VP (VBZ is) (NP (NP (DT the) (JJ 44th) (CC and) (JJ current) (NN president)) (PP (IN of) (NP (DT the) (NNP USA))))) (. .)))"
t = Tree.fromstring(parse_str)
#alltrees = []

def getsubtrees(t):
    # t is either a Tree or a list of Trees; look at the children of each element.
    subtrees = []
    for subt in t:
        for child in subt:
            if isinstance(child, Tree) and child.label() == "NP":
                print(child)
                break  # Remove break to print all NPs
            if isinstance(child, Tree):
                subtrees.append(child)
    #alltrees.append(subtrees)
    # Descend one level only when more than one subtree was collected.
    if len(subtrees) > 1:
        getsubtrees(subtrees)

getsubtrees(t)
#print(alltrees)
Output:
(NP (NNP Barack) (NNP Obama))
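The same idea can also be written as an explicit breadth-first search that returns the NP instead of printing it. This is only a sketch, not part of the answer above: the function name find_shallowest and its label parameter are made up for illustration, and it assumes "top NP" means the NP closest to the root.

```python
from collections import deque

from nltk import Tree

parse_str = "(ROOT (S (NP (NNP Barack) (NNP Obama)) (VP (VBZ is) (NP (NP (DT the) (JJ 44th) (CC and) (JJ current) (NN president)) (PP (IN of) (NP (DT the) (NNP USA))))) (. .)))"


def find_shallowest(tree, label="NP"):
    """Return the first subtree with the given label in breadth-first order.

    Hypothetical helper: returns None when no subtree carries the label.
    """
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if node.label() == label:
            return node
        # Leaves are plain strings, so only Tree children are queued.
        queue.extend(child for child in node if isinstance(child, Tree))
    return None


t = Tree.fromstring(parse_str)
top_np = find_shallowest(t)
print(top_np)
print(top_np.leaves())
```

Tree.subtrees(filter=...) is a shorter way to collect every NP, but it walks the tree depth-first, so its first match is not guaranteed to be the shallowest one in every sentence.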