How can I traverse a tree with BFS to get the first NP value? (NLTK)

Asked: 2016-09-05 04:13:16

Tags: tree nltk

(ROOT
  (S
    (NP (NNP Barack) (NNP Obama))
    (VP
      (VBZ is)
      (NP
        (NP (DT the) (JJ 44th) (CC and) (JJ current) (NN president))
        (PP (IN of) (NP (DT the) (NNP USA)))))
    (. .)))

How can I get the following subtree, given that it is the topmost NP in the tree?

(NP (NNP Barack) (NNP Obama))

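1 answer:

Answer 0: (score: 0)

You can traverse the tree level by level and, for each child, check that isinstance(child, Tree) and child.label() == "NP". The first match is the topmost NP.

Code: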

from nltk import Tree

parse_str = "(ROOT (S (NP (NNP Barack) (NNP Obama)) (VP (VBZ is) (NP (NP (DT the) (JJ 44th) (CC and) (JJ current) (NN president)) (PP (IN of) (NP (DT the) (NNP USA))))) (. .)))"
t = Tree.fromstring(parse_str)

def getsubtrees(level):
    """Inspect one level of the tree at a time and print the first NP found."""
    subtrees = []
    for subt in level:
        for child in subt:
            # Leaves are plain strings, so only Tree children are examined.
            if isinstance(child, Tree):
                if child.label() == "NP":
                    print(child)
                    return child    # Remove this return to print all NPs
                subtrees.append(child)

    # Move on to the next level while there are subtrees left to inspect.
    if subtrees:
        return getsubtrees(subtrees)
    return None

getsubtrees(t)

Output:

(NP (NNP Barack) (NNP Obama))
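
The same level-order search can also be written with an explicit queue instead of recursion. The following is only a minimal sketch: so that it runs without NLTK installed, it uses a hypothetical `Node` class as a stand-in for `nltk.Tree` (a list of children with a `label()` method), a shortened version of the sentence, and a helper name, `first_with_label`, that is made up for illustration. With NLTK, swapping `Node` for `Tree` in the `isinstance` check should give the same behavior, since `nltk.Tree` is also a list subclass with `label()`.

```python
from collections import deque


class Node(list):
    """Hypothetical stand-in for nltk.Tree: a list of children plus a label."""

    def __init__(self, label, children=()):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


def first_with_label(root, target):
    """Return the first node labelled `target` in breadth-first order, or None."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.label() == target:
            return node
        # Leaves are plain strings; only subtrees go back on the queue.
        queue.extend(child for child in node if isinstance(child, Node))
    return None


# Shortened version of the parse tree from the question.
tree = Node("ROOT", [
    Node("S", [
        Node("NP", [Node("NNP", ["Barack"]), Node("NNP", ["Obama"])]),
        Node("VP", [
            Node("VBZ", ["is"]),
            Node("NP", [Node("DT", ["the"]), Node("NNP", ["USA"])]),
        ]),
    ]),
])

first_np = first_with_label(tree, "NP")
print(first_np.label(), [child[0] for child in first_np])
```

Because the queue is first-in, first-out, every node at one depth is visited before any node at the next depth, so the NP directly under S is returned before the NP nested inside the VP.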