Question

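I have already asked how to properly navigate an NLTK tree:

How do I properly navigate through an NLTK tree (or ParentedTree)? I would like to identify a certain leaf with the parent node "VBZ", then I would like to move from there further up the tree and to the left to identify the NP node.

and provided the following illustration: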

NLTK tree

I received the following (very helpful) answer from Tommy (thank you!):

from nltk.tree import ParentedTree

np_trees = []

def traverse(t):
    # Leaves are plain strings and have no label(); stop there.
    try:
        t.label()
    except AttributeError:
        return

    if t.label() == "VBZ":
        current = t
        # Walk up towards the root...
        while current.parent() is not None:
            # ...and at each level scan the siblings to the left.
            while current.left_sibling() is not None:
                if current.left_sibling().label() == "NP":
                    np_trees.append(current.left_sibling())
                current = current.left_sibling()
            current = current.parent()

    for child in t:
        traverse(child)

tree = ParentedTree.fromstring("(S (NP (NNP)) (VP (VBZ) (NP (NNP))))")
traverse(tree)
print(np_trees)  # [ParentedTree('NP', [ParentedTree('NNP', [])])]

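But how can I add the condition that only those NP nodes that have an NNP child are extracted?

Any help would again be appreciated.

(In general, if there are any NLTK tree experts among you, I would be happy to chat and pay for some coffee in exchange for some insight.)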

Answer 1

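I usually use the subtrees function in combination with a filter. I changed the tree slightly to show that it now selects only one of the NPs: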

>>> tree = ParentedTree.fromstring("(S (NP (NNP)) (VP (VBZ) (NP (NNS))))")
>>> for st in tree.subtrees(filter = lambda x: x.label() == "NP" and x[0].label() == 'NNP'):
...     print(st)
... 
(NP (NNP ))

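However, this may crash when your subtree / x[0] has no label (e.g. when it is a terminal), and an IndexError is thrown when the NP is completely empty. I'd say these scenarios are unlikely, but it is quite possible that I am overlooking something here, so you may want to build in some extra checks...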
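As a rough sketch of such extra checks, the lambda could be replaced by a named predicate that tolerates terminals and empty NPs. The helper name `is_np_with_nnp_child` is made up for this illustration, and it deliberately looks at every child of the NP rather than only `x[0]`, which is an assumption about what "has an NNP child" should mean:

```python
from nltk.tree import ParentedTree, Tree

def is_np_with_nnp_child(t):
    """Return True if t is an NP subtree with at least one NNP child.

    Hypothetical helper, not part of NLTK.
    """
    if t.label() != "NP":
        return False
    # Leaves are plain strings without label(), so only inspect children
    # that are themselves trees. any() over an empty NP is simply False,
    # so no IndexError can occur.
    return any(isinstance(child, Tree) and child.label() == "NNP" for child in t)

tree = ParentedTree.fromstring("(S (NP (NNP)) (VP (VBZ) (NP (NNS))))")
matches = list(tree.subtrees(filter=is_np_with_nnp_child))
for st in matches:
    print(st)
```

The same predicate could presumably also be used in the traverse function from the question, in place of the plain label() == "NP" test on the left sibling, as long as that sibling is checked to be a tree first.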

Navigating an NLTK tree (follow-up)
