Question

I chunked a sentence using the following:

import nltk

grammar = '''
    NP:
       {<DT>*(<NN.*>|<JJ.*>)*<NN.*>}
    NVN:
       {<NP><VB.*><NP>}
    '''
chunker = nltk.chunk.RegexpParser(grammar)
tree = chunker.parse(tagged)  # tagged: list of (word, POS tag) tuples
print(tree)

The result is:

(S
  (NVN
    (NP The_Pigs/NNS)
    are/VBP
    (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
  that/WDT
  formed/VBN
  in/IN
  1977/CD
  ./.)

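But now I'm stuck trying to figure out how to navigate it. I want to be able to find the NVN subtree and access the left noun phrase ("The_Pigs"), the verb ("are"), and the right noun phrase ("a Bristol-based punk rock band"). How do I do that?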

Answer 1

Try:

import nltk

ROOT = 'ROOT'
tree = ...  # your parsed nltk.Tree goes here
def getNodes(parent):
    for node in parent:
        if isinstance(node, nltk.Tree):
            if node.label() == ROOT:
                print("======== Sentence =========")
                # str() so that (word, tag) tuple leaves can be joined as well
                print("Sentence:", " ".join(str(leaf) for leaf in node.leaves()))
            else:
                print("Label:", node.label())
                print("Leaves:", node.leaves())

            getNodes(node)
        else:
            print("Word:", node)

getNodes(tree)
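
The same recursive walk can also be written as a generator, which makes the results easier to reuse than printed output. This is only a sketch: `walk` is a hypothetical helper name, and the small tree is rebuilt with `Tree.fromstring` for illustration, so its leaves are plain "word/TAG" strings rather than the (word, tag) tuples a chunker produces.

```python
from nltk.tree import Tree

def walk(node, depth=0):
    """Yield (depth, label_or_leaf) pairs in depth-first order."""
    if isinstance(node, Tree):
        yield depth, node.label()
        for child in node:
            # Recurse into each child, one level deeper
            yield from walk(child, depth + 1)
    else:
        # Anything that is not a Tree is a leaf
        yield depth, node

# Shortened version of the sentence from the question (assumption for brevity)
tree = Tree.fromstring("(S (NVN (NP The_Pigs/NNS) are/VBP (NP a/DT band/NN)) ./.)")

for depth, item in walk(tree):
    print("  " * depth + str(item))
```

Unlike getNodes, this version also visits the root node itself, because it inspects the node it is given before descending into the children.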

Answer 2

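Sure, you can write your own depth-first search... but there is an easier (and better) way. If you want every subtree rooted at NVN, use Tree's subtrees method with the filter parameter defined.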

>>> print(t)
(S
    (NVN
        (NP The_Pigs/NNS)
        are/VBP
        (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
    that/WDT
    formed/VBN
    in/IN
    1977/CD
    ./.)
>>> for i in t.subtrees(filter=lambda x: x.label() == 'NVN'):  # NLTK 3: label() replaces .node
...     print(i)
... 
(NVN
    (NP The_Pigs/NNS)
    are/VBP
    (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
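
Once an NVN subtree is found this way, its three children can be pulled apart by position, since a Tree behaves like a list of its children. The following is a minimal sketch under a few assumptions: `nvn_parts` is a hypothetical helper name, the tree is rebuilt from its printed form with `Tree.fromstring`, and the `read_leaf` callback is used to turn each "word/TAG" leaf back into a (word, tag) tuple like the chunker's output.

```python
from nltk.tree import Tree

# Rebuild the tree printed above; read_leaf converts "word/TAG" to (word, tag).
t = Tree.fromstring(
    "(S (NVN (NP The_Pigs/NNS) are/VBP"
    " (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))"
    " that/WDT formed/VBN in/IN 1977/CD ./.)",
    read_leaf=lambda s: tuple(s.rsplit("/", 1)),
)

def nvn_parts(tree):
    """Yield (left_words, verb, right_words) for each NVN subtree."""
    for nvn in tree.subtrees(filter=lambda st: st.label() == "NVN"):
        # The NVN rule {<NP><VB.*><NP>} always yields exactly three children
        left_np, verb, right_np = nvn
        yield (
            [word for word, tag in left_np.leaves()],
            verb[0],  # verb is a (word, tag) leaf
            [word for word, tag in right_np.leaves()],
        )

for left, verb, right in nvn_parts(t):
    print(left, verb, right)
```

Unpacking by position relies on the grammar from the question; with a different NVN rule the number of children could vary, and indexing would need to be adjusted.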

Answer 3

Try this:

for a in tree:
    if isinstance(a, nltk.Tree):
        if a.label() == 'NVN':  # This climbs into your NVN tree
            for b in a:
                if isinstance(b, nltk.Tree) and b.label() == 'NP':
                    print(b.leaves())  # This outputs your "NP"
                else:
                    print(b)  # This outputs your "VB.*"

Output:

[('The_Pigs', 'NNS')]
('are', 'VBP')
[('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'), ('rock', 'NN'), ('band', 'NN')]

Answer 4

Here is a code sample that generates all subtrees with the label 'NP':

def filt(x):
    return x.label() == 'NP'

for subtree in t.subtrees(filter=filt):  # Generate all subtrees labeled NP
    print(subtree)

For siblings, you might want to take a look at the methods ParentedTree.left_sibling() and ParentedTree.right_sibling().
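
As a rough illustration of sibling navigation, the sketch below builds a ParentedTree for the sentence from the question and asks each NP for its neighbours using `left_sibling()`, `right_sibling()`, and `parent()`. The tree string and the `read_leaf` callback are assumptions made so the example runs on its own; an existing Tree could presumably be converted with `ParentedTree.convert(tree)` instead.

```python
from nltk.tree import ParentedTree

# ParentedTree nodes remember their parent, which plain Tree nodes do not.
pt = ParentedTree.fromstring(
    "(S (NVN (NP The_Pigs/NNS) are/VBP"
    " (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))"
    " that/WDT formed/VBN in/IN 1977/CD ./.)",
    read_leaf=lambda s: tuple(s.rsplit("/", 1)),
)

noun_phrases = list(pt.subtrees(filter=lambda st: st.label() == "NP"))

for noun_phrase in noun_phrases:
    print("NP:    ", noun_phrase.leaves())
    print("parent:", noun_phrase.parent().label())
    # A sibling may be a subtree, a leaf, or None at the edge of the parent
    print("left:  ", noun_phrase.left_sibling())
    print("right: ", noun_phrase.right_sibling())
```

Note that the sibling methods exist only on tree nodes. Leaves such as ('are', 'VBP') are plain tuples, so the verb is reached from one of its NP neighbours rather than the other way around.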

For more details, see the following links:

http://www.nltk.org/howto/tree.html # some basic usage and examples
http://nbviewer.ipython.org/github/gmonce/nltk_parsing/blob/master/1.%20NLTK%20Syntax%20Trees.ipynb # a notebook using these methods
http://www.nltk.org/_modules/nltk/tree.html # all API with source

How to navigate a nltk.tree.Tree?

4 answers: