Question

I am using the Stanford parser from Python by doing the following:

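import os

sentence = "Did Matt win the men slalom?"
# Write the sentence to a temporary file; os.system waits for the command to finish.
os.system("echo '" + sentence + "' > ~/stanfordtemp.txt")
# Run the parser script on that file and collect its output lines.
parser_out = os.popen(
    "~/stanford-parser-2012-11-12/lexparser.sh ~/stanfordtemp.txt"
).readlines()

for tree in parser_out:
    print(tree)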

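However, I don't know how to access the leaves of the tree returned by the parser. Can you help me? I also have to write code that can generate SQL queries from English sentences. Any hints on that? Any help would be much appreciated. I am also using NLTK for all of this.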

Answer 1

Here is an example that builds a tree and then recursively builds a list of its leaves. The sample text is from the online Stanford parser.

# class for tree nodes
class Node:
    def __init__(self,start):
        self.start = start
        self.children = []
        self.text = ''

# make a tree
def make_tree(s):
    stack = []
    cur = None
    root = None

    for i, c in enumerate(s):
        if c == '(':
            cur = Node(i)
            if stack:
                stack[-1].children.append(cur)
            stack.append(cur)

            if root is None:
                root = cur

        elif c == ')' and stack:
            topnode = stack.pop()

            text = s[topnode.start + 1: i]
            topnode.text = text

    return root

# list of leaves
def list_of_leaves(node):
    result = []
    for child in node.children:
        result.extend(list_of_leaves(child))
    if not result:
        return [node]

    return result

s = """(ROOT
  (SQ (VBD Did)
    (NP (NNP Matt))
    (VP (VB win)
      (NP (DT the) (NNS men) (NN slalom)))
    (. ?)))"""

root = make_tree(s)

# Each leaf's text holds the tag and the word, e.g. "VBD Did".
for node in list_of_leaves(root):
    print(node.text)
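
If only the words are needed, the tag can be separated from each leaf. The following is a minimal sketch, not part of the code above: it assumes the bracketed output format shown in the sample, where every leaf looks like (TAG word), and it uses a regular expression instead of building a tree. The names LEAF_RE and leaf_words are made up for illustration.

```python
import re

# Matches a leaf such as "(VBD Did)": a tag and a word, with no nested parentheses.
LEAF_RE = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")

def leaf_words(parse):
    """Return (tag, word) pairs for every leaf in a bracketed parse string."""
    return LEAF_RE.findall(parse)

s = """(ROOT
  (SQ (VBD Did)
    (NP (NNP Matt))
    (VP (VB win)
      (NP (DT the) (NNS men) (NN slalom)))
    (. ?)))"""

pairs = leaf_words(s)
words = [word for tag, word in pairs]
print(words)
```

This shortcut would break on words that themselves contain whitespace or raw parentheses, so the tree-building approach is the safer choice when the structure matters.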

Answer 2

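How can I extract the individual clauses of a sentence as subtrees? That is, whenever a clause begins (S, SBAR, SBARQ, etc.), extract it as a subtree until another clause is encountered. For the last clause, it runs until the end of the sentence.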

Here is an example:

(ROOT (S (S (NP (NNP John)) (VP (VBZ lives) (PP (IN in) (NP (NNP New) (NNP York) (NN city))))) (, ,) (CC but) (S (SBAR (WHADVP (WRB whenever)) (S (NP (PRP he)) (VP (VBZ travels) (S (VP (TO to) (VP (VB work))))))) (, ,) (NP (PRP he)) (VP (VBZ travels) (ADVP (RB very) (RB far)) (PP (TO to) (NP (PRP$ his) (NN work) (NN place))))) (. .)))
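
One possible reading of the procedure described above can be sketched as follows. This is only an assumption about what is wanted: each clause node collects the words beneath it, and a nested clause starts its own group instead of contributing to its parent. The set CLAUSE_TAGS and the functions parse_brackets and split_clauses are hypothetical names, and the code is plain Python rather than anything provided by the Stanford parser or NLTK.

```python
CLAUSE_TAGS = {"S", "SBAR", "SBARQ", "SINV", "SQ"}  # assumed set of clause labels

def parse_brackets(text):
    """Parse a bracketed tree into nested lists: [label, child, ...]; words are strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # sentinel list that will hold the root node
    for tok in tokens:
        if tok == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif tok == ")":
            stack.pop()
        else:
            stack[-1].append(tok)  # first token after "(" is the label
    return stack[0][0]

def split_clauses(node, clauses, current=None):
    """Append (label, words) for each clause; nested clauses get their own entry."""
    label, children = node[0], node[1:]
    if label in CLAUSE_TAGS:
        current = []
        clauses.append((label, current))
    for child in children:
        if isinstance(child, list):
            split_clauses(child, clauses, current)
        elif current is not None:
            current.append(child)
    return clauses

tree_text = """(ROOT (S (S (NP (NNP John)) (VP (VBZ lives) (PP (IN in)
  (NP (NNP New) (NNP York) (NN city))))) (, ,) (CC but)
  (S (SBAR (WHADVP (WRB whenever)) (S (NP (PRP he)) (VP (VBZ travels)
  (S (VP (TO to) (VP (VB work))))))) (, ,) (NP (PRP he))
  (VP (VBZ travels) (ADVP (RB very) (RB far))
  (PP (TO to) (NP (PRP$ his) (NN work) (NN place))))) (. .)))"""

clauses = split_clauses(parse_brackets(tree_text), [])
for label, words in clauses:
    print(label, " ".join(words))
```

Clauses are listed in the order their opening brackets appear, so the outer S comes first and holds only the material that belongs to no inner clause (the comma, "but", and the final period).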

How do I get to the leaves of the tree generated by the Stanford parser in Python?
