I am using the Stanford parser from Python by doing the following:
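import os
sentence = "Did Matt win the men slalom?"
# write the sentence to a temporary file for the parser to read
os.popen("echo '"+sentence+"' > ~/stanfordtemp.txt")
# run the parser script and collect its output line by line
parser_out = os.popen("~/stanford-parser-2012-11-12/lexparser.sh ~/stanfordtemp.txt").readlines()
for tree in parser_out:
    print(tree)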
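However, I don't know how to access the leaves of the tree returned by the parser. Can you help me? I also have to write code that can generate SQL queries from English sentences. Any hints on that? Any help is much appreciated. I am also using nltk for everything.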
Answer 0 (score: 1)
Here is an example that builds a tree and then recursively builds a list of its leaves. The sample text comes from the online Stanford parser.
# class for tree nodes
class Node:
    def __init__(self, start):
        self.start = start      # index of the node's opening '(' in the string
        self.children = []
        self.text = ''          # text between the node's parentheses

# make a tree
def make_tree(s):
    stack = []                  # nodes whose ')' has not been seen yet
    cur = None
    root = None
    for i, c in enumerate(s):
        if c == '(':
            cur = Node(i)
            if stack:
                stack[-1].children.append(cur)
            stack.append(cur)
            if root is None:
                root = cur
        elif c == ')' and stack:
            topnode = stack.pop()
            text = s[topnode.start + 1: i]
            topnode.text = text
    return root

# list of leaves
def list_of_leaves(node):
    result = []
    for child in node.children:
        result.extend(list_of_leaves(child))
    # a node with no children is itself a leaf
    if not result:
        return [node]
    return result

s = """(ROOT
  (SQ (VBD Did)
    (NP (NNP Matt))
    (VP (VB win)
      (NP (DT the) (NNS men) (NN slalom)))
    (. ?)))"""

root = make_tree(s)
for node in list_of_leaves(root):
    print(node.text)
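Note that each leaf's text printed above still includes its part-of-speech tag (for example `VBD Did`). If only the (tag, word) pairs are needed, a regular expression over the bracketed string may be a simpler route. The following is a minimal sketch, not part of the original answer: the helper name `tagged_leaves` is made up for illustration, and it assumes every leaf has the form `(TAG word)` with no spaces or parentheses inside the word.

```python
import re

# Matches innermost "(TAG word)" groups only: neither part may contain
# whitespace or parentheses, so phrase-level nodes like "(NP (...))" are skipped.
LEAF_RE = re.compile(r'\(([^\s()]+)\s+([^\s()]+)\)')

def tagged_leaves(s):
    """Return a list of (tag, word) pairs for the leaves of a bracketed parse.

    Hypothetical helper; assumes leaves look like "(TAG word)".
    """
    return LEAF_RE.findall(s)

s = """(ROOT
  (SQ (VBD Did)
    (NP (NNP Matt))
    (VP (VB win)
      (NP (DT the) (NNS men) (NN slalom)))
    (. ?)))"""

pairs = tagged_leaves(s)
words = [word for tag, word in pairs]
print(words)
```

This skips building a tree altogether, so it only helps when the structure above the leaves is not needed.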
Answer 1 (score: 0)
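How can I extract the individual clauses of a sentence as subtrees? That is, whenever a clause begins (S, SBAR, SBARQ, etc.), extract it as a subtree up to the point where another clause is encountered. The last clause runs to the end of the sentence.
Here is an example: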
(ROOT
  (S
    (S
      (NP (NNP John))
      (VP (VBZ lives)
        (PP (IN in)
          (NP (NNP New) (NNP York) (NN city)))))
    (, ,)
    (CC but)
    (S
      (SBAR
        (WHADVP (WRB whenever))
        (S
          (NP (PRP he))
          (VP (VBZ travels)
            (S
              (VP (TO to)
                (VP (VB work)))))))
      (, ,)
      (NP (PRP he))
      (VP (VBZ travels)
        (ADVP (RB very) (RB far))
        (PP (TO to)
          (NP (PRP$ his) (NN work) (NN place)))))
    (. .)))
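One possible reading of the request above is: for every clause node, collect the words it covers but stop descending whenever a nested clause starts. The following is a rough sketch under that assumption, using plain nested lists rather than any parser library. The names `parse`, `own_words`, `clauses`, and `CLAUSE_LABELS`, the set of labels, and the short sample sentence are all made up for illustration.

```python
# Labels treated as clause boundaries (an assumption; extend as needed).
CLAUSE_LABELS = {"S", "SBAR", "SBARQ", "SINV", "SQ"}

def parse(s):
    """Parse a bracketed tree into nested lists: [label, child, child, ...].

    Words are plain strings; every other node is a list.
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    return stack[0][0]

def own_words(node):
    """Words under `node`, not descending into nested clauses."""
    words = []
    for child in node[1:]:
        if isinstance(child, str):
            words.append(child)
        elif child[0] not in CLAUSE_LABELS:
            words.extend(own_words(child))
    return words

def clauses(node):
    """Yield (label, words) for every clause node, in pre-order."""
    if isinstance(node, str):
        return
    if node[0] in CLAUSE_LABELS:
        yield node[0], own_words(node)
    for child in node[1:]:
        yield from clauses(child)

# Hypothetical sample parse, not taken from the thread.
sample = ("(ROOT (S (NP (PRP He)) (VP (VBD said) "
          "(SBAR (IN that) (S (NP (PRP it)) (VP (VBD rained))))) (. .)))")

result = list(clauses(parse(sample)))
for label, words in result:
    print(label, " ".join(words))
```

If the whole subtree is wanted instead of just the words, `clauses` could yield `node` itself; the stopping rule in `own_words` is the only part that encodes the "until another clause is encountered" condition.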