I want to retrieve subtrees when parsing a sentence, as follows:
from nltk.parse import stanford

sentence = "All new medications must undergo testing before they can be prescribed"
parser = stanford.StanfordParser()
tree_parse = parser.raw_parse(sentence)
for i, sub_tree in enumerate(tree_parse[0].subtrees()):
    if sub_tree.label() in ["S"]:
        sub_list = sub_tree
        print(sub_list)
I expected to access each subtree labelled "S" separately, as follows:
(S
  (NP (DT All) (JJ new) (NNS medications))
  (VP
    (MD must)
    (VP
      (VB undergo)
      (S
        (VP
          (VBG testing)
          (SBAR
            (IN before)
            (S
              (NP (PRP they))
              (VP (MD can) (VP (VB be) (VP (VBN prescribed)))))))))))
But the actual output is as follows:
(NP (DT All) (JJ new) (NNS medications))
(VP
  (MD must)
  (VP
    (VB undergo)
    (S
      (VP
        (VBG testing)
        (SBAR
          (IN before)
          (S
            (NP (PRP they))
            (VP (MD can) (VP (VB be) (VP (VBN prescribed))))))))))
How can I access the subtrees individually, like accessing items in a list?
Answer 0 (score: 1)
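You are already getting the subtrees: a subtree contains everything under its root, so the output you show is correctly retrieved as the "subtree" under the top-level S. Your loop will then print the S that dominates "testing before they can be prescribed", and finally the lowest S, which dominates "they can be prescribed".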
Incidentally, you can get the S subtrees directly by specifying a filter:
for sub_tree in tree_parse[0].subtrees(lambda t: t.label() == "S"):
    print(sub_tree)
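To access the S subtrees "like items in a list", one option is to collect the filtered subtrees into an actual list and index into it. The following is only a minimal sketch: it assumes the parse shown in the question and rebuilds it with `Tree.fromstring` so that it runs without the Stanford parser. The names `bracketed` and `s_subtrees` are made up for illustration.

```python
from nltk.tree import Tree

# The parse from the question, rebuilt from its bracketed form
# (an assumption made so the example runs without the Stanford parser).
bracketed = """
(S
  (NP (DT All) (JJ new) (NNS medications))
  (VP
    (MD must)
    (VP
      (VB undergo)
      (S
        (VP
          (VBG testing)
          (SBAR
            (IN before)
            (S
              (NP (PRP they))
              (VP (MD can) (VP (VB be) (VP (VBN prescribed)))))))))))
"""
tree = Tree.fromstring(bracketed)

# subtrees() yields the tree itself first, then its descendants,
# so the list runs from the outermost S to the innermost one.
s_subtrees = list(tree.subtrees(filter=lambda t: t.label() == "S"))

# Each entry is a complete Tree and can be indexed like any list item.
for index, sub_tree in enumerate(s_subtrees):
    print(index, " ".join(sub_tree.leaves()))
```

Here `s_subtrees[0]` is the whole sentence, `s_subtrees[1]` covers "testing before they can be prescribed", and `s_subtrees[2]` covers "they can be prescribed". Each larger subtree still contains the smaller ones, which is why the printed output looks nested.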