Given a bracketed parse, I can convert it into a Tree object in NLTK:
>>> from nltk.tree import Tree
>>> s = '(ROOT (S (NP (NNP Europe)) (VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends)))) (. .)))'
>>> Tree.fromstring(s)
Tree('ROOT', [Tree('S', [Tree('NP', [Tree('NNP', ['Europe'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('PP', [Tree('IN', ['in']), Tree('NP', [Tree('DT', ['the']), Tree('JJ', ['same']), Tree('NNS', ['trends'])])])]), Tree('.', ['.'])])])
But when I try to iterate over it, I can only reach the topmost subtree:
>>> for i in Tree.fromstring(s):
...     print i
...
(S
  (NP (NNP Europe))
  (VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends))))
  (. .))
>>> for i in Tree.fromstring(s):
...     print i, i.label()
...
(S
  (NP (NNP Europe))
  (VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends))))
  (. .)) S
>>>
I can go one level deeper as follows:
>>> for i in Tree.fromstring(s):
...     print i.subtrees()
...
<generator object subtrees at 0x7f1eb1571410>
>>> for i in Tree.fromstring(s):
...     for j in i.subtrees():
...         print j
...
(S
  (NP (NNP Europe))
  (VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends))))
  (. .))
(NP (NNP Europe))
(NNP Europe)
(VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends))))
(VBZ is)
(PP (IN in) (NP (DT the) (JJ same) (NNS trends)))
(IN in)
(NP (DT the) (JJ same) (NNS trends))
(DT the)
(JJ same)
(NNS trends)
(. .)
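But is there a way to traverse all of the subtrees depth-first?
How should one traverse a tree in NLTK?
How can I iterate over all subtrees in NLTK?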
Answer 0 (score: 14)
Maybe I'm overlooking something, but is this what you're after?
import nltk

s = '(ROOT (S (NP (NNP Europe)) (VP (VBZ is) (PP (IN in) (NP (DT the) (JJ same) (NNS trends)))) (. .)))'
tree = nltk.tree.Tree.fromstring(s)

def traverse_tree(tree):
    # print("tree:", tree)  # uncomment to print each subtree as it is visited
    for subtree in tree:
        # Leaves are plain strings; only recurse into nested Tree objects.
        if isinstance(subtree, nltk.tree.Tree):
            traverse_tree(subtree)

traverse_tree(tree)
This traverses your tree depth-first.
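If you want to collect something during the walk rather than just visit each node, the same recursion can be written as a generator. The following is only a sketch: the name `iter_depth_first` is made up for illustration, and it assumes every non-leaf node is an `nltk.tree.Tree` (a list subclass with a `label()` method) while every leaf is a plain string, as is the case for trees built with `Tree.fromstring`.

```python
def iter_depth_first(node, depth=0):
    """Yield (depth, label_or_leaf) pairs in pre-order (parent before children).

    Hypothetical helper, not part of NLTK. Assumes `node` is either an
    nltk.tree.Tree (iterable, with a label() method) or a leaf token such
    as a str, which has no label() method.
    """
    if hasattr(node, "label"):
        # Internal node: report its label, then descend into each child in order.
        yield depth, node.label()
        for child in node:
            yield from iter_depth_first(child, depth + 1)
    else:
        # Leaf: the word itself.
        yield depth, node
```

With the `tree` built above, `for depth, item in iter_depth_first(tree): print("  " * depth + item)` should print every label and word indented by its depth in the tree.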