How do I reformat the output of nltk.parse.stanford in NLTK for 'wordsAndTags,penn,typedDependencies', as in the example below?

Asked: 2017-04-23 23:50:10

Tags: python parsing nltk

In [1]: from nltk.parse.stanford import StanfordParser
In [3]: english_parser = StanfordParser('stanford-parser.jar', 'stanford-parser-3.4-models.jar')
In [4]: english_parser.raw_parse_sents(("this is the english parser test",
    "the parser is from stanford parser"))
Out[4]:
[[u'this/DT is/VBZ the/DT english/JJ parser/NN test/NN'],
[u'(ROOT',
u' (S',
u' (NP (DT this))',
u' (VP (VBZ is)',
u' (NP (DT the) (JJ english) (NN parser) (NN test)))))'],
[u'nsubj(test-6, this-1)',
u'cop(test-6, is-2)',
u'det(test-6, the-3)',
u'amod(test-6, english-4)',
u'nn(test-6, parser-5)',
u'root(ROOT-0, test-6)'],
[u'the/DT parser/NN is/VBZ from/IN stanford/JJ parser/NN'],
[u'(ROOT',
u' (S',
u' (NP (DT the) (NN parser))',
u' (VP (VBZ is)',
u' (PP (IN from)',
u' (NP (JJ stanford) (NN parser))))))'],
[u'det(parser-2, the-1)',
u'nsubj(is-3, parser-2)',
u'root(ROOT-0, is-3)',
u'amod(parser-6, stanford-5)',
u'prep_from(is-3, parser-6)']]
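
For anyone working with a list shaped like Out[4], here is a minimal sketch of one way to regroup it per sentence in plain Python. It assumes the flat layout shown above, with three blocks per sentence in the order wordsAndTags, penn, typedDependencies. The helper names (group_by_sentence, split_tagged, split_dependency) are hypothetical and not part of NLTK, and the sketch does not show how to make StanfordParser emit this format in the first place.

```python
import re

# Hypothetical helpers for illustration; none of these names come from NLTK.
# Assumed block order per sentence, taken from the question title.
FORMATS = ("wordsAndTags", "penn", "typedDependencies")

# Matches lines such as "nsubj(test-6, this-1)".
DEP_RE = re.compile(r"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")


def group_by_sentence(blocks, formats=FORMATS):
    """Split a flat list of blocks into one dict per sentence."""
    step = len(formats)
    if len(blocks) % step != 0:
        raise ValueError("expected %d blocks per sentence" % step)
    return [dict(zip(formats, blocks[i:i + step]))
            for i in range(0, len(blocks), step)]


def split_tagged(line):
    """Turn 'this/DT is/VBZ' into [('this', 'DT'), ('is', 'VBZ')]."""
    return [tuple(token.rsplit("/", 1)) for token in line.split()]


def split_dependency(line):
    """Turn 'cop(test-6, is-2)' into ('cop', ('test', 6), ('is', 2))."""
    match = DEP_RE.match(line)
    if match is None:
        raise ValueError("not a typed dependency: %r" % line)
    rel, gov, gov_idx, dep, dep_idx = match.groups()
    return rel, (gov, int(gov_idx)), (dep, int(dep_idx))


# First sentence of Out[4], copied from the question.
blocks = [
    ["this/DT is/VBZ the/DT english/JJ parser/NN test/NN"],
    ["(ROOT", " (S", " (NP (DT this))", " (VP (VBZ is)",
     " (NP (DT the) (JJ english) (NN parser) (NN test)))))"],
    ["nsubj(test-6, this-1)", "cop(test-6, is-2)", "det(test-6, the-3)",
     "amod(test-6, english-4)", "nn(test-6, parser-5)",
     "root(ROOT-0, test-6)"],
]

sentences = group_by_sentence(blocks)
first = sentences[0]
tagged = split_tagged(first["wordsAndTags"][0])
deps = [split_dependency(line) for line in first["typedDependencies"]]
penn = "\n".join(first["penn"])  # bracketed tree as a single string
```

Keeping each sentence as a dict keyed by format name makes it easy to pick one view (tags, tree, or dependencies) without counting list positions by hand.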

0 answers:

No answers yet.