Question

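I have been thinking about this problem for quite a while, but I have not found a satisfactory solution.

I am currently using the Stanford dependency parser from Python, and the code below gives me this output.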

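from nltk.parse.stanford import StanfordDependencyParser

# path_to_jar and path_to_models_jar point to the local Stanford parser jars
phrase = "If there is a moose in the oven, is there also an elephant?"
dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)
test = dependency_parser.raw_parse(phrase)
dep = next(test)  # next() works on Python 2 and 3; test.next() is Python 2 only

list(dep.triples())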

((u'is', u'VBZ'), u'advcl', (u'is', u'VBZ'))
((u'is', u'VBZ'), u'mark', (u'If', u'IN'))
((u'is', u'VBZ'), u'expl', (u'there', u'EX'))

and so on...

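But what I really need is a representation that includes each word's position in the original sentence, because the final application will handle long sentences in which the same word occurs several times. Something like:

mark(is-3, If-1)

Thanks in advance for any ideas on how to generate this kind of output!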

Answer 1

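If you run the Java server and access it with a Python client, you can get the token indices in the returned JSON.

Here is the information on starting the Java Stanford CoreNLP server: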

https://stanfordnlp.github.io/CoreNLP/corenlp-server.html

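I recommend installing the stanza Python module and using the client it provides for the Stanford CoreNLP server.

Information on installing and using stanza is available at: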

https://github.com/stanfordnlp/stanza

The returned JSON with the dependencies will include the indices of the tokens.
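As a rough sketch of what that could look like, the snippet below turns one sentence of CoreNLP-style JSON into strings such as mark(is-3, If-1). The helper names `format_dependencies` and `fetch_sentences` are made up for illustration, and `SAMPLE` is a hand-written fragment shaped like the server's JSON, not real parser output. `fetch_sentences` assumes stanza is installed and that the `CORENLP_HOME` environment variable points at a CoreNLP installation; it is defined but not called here.

```python
def format_dependencies(sentence, key="basicDependencies"):
    """Return strings like 'mark(is-3, If-1)' for one sentence of CoreNLP JSON.

    Each edge in the JSON carries the relation name ("dep"), the 1-based token
    index of the governor and dependent, and their surface forms ("...Gloss").
    """
    return [
        "{}({}-{}, {}-{})".format(
            edge["dep"],
            edge["governorGloss"], edge["governor"],
            edge["dependentGloss"], edge["dependent"],
        )
        for edge in sentence[key]
    ]


def fetch_sentences(text):
    """Hypothetical helper: annotate text through stanza's CoreNLP client.

    Assumption: stanza is installed and CORENLP_HOME is set. Not called below.
    """
    from stanza.server import CoreNLPClient

    with CoreNLPClient(
        annotators=["tokenize", "ssplit", "pos", "depparse"],
        output_format="json",
        be_quiet=True,
    ) as client:
        return client.annotate(text)["sentences"]


# Hand-written illustration of the JSON shape (not actual parser output).
SAMPLE = {
    "sentences": [
        {
            "basicDependencies": [
                {"dep": "ROOT", "governor": 0, "governorGloss": "ROOT",
                 "dependent": 10, "dependentGloss": "is"},
                {"dep": "advcl", "governor": 10, "governorGloss": "is",
                 "dependent": 3, "dependentGloss": "is"},
                {"dep": "mark", "governor": 3, "governorGloss": "is",
                 "dependent": 1, "dependentGloss": "If"},
                {"dep": "expl", "governor": 3, "governorGloss": "is",
                 "dependent": 2, "dependentGloss": "there"},
            ]
        }
    ]
}

for line in format_dependencies(SAMPLE["sentences"][0]):
    print(line)
```

With a running server, something like `format_dependencies(fetch_sentences(phrase)[0])` should give the same kind of list for the real parse. Because the indices come from the tokenizer, the two occurrences of "is" stay distinguishable (is-3 versus is-10 in the sample).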

