Question

I am using the Stanford CoreNLP jar in a Java Vert.x application. Dependencies:

    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.8.0</version>
    </dependency>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.8.0</version>
        <classifier>models</classifier>
    </dependency>

I am using the following annotators:

props.put("annotators", "tokenize, ssplit, pos, parse, sentiment, lemma, ner");

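I am facing performance issues, so I tried loading the parser model below, because according to the documentation the shift-reduce parser is faster: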

props.setProperty("parse.model","edu/stanford/nlp/models/srparser/englishSR.ser.gz");

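Do I need to add a separate dependency for the model above? I see the jar below mentioned in most places, but I cannot find a compatible or latest version of it, or a Maven dependency for it: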

http://nlp.stanford.edu/software/stanford-srparser-2014-10-23-models.jar

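Please help as soon as possible: I am not sure whether adding the model file englishSR.ser.gz directly to the classpath is a good idea, though that is what I am doing for now.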

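Even after switching to the SR model, I do not see any performance improvement. Could you please advise? I am running the pipeline on text received from a chat bot [I am not trying to process files]. Code snippet: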

Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, parse, sentiment, lemma, ner");  
props.setProperty("parse.model","edu/stanford/nlp/models/srparser/englishSR.ser.gz");
pipeline = new StanfordCoreNLP(props);
pipeline.process("this is my chat bot text");

Answer 1

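Regarding your model question: you can try this workaround. The GitHub version seems to ask for the 3.7.0 models, while the ready-to-use release is 3.8.0. Go to https://github.com/stanfordnlp/CoreNLP, scroll down to "Build with Maven", download the models you need, and include them in your classpath.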

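Regarding performance: consider avoiding the expensive setup by processing more text per call, or consider a server-based approach as described here: https://stanfordnlp.github.io/CoreNLP/corenlp-server.html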
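To illustrate the "avoid the expensive setup" point, here is a minimal sketch of building the pipeline once and reusing it for every chat message, rather than constructing it per request. The `PipelineHolder` class and the word-counting stand-in are hypothetical and only there so the example runs without the CoreNLP jars; in the real application the supplier would presumably be something like `() -> new StanfordCoreNLP(props)`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper: builds an expensive object on first use, then reuses it.
final class PipelineHolder<P> {
    private final Supplier<P> factory;
    private volatile P instance;

    PipelineHolder(Supplier<P> factory) {
        this.factory = factory;
    }

    // Thread-safe lazy initialization (double-checked locking on a volatile field).
    P get() {
        P local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get(); // the costly model loading happens only here
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();

        // Stand-in for the real pipeline (assumption): counts whitespace-separated words.
        // In the application this would be: () -> new StanfordCoreNLP(props)
        PipelineHolder<Function<String, Integer>> holder = new PipelineHolder<>(() -> {
            builds.incrementAndGet();
            return text -> text.split("\\s+").length;
        });

        String[] messages = {"this is my chat bot text", "hello"};
        for (String message : messages) {
            // Every message goes through the same, already-built instance.
            System.out.println(holder.get().apply(message));
        }
        System.out.println("pipeline builds: " + builds.get());
    }
}
```

The holder would typically be created once at application startup and shared by all request handlers, so the model loading cost is paid a single time rather than per message.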

Parser is slow, need stanford-srparser models.jar pom entry
