I have downloaded and installed the required jar files from http://nlp.stanford.edu/software/corenlp.shtml#Download.
I have included five jar files:
stanford-postagger.jar
stanford-postagger-3.3.1.jar
stanford-postagger-3.3.1.jar-javadoc.jar
stanford-postagger-3.3.1.jar-src.jar
stanford-corenlp-3.3.1.jar
and the code is:
import java.util.Properties;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class lemmafirst {
protected StanfordCoreNLP pipeline;
public lemmafirst() {
// Create StanfordCoreNLP object properties, with POS tagging
// (required for lemmatization), and lemmatization
Properties props;
props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma");
/*
* This is a pipeline that takes in a string and returns various analyzed linguistic forms.
* The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator),
* and then other sequence model style annotation can be used to add things like lemmas,
* POS tags, and named entities. These are returned as a list of CoreLabels.
* Other analysis components build and store parse trees, dependency graphs, etc.
*
* This class is designed to apply multiple Annotators to an Annotation.
* The idea is that you first build up the pipeline by adding Annotators,
* and then you take the objects you wish to annotate and pass them in and
* get in return a fully annotated object.
*
* StanfordCoreNLP loads a lot of models, so you probably
* only want to do this once per execution
*/
this.pipeline = new StanfordCoreNLP(props); // the exception is thrown on this line
}
// ... (rest of the class, including main, not shown)
}
My problem is in creating the pipeline.
The error I get is:
Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:563)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:262)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at lemmafirst.<init>(lemmafirst.java:39)
at lemmafirst.main(lemmafirst.java:83)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:561)
... 6 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
... 11 more
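Can anyone please help me fix this error? Thanks.
Answer 0 (score: 14)
The exception is thrown because the POS tagger model is missing. The downloads come in two variants, with and without the model files, and the jars listed in the question do not contain the models.
Either add stanford-postagger-full-3.3.1.jar, which is available on the following page (stanford-postagger-full-2014-01-04.zip): http://nlp.stanford.edu/software/tagger.shtml
Or do the same for the entire CoreNLP package (stanford-corenlp-full-....jar): http://nlp.stanford.edu/software/corenlp.shtml (you can then also drop all the postagger dependencies, since they are included in CoreNLP).
If you only want to add the model files, look on Maven Central and download "stanford-corenlp-3.3.1-models.jar".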
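To confirm whether a models jar is actually visible to your program before building the pipeline, a small check like the following may help. This is only a sketch: the class name ModelCheck and the method locate are made up for illustration, and the resource path is copied from the exception message in the question. It only tests the classpath; the error message indicates CoreNLP also tries the path as a filename and as a URL, which this sketch does not cover.

```java
import java.net.URL;

public class ModelCheck {
    // Resource path copied from the exception message in the question.
    static final String POS_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns the URL of the resource on the classpath, or null if it is absent. */
    static URL locate(String resource) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resource);
    }

    public static void main(String[] args) {
        URL url = locate(POS_MODEL);
        if (url == null) {
            System.out.println("POS model not found on the classpath; add a models jar.");
        } else {
            System.out.println("POS model found at " + url);
        }
    }
}
```

If the model is found, the printed URL also shows which jar it was loaded from, which is useful when several CoreNLP or tagger jars are on the classpath.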
Answer 1 (score: 7)
An easier way to add these model files is to put the following dependencies in your pom.xml and let Maven manage them for you:
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.6.0</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.6.0</version>
<classifier>models</classifier> <!-- will get the dependent model jars -->
</dependency>
Answer 2 (score: 0)
If anyone is looking for the Gradle dependencies, add the following under dependencies.
compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1'
compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1', classifier: 'models'