斯坦福CoreNLP - egw4-reut.512.clusters无法找到

时间:2015-01-23 17:15:07

标签: java stanford-nlp

我正在使用CoreNLP包对用户注释做一些注释,因为我已经升级到3.5.0版本,我似乎反复遇到同样的错误:

  

从edu / stanford / nlp / models / ner / english.all.3class.distsim.crf.ser.gz加载分类器......

     

从/u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters ...

加载distsim词典      

java.lang.RuntimeException:java.io.FileNotFoundException:\ u \ nlp \ data \ pos_tags_are_useless \ egw4-reut.512.clusters(系统找不到指定的路径)       在edu.stanford.nlp.objectbank.ReaderIteratorFactory $ ReaderIterator.setNextObject(ReaderIteratorFactory.java:225)(提示五十行错误)

这里的一些搜索给我带来了类似的问题:

Stanford NER Error: Loading distsim lexicon FailedStanford NER tagger generates 'file not found' exception with provided models没有解决我的问题:我只使用3.5.0中的代码和模型(通过Maven Central)。我尝试从NER模型修改props文件并指向用户目录中的另一个.clusters文件但没有成功(完全相同的错误)。

我用来实例化CoreNLP对象的代码也非常标准,但它是:

    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    stan = new StanfordCoreNLP(props);

现在我觉得有一些显而易见的东西让我失踪了。任何帮助将不胜感激。

更完整的堆栈跟踪(如果可以提供帮助)如下:

Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Loading distsim lexicon from /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters ... java.lang.RuntimeException: java.io.FileNotFoundException: \u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters (The system cannot find the path specified)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:225)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.<init>(ReaderIteratorFactory.java:161)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory.iterator(ReaderIteratorFactory.java:98)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:404)
at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:242)
at edu.stanford.nlp.ie.NERFeatureFactory.initLexicon(NERFeatureFactory.java:471)
at edu.stanford.nlp.ie.NERFeatureFactory.init(NERFeatureFactory.java:379)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.reinit(AbstractSequenceClassifier.java:171)
at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2630)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1620)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1675)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1662)
at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2851)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:189)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:64)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$6.create(StanfordCoreNLP.java:617)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at rgu.jclos.quilt.utilities.nlp.DependenciesTagger.<init>(DependenciesTagger.java:99)
at rgu.jclos.quilt.eca.approaches.ApproachC_USS.main(ApproachC_USS.java:47)
Caused by: java.io.FileNotFoundException: \u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters (The system cannot find the path specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:131)
    at edu.stanford.nlp.io.EncodingFileReader.<init>(EncodingFileReader.java:78)
    at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:192)
    ... 23 more
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.FileNotFoundException
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$6.create(StanfordCoreNLP.java:621)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:267)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
    at rgu.jclos.quilt.utilities.nlp.DependenciesTagger.<init>(DependenciesTagger.java:99)
    at rgu.jclos.quilt.eca.approaches.ApproachC_USS.main(ApproachC_USS.java:47)
Caused by: java.io.FileNotFoundException
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:199)
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
    at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
    at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:64)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$6.create(StanfordCoreNLP.java:617)
    ... 6 more
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to edu.stanford.nlp.classify.LinearClassifier
at edu.stanford.nlp.ie.ner.CMMClassifier.loadClassifier(CMMClassifier.java:1070)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1620)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1675)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1662)
at edu.stanford.nlp.ie.ner.CMMClassifier.getClassifier(CMMClassifier.java:1116)
at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:195)
... 10 more

1 个答案:

答案 0 :(得分:0)

事实证明问题是由Maven造成的。

我的代码位于一个实用程序库中,该实用程序库包裹着斯坦福CoreNLP以提供额外的处理,并且它本身运行良好。但是,当将此项目作为依赖项添加到我的主项目时,Maven默认将Stanford CoreNLP库模型的3.4版本导入主项目,这导致了上述错误。