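I am trying to train my own Chinese NER model with CoreNLP, but when I run the training command I get a FileNotFoundException. I have seen posts saying this error was fixed in CoreNLP 3.5.0, but I am using 4.1.0 and it still occurs.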
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.FileNotFoundException: \u\nlp\data\chinese\distsim\xin_cmn_200907-201012.ldc.seg.utf8.c1000 (The system cannot find the path specified)
at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:523)
at edu.stanford.nlp.io.IOUtils.readerFromFile(IOUtils.java:558)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:189)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.<init>(ReaderIteratorFactory.java:161)
at edu.stanford.nlp.objectbank.ReaderIteratorFactory.iterator(ReaderIteratorFactory.java:98)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:411)
at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:250)
at edu.stanford.nlp.ie.NERFeatureFactory.initLexicon(NERFeatureFactory.java:588)
at edu.stanford.nlp.ie.NERFeatureFactory.init(NERFeatureFactory.java:389)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.reinit(AbstractSequenceClassifier.java:210)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.<init>(AbstractSequenceClassifier.java:190)
at edu.stanford.nlp.ie.crf.CRFClassifier.<init>(CRFClassifier.java:181)
at edu.stanford.nlp.ie.crf.CRFClassifier.chooseCRFClassifier(CRFClassifier.java:2919)
at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:2930)
Caused by: java.io.FileNotFoundException: \u\nlp\data\chinese\distsim\xin_cmn_200907-201012.ldc.seg.utf8.c1000 (The system cannot find the path specified)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:212)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:154)
at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:516)
... 13 more
Answer 0 (score: 0)
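That file is not publicly distributed by us. I can ask whether we are allowed to share it. In the meantime, to avoid this error you need to change your training properties to set useDistSim = false, so that NERFeatureFactory.initLexicon (the frame in your stack trace that tries to open the file) no longer looks for it.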
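If you would rather not edit the properties file by hand, the change can be scripted. The following is only a minimal sketch using java.util.Properties from the standard library; the class name DisableDistSim, the method withoutDistSim, and the input/output file names are made up for illustration and are not part of CoreNLP. It assumes your training properties file is a standard Java properties file that may contain a useDistSim entry.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper (not part of CoreNLP): copies a training properties
// file and forces useDistSim to false in the copy.
public class DisableDistSim {

    // Returns a copy of the given properties with useDistSim set to false.
    // The original object is left unchanged.
    static Properties withoutDistSim(Properties original) {
        Properties patched = new Properties();
        patched.putAll(original);
        patched.setProperty("useDistSim", "false");
        return patched;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DisableDistSim <input.prop> <output.prop>");
            return;
        }

        // Load the existing training properties.
        Properties original = new Properties();
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            original.load(in);
        }

        // Write the patched copy; all other settings are carried over as-is.
        Properties patched = withoutDistSim(original);
        try (Writer out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            patched.store(out, "useDistSim disabled: distsim lexicon file is not available");
        }
    }
}
```

Then point your usual training command at the patched file instead of the original one. Note that Properties.store does not preserve comments or the original key order, so editing the single useDistSim line by hand is just as good if you want to keep the file readable.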