I am trying to use the statistical coreference system to process a text file with the following command:
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,cleanxml,ssplit,pos,lemma,ner,parse,coref -file input.txt
This throws the following error message:
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator cleanxml
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.9 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.8 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.8 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [1.0 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
done [0.4 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
Processing file /home/xilin/Toolkits/stanford-corenlp-full-2015-12-09/input.txt ... writing to /home/xilin/Toolkits/stanford-corenlp-full-2015-12-09/input.txt.out
Annotating file /home/xilin/Toolkits/stanford-corenlp-full-2015-12-09/input.txt
Exception in thread "main" java.lang.RuntimeException: Error annotating document with coref
at edu.stanford.nlp.scoref.StatisticalCorefSystem.annotate(StatisticalCorefSystem.java:86)
at edu.stanford.nlp.scoref.StatisticalCorefSystem.annotate(StatisticalCorefSystem.java:63)
at edu.stanford.nlp.pipeline.CorefAnnotator.annotate(CorefAnnotator.java:97)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:72)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:534)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:544)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1098)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:877)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.run(StanfordCoreNLP.java:1187)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1257)
Caused by: java.lang.NullPointerException
at edu.stanford.nlp.hcoref.Preprocessor.assignMentionIDs(Preprocessor.java:170)
at edu.stanford.nlp.hcoref.Preprocessor.initializeMentions(Preprocessor.java:153)
at edu.stanford.nlp.hcoref.Preprocessor.preprocess(Preprocessor.java:64)
at edu.stanford.nlp.hcoref.CorefDocMaker.makeDocument(CorefDocMaker.java:194)
at edu.stanford.nlp.hcoref.CorefDocMaker.makeDocument(CorefDocMaker.java:154)
at edu.stanford.nlp.scoref.StatisticalCorefSystem.annotate(StatisticalCorefSystem.java:68)
... 9 more
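If I change the option "coref" to "dcoref" in the command above, the deterministic coreference system runs without problems. Others have pointed out that this is a bug in version 3.6.0. I am building from the GitHub repository and am on the latest version, but the bug still seems to be there.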
Answer 0: (score: 4)
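You need to include the mention annotator before the coref annotator. The fact that this shows up as a null pointer exception is indeed a bug. Which Git revision are you using? We recently changed the way requirements are handled, so this may be a leftover bug.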
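As a minimal sketch of the change suggested in this answer, the command from the question can be rerun with mention placed directly before coref in the annotator list. This assumes everything else stays as in the question (same classpath, same input.txt) and that the build in use provides the mention annotator; it has not been checked against any particular Git revision.

```shell
# Same command as in the question, with "mention" added immediately before "coref".
# Assumes it is run from the CoreNLP directory so that -cp "*" picks up the jars.
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP \
  -annotators tokenize,cleanxml,ssplit,pos,lemma,ner,parse,mention,coref \
  -file input.txt
```

If the NullPointerException still appears with this annotator list, that would point to the leftover bug mentioned above rather than to a missing annotator.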