I am trying to use Stanford NLP's SUTime through CoreNLP with the following code:
AnnotationPipeline pipeline = new AnnotationPipeline();
pipeline.addAnnotator(new TimeAnnotator("sutime", props));
Annotation annotation = new Annotation("The interesting date is 4 days from today and it is 20th july of this year, another date is 18th Feb 1997");
annotation.set(CoreAnnotations.DocDateAnnotation.class, "2013-07-14");
pipeline.annotate(annotation);
List<CoreMap> timexAnnsAll = annotation.get(TimeAnnotations.TimexAnnotations.class);
However, running it throws this exception:
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
Exception in thread "main" java.lang.NullPointerException
at edu.stanford.nlp.ie.NumberNormalizer.findNumbers(NumberNormalizer.java:423)
at edu.stanford.nlp.ie.NumberNormalizer.findAndMergeNumbers(NumberNormalizer.java:721)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressions(TimeExpressionExtractorImpl.java:184)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressions(TimeExpressionExtractorImpl.java:178)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressionCoreMaps(TimeExpressionExtractorImpl.java:116)
at edu.stanford.nlp.time.TimeAnnotator.annotateSingleSentence(TimeAnnotator.java:240)
at edu.stanford.nlp.time.TimeAnnotator.annotate(TimeAnnotator.java:226)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:68)
at dd.stanford.main(stanford.java:74)
I am using version 3.5.2 from Maven. Does anyone know what causes this exception? Thanks in advance.
Answer 0 (score: 0)
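It crashes because the text was never tokenized. TimeAnnotator works on the tokens of the annotation, and with no tokenizer in the pipeline there are none, so NumberNormalizer.findNumbers hits a null.
Add a tokenizer before the TimeAnnotator by changing your code to: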
AnnotationPipeline pipeline = new AnnotationPipeline();
pipeline.addAnnotator(new TokenizerAnnotator(false));
pipeline.addAnnotator(new TimeAnnotator("sutime", props));
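For reference, here is a fuller sketch of how the whole thing might look end to end. It is not taken from the question: the class name SUTimeSketch and the helper extractTimexes are made up for illustration, the empty Properties object stands in for the undefined props in the question, and the sentence splitter and POS tagger are extra annotators that, as far as I recall, the SUTime demo bundled with CoreNLP also adds (the tokenizer alone is what fixes the NullPointerException). It assumes the CoreNLP jar and its models jar are on the classpath.

```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.AnnotationPipeline;
import edu.stanford.nlp.pipeline.POSTaggerAnnotator;
import edu.stanford.nlp.pipeline.TokenizerAnnotator;
import edu.stanford.nlp.pipeline.WordsToSentencesAnnotator;
import edu.stanford.nlp.time.TimeAnnotations;
import edu.stanford.nlp.time.TimeAnnotator;
import edu.stanford.nlp.time.TimeExpression;
import edu.stanford.nlp.util.CoreMap;

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical class name, used only for this sketch.
public class SUTimeSketch {

    // Hypothetical helper: runs SUTime on `text`, resolving relative dates against `docDate`.
    static List<String> extractTimexes(String text, String docDate) {
        // Assumption: default SUTime settings, hence an empty Properties object.
        Properties props = new Properties();

        AnnotationPipeline pipeline = new AnnotationPipeline();
        // The tokenizer must run first; without it TimeAnnotator has no tokens to read.
        pipeline.addAnnotator(new TokenizerAnnotator(false));
        // Extra annotators beyond the minimal fix (sentence splitting and POS tags).
        pipeline.addAnnotator(new WordsToSentencesAnnotator(false));
        pipeline.addAnnotator(new POSTaggerAnnotator(false));
        pipeline.addAnnotator(new TimeAnnotator("sutime", props));

        Annotation annotation = new Annotation(text);
        // Reference date for expressions such as "4 days from today".
        annotation.set(CoreAnnotations.DocDateAnnotation.class, docDate);
        pipeline.annotate(annotation);

        List<String> results = new ArrayList<>();
        List<CoreMap> timexAnnsAll = annotation.get(TimeAnnotations.TimexAnnotations.class);
        if (timexAnnsAll == null) {
            return results; // nothing was annotated
        }
        for (CoreMap cm : timexAnnsAll) {
            // Matched text followed by the resolved temporal value.
            results.add(cm + " --> " + cm.get(TimeExpression.Annotation.class).getTemporal());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "The interesting date is 4 days from today and it is 20th july of this year, "
                + "another date is 18th Feb 1997";
        for (String line : extractTimexes(text, "2013-07-14")) {
            System.out.println(line);
        }
    }
}
```

If you only want the minimal change, drop the WordsToSentencesAnnotator and POSTaggerAnnotator lines and keep the tokenizer, as in the three lines above.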