问题:
当我从eclipse中的Stanford CoreNLP教程中粘贴此示例代码时,我遇到了一堆错误 - 这段代码来自初学者的Stanford CoreNLP教程(`https://stanfordnlp.github.io/CoreNLP/api.html)。我不确定是什么问题 - 我导入了其他教程中提到的外部JAR文件,但我仍然遇到错误。 :
import edu.stanford.nlp.coref.data.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.ie.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.trees.*;
import java.util.*;
public class BasicPipelineExample{
public static String text = "Joe Smith was born in California. " +
"In 2017, he went to Paris, France in the summer. " +
"His flight left at 3:00pm on July 10th, 2017. " +
"After eating some escargot for the first time, Joe said,
\"That was delicious!\" " +
"He sent a postcard to his sister Jane Smith. " +
"After hearing about Joe's trip, Jane decided she might go to France one day.";
public static void main(String[] DEEPANSHA) throws InterruptedException
{
//set up pipeline properties
Properties props = new Properties();
// set the list of annotators to run
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,depparse,coref,kbp,quote");
//set a property for an annotator, in this case the coref annotator is being set to use the neural algorithm
props.setProperty("coref.algorithm", "neural");
// build pipeline
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// create a document object
CoreDocument document = new CoreDocument(text);
// annnotate the document
pipeline.annotate(document);
// text of the first sentence
String sentenceText = document.sentences().get(0).text();
System.out.println("Example: sentence");
System.out.println(sentenceText);
System.out.println();
}
}
显示的错误:
Adding annotator tokenize
No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator pos
Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [2.2 sec].
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [5.6 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.3 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [3.1 sec].
TokensRegexNERAnnotator ner.fine.regexner: Read 580641 unique entries out of 581790 from edu/stanford/nlp/models/kbp/regexner_caseless.tab, 0 TokensRegex patterns.
TokensRegexNERAnnotator ner.fine.regexner: Read 4857 unique entries out of 4868 from edu/stanford/nlp/models/kbp/regexner_cased.tab, 0 TokensRegex patterns.
TokensRegexNERAnnotator ner.fine.regexner: Read 585498 unique entries from 2 files
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser.<init>(TokenSequenceParser.java:3446)
at edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.getNewEnv(TokenSequencePattern.java:158)
at edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.createPatternMatcher(TokensRegexNERAnnotator.java:343)
at edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.<init>(TokensRegexNERAnnotator.java:295)
at edu.stanford.nlp.pipeline.NERCombinerAnnotator.setUpFineGrainedNER(NERCombinerAnnotator.java:209)
at edu.stanford.nlp.pipeline.NERCombinerAnnotator.<init>(NERCombinerAnnotator.java:152)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:68)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$44(StanfordCoreNLP.java:546)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$14/501263526.apply(Unknown Source)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$69(StanfordCoreNLP.java:625)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$36/277630005.get(Unknown Source)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:495)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:201)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:194)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:181)
at BasicPipelineExample.main(BasicPipelineExample.java:29)
答案 0 :(得分:0)
将虚拟机堆大小增加到3或4GB,以获取完整的管道(pos,depparse,ner)。 在那之后它应该工作。 https://wiki.eclipse.org/FAQ_How_do_I_increase_the_heap_size_available_to_Eclipse%3F