I am trying to use CoreNLP OpenIE with other human languages, such as Chinese (the default English models work fine; the Chinese models are the latest version downloaded from the StanfordNLP page). My code is below (adapted from the example on the Stanford NLP documentation page):
import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Collection;
import java.util.Properties;

public class Test {
    public static void main(String[] argv) {
        try {
            // Create the Stanford CoreNLP pipeline
            Properties props = new Properties();
            // Load the Chinese properties file from a stream
            InputStream in = new FileInputStream(new File("/properties/chinese.properties"));
            props.load(in);
            in.close();
            // Initialize the pipeline (the problem occurs here)
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
            System.out.println("Loaded successfully!");
            // Annotate an example document.
            //Annotation doc = new Annotation("Obama was born in Hawaii. He is our president.");
            Annotation doc = new Annotation("克林顿说,华盛顿将逐步落实对韩国的经济援助。");
            pipeline.annotate(doc);
            for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
                // Get the OpenIE triples for the sentence
                Collection<RelationTriple> triples =
                        sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
                // Print the triples
                for (RelationTriple triple : triples) {
                    System.out.println(triple.confidence + "\t" +
                            triple.subjectLemmaGloss() + "\t" +
                            triple.relationLemmaGloss() + "\t" +
                            triple.objectLemmaGloss());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
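For reference, the chinese.properties I pass in follows the StanfordCoreNLP-chinese.properties bundled in the Chinese models jar, with the natlog and openie annotators appended. Roughly the following (a sketch; the model paths mirror the bundled defaults and may differ by version):

# Sketch of /properties/chinese.properties, based on the bundled
# StanfordCoreNLP-chinese.properties; exact paths/values are assumptions.
annotators = tokenize, ssplit, pos, lemma, ner, depparse, natlog, openie
tokenize.language = zh
segment.model = edu/stanford/nlp/models/segmenter/chinese/ctb.gz
segment.sighanCorporaDict = edu/stanford/nlp/models/segmenter/chinese
segment.serDictionary = edu/stanford/nlp/models/segmenter/chinese/dict-chris6.ser.gz
pos.model = edu/stanford/nlp/models/pos-tagger/chinese-distsim/chinese-distsim.tagger
ner.model = edu/stanford/nlp/models/ner/chinese.misc.distsim.crf.ser.gz
depparse.model = edu/stanford/nlp/models/parser/nndep/UD_Chinese.gz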
But when I initialize the pipeline, an out-of-memory exception is thrown, as shown below:
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator openie
[main] INFO edu.stanford.nlp.naturalli.ClauseSplitter - Loading clause splitter from edu/stanford/nlp/models/naturalli/clauseSearcherModel.ser.gz ... done [0.0073 seconds]
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1875)
at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.lang.Double.parseDouble(Double.java:538)
at edu.stanford.nlp.naturalli.NaturalLogicWeights.<init>(NaturalLogicWeights.java:59)
at edu.stanford.nlp.naturalli.OpenIE.<init>(OpenIE.java:210)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.openie(AnnotatorImplementations.java:260)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$59(StanfordCoreNLP.java:513)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$31/1156060786.apply(Unknown Source)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(StanfordCoreNLP.java:533)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$37/2101842856.get(Unknown Source)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133)
Following answers to other questions, I tried increasing the JVM's runtime memory (from 2 GB up to 4 GB), but it had no effect. How can I solve this problem? What confuses me is why it happens when I switch to the Chinese language model while the default English one works fine. Thanks, guys.
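For completeness, this is how I raised the heap when launching the program (standard -Xmx flag; the jar names on the classpath are placeholders for my actual jars):

# Launch with a larger heap; classpath entries are placeholders
java -Xmx4g -cp "stanford-corenlp.jar:stanford-chinese-corenlp-models.jar:." Test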