I am running CoreNLP 3.8.0's neural dependency parser at scale with Apache Spark. The CoreNLP pipeline I use for parsing is the following:
import java.util.Properties;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public static StanfordCoreNLP StanfordDepNNParser() {
    Properties props = new Properties();
    props.setProperty("language", "english");
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, depparse");
    props.setProperty("depparse.model", "edu/stanford/nlp/models/parser/nndep/english_SD.gz");
    // pass the flag as a String: non-String values put into Properties are ignored by getProperty
    props.setProperty("parse.originalDependencies", "true");
    return new StanfordCoreNLP(props);
}
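For context, this is roughly how the pipeline is driven from Spark. The sketch below is simplified: the method name parseAll, the input RDD docs, and the per-document output are placeholders rather than my exact code.

import java.util.ArrayList;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

// Simplified sketch: build the pipeline once per partition so the models are not
// reloaded for every document, then annotate each document in that partition.
static JavaRDD<String> parseAll(JavaRDD<String> docs) {
    return docs.mapPartitions(iter -> {
        StanfordCoreNLP pipeline = StanfordDepNNParser();
        List<String> out = new ArrayList<>();
        while (iter.hasNext()) {
            Annotation ann = new Annotation(iter.next());
            pipeline.annotate(ann);
            out.add(ann.toString()); // placeholder for the real per-document output
        }
        return out.iterator();
    });
}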
However, after a while I get the following error:
18/06/08 08:16:40 WARN scheduler.TaskSetManager: Lost task 136.0 in stage 0.0 (TID 140, server.com): java.lang.StackOverflowError
at java.util.HashMap.putVal(HashMap.java:630)
at java.util.HashMap.put(HashMap.java:611)
at java.util.HashSet.add(HashSet.java:219)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1526)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
at edu.stanford.nlp.semgraph.SemanticGraph.toCompactStringHelper(SemanticGraph.java:1541)
. . . .
Unfortunately, I have not been able to pin down the exact sentence (or document) that triggers this exception. Is it possible to run some kind of check up front and simply skip that sentence?
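To make the question concrete, the kind of pre-check I have in mind looks roughly like the sketch below. The helper name looksSafeToParse and the MAX_TOKENS threshold are made up for illustration, and I do not actually know whether sentence length is what triggers the overflow.

import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

// Cheap tokenize/ssplit-only pipeline, built once, used only for the pre-check.
private static final StanfordCoreNLP LIGHT_PIPELINE;
static {
    Properties p = new Properties();
    p.setProperty("annotators", "tokenize, ssplit");
    LIGHT_PIPELINE = new StanfordCoreNLP(p);
}

// Arbitrary guess at a threshold; I don't know if sentence length is really the cause.
private static final int MAX_TOKENS = 200;

static boolean looksSafeToParse(String text) {
    Annotation ann = new Annotation(text);
    LIGHT_PIPELINE.annotate(ann);
    for (CoreMap sentence : ann.get(CoreAnnotations.SentencesAnnotation.class)) {
        if (sentence.get(CoreAnnotations.TokensAnnotation.class).size() > MAX_TOKENS) {
            return false; // skip documents containing very long sentences
        }
    }
    return true;
}

Is a length filter like this a reasonable safeguard, or is there a more reliable way to detect (or skip) the offending input before depparse runs?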