Question

I am using Stanford's core-nlp pipeline to perform some basic tasks. Below is a copy of the sample code from the tutorial.

public static void testcoreNLP(String inputText) throws IOException {

  Properties props = new Properties();
  props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
  props.put("coref.md.type", "rule");
  props.put("coref.mode", "statistical");
  props.put("coref.doClustering", "true");
  props.put("ner.model", "edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz,edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz");

  // Constructing the pipeline is the step that loads the model files.
  StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
  Annotation document = new Annotation(inputText);

  pipeline.annotate(document);
  List<CoreMap> sentences = document.get(SentencesAnnotation.class);

  for (CoreMap sentence : sentences) {
    for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
      String word = token.get(TextAnnotation.class);
      String pos = token.get(PartOfSpeechAnnotation.class);
      String ne = token.get(NamedEntityTagAnnotation.class);
      System.out.println("word: " + word + " pos: " + pos + " ne:" + ne);
    }
  }
}

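My method testcoreNLP (which takes a String inputText) is called from inside a for loop in another method (textPreprocessor(), which preprocesses the text).

As far as I understand, every call to testcoreNLP loads all the model files (some of them domain-specific), which takes about 3-5 seconds per run.

How can I separate model loading from the runtime?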

Answer 1

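Your Java application only needs to build the pipeline once. You can move the pipeline-loading code out of the method and execute it a single time.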
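As a rough illustration of that idea, here is a minimal sketch that runs without CoreNLP on the classpath. FakePipeline and process are hypothetical names made up for this example; FakePipeline only stands in for StanfordCoreNLP so the number of constructions can be counted. In the real application the shared field would presumably hold the result of new StanfordCoreNLP(props), and process would call pipeline.annotate(document) as in the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PipelineReuseSketch {

    // Hypothetical stand-in for StanfordCoreNLP (not a CoreNLP class).
    // Its constructor plays the role of the slow model-loading step.
    static class FakePipeline {
        static final AtomicInteger LOADS = new AtomicInteger();

        FakePipeline() {
            // The real pipeline would read its model files here.
            LOADS.incrementAndGet();
        }

        String annotate(String text) {
            return "annotated: " + text;
        }
    }

    // Built once, when this class is initialized, and shared by every call.
    private static final FakePipeline PIPELINE = new FakePipeline();

    // Counterpart of testcoreNLP: it only runs the pipeline and never builds one.
    static String process(String inputText) {
        return PIPELINE.annotate(inputText);
    }

    public static void main(String[] args) {
        String[] texts = {"first document", "second document", "third document"};
        // Counterpart of the loop in textPreprocessor().
        for (String text : texts) {
            System.out.println(process(text));
        }
        System.out.println("pipeline constructions: " + FakePipeline.LOADS.get());
    }
}
```

However many times the loop calls process, the construction counter stays at one, because the constructor runs only when the class is first initialized. Passing the pipeline into the method as a parameter would work just as well if a static field is not wanted.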

Splitting runtime from configuration mode in core-nlp
