Question

I'm new to Stanford CoreNLP. I want to use it to split English, German, and French text into sentences. Which class does this? Thanks in advance.

Answer 1

For lower-level classes that handle this, you can look at the tokenizer documentation. At the CoreNLP level, you can just use the annotators "tokenize, ssplit".

Answer 2

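Have you looked at the documentation on the main Stanford NLP page? About halfway down the page, it gives an example that is almost exactly what you're looking for. That example splits the text not only into sentences but also into words.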

Answer 3

Why not use BreakIterator from the java.text package? It can split text into sentences, lines, words, characters, and so on.

See this link:

http://docs.oracle.com/javase/6/docs/api/java/text/BreakIterator.html
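A minimal sketch of that approach, using only the standard library. The class name SentenceSplitter and the method split are made up for illustration; the locale argument is how you would switch between English, German, and French rules. Note that BreakIterator is rule-based, so it may handle abbreviations and unusual punctuation less well than a trained splitter.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
public class SentenceSplitter {

    // Splits text into sentences using the boundary rules for the given locale.
    public static List<String> split(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = iterator.first();
        // Walk from boundary to boundary until the iterator reports DONE.
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            // Each segment includes trailing whitespace, so trim it.
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String s : split("Hello world. How are you?", Locale.ENGLISH)) {
            System.out.println(s);
        }
        for (String s : split("Das ist ein Satz. Hier ist noch einer.", Locale.GERMAN)) {
            System.out.println(s);
        }
    }
}
```

For French text you would pass Locale.FRENCH in the same way.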

Answer 4

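    import java.util.List;
    import java.util.Properties;
    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.util.CoreMap;

    // Only tokenize and ssplit are needed to split sentences.
    Properties properties = new Properties();
    properties.setProperty("annotators", "tokenize, ssplit");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(properties);
    // SENTENCES is a String constant that holds the text to split.
    List<CoreMap> sentences = pipeline.process(SENTENCES)
            .get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        System.out.println(sentence.toString());
    }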

Stanford CoreNLP: splitting sentences from text

4 answers: