How do I use the Stanford NLP Chinese segmenter in Java?

Time: 2016-07-15 09:30:50

Tags: stanford-nlp, text-segmentation

I tried the following code, but it does not work and only prints null:

String text = "我爱北京天安门。";
StanfordCoreNLP pipeline = new StanfordCoreNLP();
Annotation annotation = pipeline.process(text);
String result = annotation.get(CoreAnnotations.ChineseSegAnnotation.class);
System.out.println(result);

Output:

...
done [0.6 sec].
Using mention detector type: rule
null

How do I use the Stanford NLP Chinese segmenter correctly?

1 Answer:

Answer 0 (score: 0)

Some sample code:

import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.StringUtils;

import java.util.*;

public class ChineseSegmenter {

    public static void main (String[] args) {
        // set the properties to the standard Chinese pipeline properties
        Properties props = StringUtils.argsToProperties("-props", "StanfordCoreNLP-chinese.properties");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        String text = "...";
        Annotation annotation = new Annotation(text);
        pipeline.annotate(annotation);
        List<CoreLabel> tokens = annotation.get(CoreAnnotations.TokensAnnotation.class);
        for (CoreLabel token : tokens)
            System.out.println(token);
    }
}

Note: make sure the Chinese models jar is on the CLASSPATH. It can be downloaded from: http://stanfordnlp.github.io/CoreNLP/download.html
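
If you are unsure whether the models jar is actually visible, a quick sanity check is to try loading StanfordCoreNLP-chinese.properties as a classpath resource. This is a sketch based on the assumption that the Chinese models jar ships that properties file at its root; it is not an official API:

import java.io.InputStream;

public class CheckChineseModels {

    public static void main (String[] args) {
        // assumption: the Chinese models jar contains StanfordCoreNLP-chinese.properties
        // at its root, so finding it means the jar is on the classpath
        InputStream in = CheckChineseModels.class.getClassLoader()
                .getResourceAsStream("StanfordCoreNLP-chinese.properties");
        System.out.println(in != null
                ? "Chinese models jar found on the classpath"
                : "Chinese models jar NOT found on the classpath");
    }
}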

The ChineseSegmenter example above should print the tokens created once Chinese segmentation has run.
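
If what you want is the segmented text itself rather than the full token objects, a minimal sketch along the same lines (reusing the sentence from the question and assuming the same StanfordCoreNLP-chinese.properties setup) is to walk the sentences and join each sentence's token.word() values:

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.StringUtils;

import java.util.*;

public class ChineseSegmenterWords {

    public static void main (String[] args) {
        // same Chinese pipeline setup as the answer above
        Properties props = StringUtils.argsToProperties("-props", "StanfordCoreNLP-chinese.properties");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // the sentence from the question
        Annotation annotation = new Annotation("我爱北京天安门。");
        pipeline.annotate(annotation);
        // print each sentence as space-separated segmented words
        for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
            List<String> words = new ArrayList<>();
            for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                words.add(token.word());
            }
            System.out.println(String.join(" ", words));
        }
    }
}

The exact word boundaries depend on the segmenter model that ships with your CoreNLP version, so treat the printed output as illustrative.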