stanford-nlp - How to write tokenization results to a txt file using the API - Thinbug

How to write tokenization results to a txt file using the API

Time: 2015-11-22 11:59:09

Tags: stanford-nlp sentiment-analysis

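I have found some code that displays the results, but I don't know how to write them to a txt file, because the HasWord type cannot be written to a file or converted to a String.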

DocumentPreprocessor dp = new DocumentPreprocessor("myorigin.txt");
for (List<HasWord> sentence : dp) {
    // Print every token of the sentence, each followed by '|'
    for (HasWord token : sentence) {
        System.out.print(token);
        System.out.print('|');
    }
    // End the line after each sentence
    System.out.println();
}

1 Answer:

Answer 0 (score: 0)

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class TokensWriterExample {

    public static void main(String[] args) throws IOException {
        // Read the whole input file (path given as the first argument) as UTF-8 text
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Annotation document = new Annotation(text);
        // Configure the pipeline to run only the tokenizer
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        pipeline.annotate(document);
        // try-with-resources closes the writer even if writing fails
        try (PrintWriter writer = new PrintWriter("output_file.txt", "UTF-8")) {
            for (CoreLabel cl : document.get(CoreAnnotations.TokensAnnotation.class)) {
                // word() returns the token text as a plain String; write one token per line
                writer.println(cl.word());
            }
        }
    }

}

Make sure you get Stanford CoreNLP 3.5.2 from here: http://nlp.stanford.edu/software/corenlp.shtml
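
The difficulty raised in the question can also be handled while keeping its sentence-per-line layout: HasWord declares a word() method that returns the token text as a String, so each token can be converted before it is written. Below is a minimal sketch of the writing step only, assuming the strings have already been extracted that way. SentenceFileWriter and writeSentences are hypothetical names that are not part of CoreNLP, and the sample data stands in for real DocumentPreprocessor output so that the block runs without CoreNLP on the classpath. Unlike the loop in the question, it puts '|' between tokens rather than after every token.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of CoreNLP.
class SentenceFileWriter {

    // Writes one sentence per line, with tokens separated by '|'.
    static void writeSentences(List<List<String>> sentences, Path out) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (List<String> sentence : sentences) {
                writer.write(String.join("|", sentence));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data (assumption): with CoreNLP, each inner list would be
        // built by calling word() on every HasWord of one sentence.
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("Hello", "world", "."),
            Arrays.asList("Bye", "."));
        Path out = Files.createTempFile("tokens", ".txt");
        writeSentences(sentences, out);
        System.out.println("Wrote " + out);
    }
}
```

To connect this to the question's loop, one would collect token.word() for each HasWord in a sentence into a List<String> and pass the resulting lists to writeSentences.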