Saving different POS (parts of speech) to different files using POSTagger in Java?

Asked: 2017-04-17 11:26:28

Tags: java nlp tokenize opennlp pos-tagger

I am using OpenNLP to tag parts of speech (POS).

    // Load the pre-trained maxent POS model and build the tagger
    InputStream inputStream = new FileInputStream("C:/en-pos-maxent.bin");
    POSModel model = new POSModel(inputStream);
    POSTaggerME tagger = new POSTaggerME(model);

    String sentence = "This is not a song for the broken-hearted" +
            " No silent prayer for the faith-departed " +
            " I am not gonna be just a face in the crowd " +
            " You are gonna hear my voice " +
            " When I shout it out loud";

    // Split the text on '.', '?', '!' and '-'
    String simple = "[.?!-]";
    String[] splitString = sentence.split(simple);

    SimpleTokenizer simpleTokenizer = SimpleTokenizer.INSTANCE;

    for (int i = 0; i < splitString.length; i++) {
        // Tokenize each fragment and tag its tokens
        String[] tokens = simpleTokenizer.tokenize(splitString[i]);
        String[] tags = tagger.tag(tokens);

        for (int j = 0; j < tags.length; j++) {
            if (tags[j].equals("DT")) {
                //System.out.println(tokens[j]);

                File file = new File("DT.txt");

                try {
                    PrintWriter output = new PrintWriter(file);
                    output.println(tokens[j]);
                    output.close();
                } catch (Exception e) {
                    // TODO: handle exception
                }
            }
        }
    }

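When I use println inside the loop, it prints the expected values. But when I try to save them to a file named DT.txt, only a single value ends up in the text file.

[Screenshots: text file output; printed output in console]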

1 Answer:

Answer 0 (score: 1)

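You are creating a new file on every run of the for loop. Each `new PrintWriter(file)` truncates DT.txt, so only the last token written survives. Create the file (and the writer) outside the for loop, and close it once the loop has finished.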
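One way to apply this to the goal in the title (one file per tag) is sketched below. It does not load an OpenNLP model. The `tokens` and `tags` arrays in `main` are hand-written sample data, and `PosFileWriter` and `writeByTag` are hypothetical names introduced only for this illustration. The idea is to open at most one writer per tag, reuse it for every token with that tag, and close all writers at the end.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP.
class PosFileWriter {

    // Writes each token to "<tag>.txt" inside dir, one token per line.
    // Each file is opened (and truncated) once per call, not once per token.
    // Caveat: some tag names (for example ":") are not valid file names on
    // every platform, so real code may need to map them to safe names.
    static void writeByTag(String[] tokens, String[] tags, Path dir) throws IOException {
        Map<String, PrintWriter> writers = new HashMap<>();
        try {
            for (int j = 0; j < tags.length; j++) {
                PrintWriter out = writers.get(tags[j]);
                if (out == null) {
                    // First token with this tag: open its file once.
                    out = new PrintWriter(Files.newBufferedWriter(
                            dir.resolve(tags[j] + ".txt"), StandardCharsets.UTF_8));
                    writers.put(tags[j], out);
                }
                out.println(tokens[j]);
            }
        } finally {
            // Close every writer so buffered output is flushed to disk.
            for (PrintWriter w : writers.values()) {
                w.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the output of tokenize() and tag().
        String[] tokens = {"This", "is", "not", "a", "song"};
        String[] tags = {"DT", "VBZ", "RB", "DT", "NN"};
        writeByTag(tokens, tags, Paths.get("."));
    }
}
```

In the code from the question, the same idea means keeping the writers alive across the outer sentence loop as well: calling a method like this once per sentence would truncate the files again on each call, so either collect all tokens and tags first or create the map of writers before the outer loop and close it after.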