Java: Count the frequency of specific words and write it to a file

Time: 2016-05-24 13:31:08

Tags: java text frequency

I want to read all the text files in a folder, count specific words in each file, and then write the word frequencies to a txt file. Here is my code:

public static void main(String[] args) throws IOException{

    String[] array = new String[]{"beautiful", "good", "extraordinary", "wonderful", "like",
            "proud","brilliant","great","well", "perfect"};
    int[] wordCount = new int[]{0,0,0,0,0,0,0,0,0,0};

    File path = new File("development/text");

    for(File file: path.listFiles()){

        PrintWriter writer = new PrintWriter(new FileWriter("result.txt",true));            
        FileInputStream fin =  new FileInputStream(file);
        Scanner s = new Scanner(fin);

        for(int i = 0; i < array.length; i++)
        {
            wordCount[i] = 0;
            while (s.hasNext())
            {
                if (s.next().equals(array[i])) {
                    wordCount[i]++;
                }
            }
        }

        writer.println(wordCount[0] + "," + wordCount[1] + "," + wordCount[2] + "," + 
                wordCount[3] + "," + wordCount[4] + "," +wordCount[5] + "," +wordCount[6] 
                + "," +wordCount[7] + "," + wordCount[8] + "," + wordCount[9]);
        fin.close();
        s.close();
        writer.close();
    }
}

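However, my code only counts the first element of the array ("beautiful"). The other elements are output as 0, even though they appear in the text files.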

3 Answers:

Answer 0: (score: 1)

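Reverse the order of your loops. Instead of "for each word in the array, read the file word by word and check for a match", do "for each word in the file, check it against every word in the array".

The problem with your current implementation is that it never "rewinds" the file. Once the Scanner reaches the end of the file while counting the first word, it does not go back and start from the beginning. Restarting from the beginning of the file is much more expensive than restarting from the beginning of the array, so reversing the loop order is the best solution: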

while (s.hasNext()) {
    String word = s.next();
    for(int i = 0; i < array.length; i++) {
        if (word.equals(array[i])) {
            wordCount[i]++;
        }
    }
}
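
As a self-contained sketch of the reversed loop, the snippet above can be wrapped in a method. The class name WordFrequency and the method name countWords are made up for illustration and are not part of the original code.

```java
import java.util.Scanner;

class WordFrequency {
    // Hypothetical helper: reads every token exactly once and compares it
    // against each tracked word, so the input never has to be re-read.
    static int[] countWords(Scanner s, String[] words) {
        int[] counts = new int[words.length];
        while (s.hasNext()) {
            String token = s.next();
            for (int i = 0; i < words.length; i++) {
                if (token.equals(words[i])) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }
}
```

For example, calling countWords(new Scanner("good food, great mood, good day"), new String[]{"good", "great", "well"}) returns the counts 2, 1, 0. Note that Scanner splits on whitespace by default, so a token such as "good," (with trailing punctuation) would not match "good".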

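Answer 1: (score: 0)

Move the line Scanner s = new Scanner(fin); so that it sits directly above while (s.hasNext()), inside the for loop over array. (fin would have to be reopened there as well, because the stream is already at its end after the first pass.)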

OR

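Actually, as dasblinkenlight told you, you should change the order of the loops. I also found a few other problems in your code:

a) The PrintWriter should be created above your listFiles loop.
b) writer.close() should come after the main for loop. fin and s are declared inside the loop, so they have to be closed at the end of each iteration.

With the above changes, the code would look something like this:

List<String> words = Arrays.asList(array);

// Open the output file once, before iterating over the input files.
PrintWriter writer = new PrintWriter(new FileWriter("result.txt", true));
for (File file : path.listFiles())
{
    FileInputStream fin = new FileInputStream(file);
    Scanner s = new Scanner(fin);

    while (s.hasNext())
    {
        // indexOf returns -1 when the token is not one of the tracked words.
        int i = words.indexOf(s.next());
        if (i >= 0)
        {
            wordCount[i]++;
        }
    }

    // Closing the Scanner also closes the underlying FileInputStream.
    s.close();
}

// wordCount is never reset, so this line holds the totals across all files.
writer.println(wordCount[0] + "," + wordCount[1] + "," + wordCount[2] + "," +
        wordCount[3] + "," + wordCount[4] + "," + wordCount[5] + "," + wordCount[6]
        + "," + wordCount[7] + "," + wordCount[8] + "," + wordCount[9]);
writer.close();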
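
If one output line per input file is wanted, as the question describes, the same idea could be arranged roughly as follows. This is only a sketch under assumptions: the class name PerFileWordCounts and the method writeCounts are hypothetical, the counters are reset for every file, the files are sorted by path for a stable order, and the output file is assumed to live outside the input folder so that it is not counted itself.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.StringJoiner;

class PerFileWordCounts {
    // Hypothetical helper: writes one comma-separated line of counts
    // per regular file found directly inside inputDir.
    static void writeCounts(File inputDir, File output, List<String> words) throws IOException {
        File[] files = inputDir.listFiles(File::isFile);
        if (files == null) {
            throw new IOException("Not a readable directory: " + inputDir);
        }
        Arrays.sort(files); // listFiles makes no promise about ordering

        // try-with-resources closes the writer and each scanner automatically.
        try (PrintWriter writer = new PrintWriter(output)) {
            for (File file : files) {
                int[] counts = new int[words.size()]; // fresh counters for each file
                try (Scanner s = new Scanner(file)) {
                    while (s.hasNext()) {
                        int i = words.indexOf(s.next());
                        if (i >= 0) {
                            counts[i]++;
                        }
                    }
                }
                StringJoiner line = new StringJoiner(",");
                for (int c : counts) {
                    line.add(Integer.toString(c));
                }
                writer.println(line);
            }
        }
    }
}
```

Unlike the code in the question, this version overwrites the output file instead of appending to it. That is a deliberate simplification for the sketch, not something the original code does.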

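Answer 2: (score: 0)

You have already read the entire file while counting the occurrences of the first word, "beautiful". By the time you start counting the second word, your Scanner s is already at the end of the file, so it returns nothing more.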
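
The effect can be reproduced in isolation with a Scanner over a plain string. The following is a minimal sketch; the names ScannerExhaustion, countRemaining, and twoPasses are invented for this illustration.

```java
import java.util.Scanner;

class ScannerExhaustion {
    // Consumes every remaining token of s and counts those equal to target.
    static int countRemaining(Scanner s, String target) {
        int n = 0;
        while (s.hasNext()) {
            if (s.next().equals(target)) {
                n++;
            }
        }
        return n;
    }

    // Runs two counting passes over the SAME Scanner and returns
    // {result of first pass, result of second pass}.
    static int[] twoPasses(String text, String firstWord, String secondWord) {
        Scanner s = new Scanner(text);
        int first = countRemaining(s, firstWord);
        // s is exhausted here, so this pass sees no tokens at all.
        int second = countRemaining(s, secondWord);
        s.close();
        return new int[]{first, second};
    }
}
```

With the input "beautiful good beautiful good", twoPasses reports 2 for "beautiful" but 0 for "good", which mirrors the output described in the question.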