Question

I have the following code, which counts and displays how many times each word occurs in an entire text document.

try {
    List<String> list = new ArrayList<String>();
    int totalWords = 0;
    int uniqueWords = 0;
    File fr = new File("filename.txt");
    Scanner sc = new Scanner(fr);
    while (sc.hasNext()) {
        String words = sc.next();
        String[] space = words.split(" ");
        for (int i = 0; i < space.length; i++) {
            list.add(space[i]);
        }
        totalWords++;
    }
    System.out.println("Words with their frequency..");
    Set<String> uniqueSet = new HashSet<String>(list);
    for (String word : uniqueSet) {
        System.out.println(word + ": " + Collections.frequency(list,word));
    }
} catch (Exception e) {

    System.out.println("File not found");

}

Is it possible to modify this code so that it counts each word only once per line, rather than across the whole document?

Answer 1

You can read the file one line at a time and apply the word-counting logic to each line:

    File fr = new File("filename.txt");
    // Pass the File declared above (the variable is named fr, not file)
    FileReader fileReader = new FileReader(fr);
    BufferedReader br = new BufferedReader(fileReader);

    // Read the file line by line; readLine() returns null at end of file
    String line = null;
    while ((line = br.readLine()) != null) {
        // Code to count the occurrences of the words in this line

    }
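One way to fill in the placeholder comment above is sketched below. It assumes words are separated by whitespace and that "per line" means a separate frequency table for each line. The class name `PerLineWordCount` and the helpers `countWords` and `countPerLine` are hypothetical names introduced for illustration, and the sample text is read from a `StringReader` so the sketch runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerLineWordCount {

    // Hypothetical helper: counts how often each word occurs within one line.
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {               // a blank line yields one empty token
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Hypothetical helper: returns one frequency map per line, in file order.
    static List<Map<String, Integer>> countPerLine(BufferedReader br) throws IOException {
        List<Map<String, Integer>> result = new ArrayList<>();
        String line;
        while ((line = br.readLine()) != null) {
            result.add(countWords(line));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample text (an assumption for the demo); for a real file,
        // wrap new FileReader("filename.txt") in the BufferedReader instead.
        try (BufferedReader br = new BufferedReader(new StringReader("a b a\nb c"))) {
            int lineNo = 1;
            for (Map<String, Integer> counts : countPerLine(br)) {
                System.out.println("Line " + lineNo++ + ": " + counts);
            }
        }
    }
}
```

Using a `Map` with `merge` avoids calling `Collections.frequency` once per unique word, which rescans the whole list each time.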

Answer 2

Yes. A Set is a collection much like an ArrayList, but the key difference is that it holds no duplicates. So just use a Set. In your while loop:

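while (sc.hasNextLine()) {
    // Read a whole line; sc.next() would return only one token at a time
    String line = sc.nextLine().trim();
    if (line.isEmpty()) {
        continue; // skip blank lines
    }
    String[] space = line.split("\\s+");
    // Convert the array to a set to drop duplicates within this line
    Set<String> set = new HashSet<String>(Arrays.asList(space));
    // A Set has no length field or index access, so add its elements in bulk
    list.addAll(set);
    totalWords += space.length;
}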

The rest of the code should stay the same.
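Put together, the Set-based approach might look like the sketch below. It assumes whitespace-separated words and reads the question as "in how many lines does each word appear". The class name `LinesContainingWord` and the method `countLinesPerWord` are hypothetical, and a `TreeMap` replaces the list plus `Collections.frequency` so the totals come out sorted by word; that swap is a choice made here, not part of the answer above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class LinesContainingWord {

    // Hypothetical helper: maps each word to the number of lines it appears on.
    static Map<String, Integer> countLinesPerWord(BufferedReader br) throws IOException {
        Map<String, Integer> lineCounts = new TreeMap<>();
        String line;
        while ((line = br.readLine()) != null) {
            // The set removes repeats, so a word counts at most once per line.
            Set<String> wordsOnLine =
                    new HashSet<>(Arrays.asList(line.trim().split("\\s+")));
            wordsOnLine.remove("");              // produced by blank lines
            for (String word : wordsOnLine) {
                lineCounts.merge(word, 1, Integer::sum);
            }
        }
        return lineCounts;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample text (an assumption for the demo); for a real file,
        // wrap new FileReader("filename.txt") in the BufferedReader instead.
        String sample = "a a b\na c\n";
        try (BufferedReader br = new BufferedReader(new StringReader(sample))) {
            System.out.println("Words with the number of lines they appear on:");
            for (Map.Entry<String, Integer> e : countLinesPerWord(br).entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue());
            }
        }
    }
}
```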
