I have the following code, which counts and displays how many times each word occurs in an entire text document.
    try {
        List<String> list = new ArrayList<String>();
        int totalWords = 0;
        int uniqueWords = 0;
        File fr = new File("filename.txt");
        Scanner sc = new Scanner(fr);
        while (sc.hasNext()) {
            String words = sc.next();
            String[] space = words.split(" ");
            for (int i = 0; i < space.length; i++) {
                list.add(space[i]);
            }
            totalWords++;
        }
        System.out.println("Words with their frequency..");
        Set<String> uniqueSet = new HashSet<String>(list);
        for (String word : uniqueSet) {
            System.out.println(word + ": " + Collections.frequency(list, word));
        }
    } catch (Exception e) {
        System.out.println("File not found");
    }
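Is it possible to modify this code so that a word is counted only once per line, rather than once per occurrence in the whole document?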
Answer 0 (score: 1)
You can read the file one line at a time and then apply the word-counting logic to each line:
    File fr = new File("filename.txt");
    FileReader fileReader = new FileReader(fr);
    BufferedReader br = new BufferedReader(fileReader);
    // Read the file one line at a time
    String line = null;
    while ((line = br.readLine()) != null) {
        // Code to count the occurrences of the words in this line
    }
    br.close();
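The loop body above is left open, so here is a minimal sketch of one way to fill it in. It assumes that words are separated by whitespace and that "once per line" means a word adds 1 to its count for every line it appears on. The class name `LineWordCounter` and the helper `countLinesContaining` are hypothetical names chosen for illustration, and the `Map` replaces the original `List` plus `Collections.frequency` approach.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class LineWordCounter {

    // Hypothetical helper: for each word, counts the number of lines containing it.
    static Map<String, Integer> countLinesContaining(BufferedReader br) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        String line;
        while ((line = br.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // Deduplicate the words of this line so each one counts once per line
            Set<String> wordsInLine = new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
            for (String word : wordsInLine) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("filename.txt"))) {
            for (Map.Entry<String, Integer> e : countLinesContaining(br).entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue());
            }
        }
    }
}
```

Taking a `BufferedReader` as the parameter keeps the counting logic independent of the file, so it can also be tried on a `StringReader` without touching the disk.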
Answer 1 (score: 0)
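Yes. A Set is a collection much like an ArrayList, but the key difference is that it holds no duplicates. So just use a set for the words of each line. In your while loop: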
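    while (sc.hasNextLine()) {
        // Read a whole line; sc.next() would return only a single token
        String line = sc.nextLine().trim();
        if (line.isEmpty()) {
            continue; // skip blank lines
        }
        String[] space = line.split("\\s+");
        // Convert the array to a set to drop duplicates within this line
        Set<String> set = new HashSet<String>(Arrays.asList(space));
        // A Set has no length field or index access, so use a for-each loop
        for (String word : set) {
            list.add(word);
        }
        totalWords += space.length;
    }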
The rest of the code should stay the same.
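To see why deduplicating each line gives per-line counts when combined with `Collections.frequency`, here is a small sketch that works on in-memory lines instead of a file. The class `PerLineFrequencyDemo`, the helper `oncePerLine`, and the sample sentences are all made up for illustration; a `LinkedHashSet` is used only to keep the printed order predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

class PerLineFrequencyDemo {

    // Hypothetical helper: returns a list holding each word once per line it appears on.
    static List<String> oncePerLine(List<String> lines) {
        List<String> list = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The set drops repeated words within this one line
            list.addAll(new LinkedHashSet<>(Arrays.asList(trimmed.split("\\s+"))));
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat saw the dog", "the dog ran");
        List<String> list = oncePerLine(lines);
        // Same reporting step as in the question, applied to the deduplicated list
        for (String word : new LinkedHashSet<>(list)) {
            System.out.println(word + ": " + Collections.frequency(list, word));
        }
    }
}
```

With these two sample lines, "the" is reported as 2 even though it occurs three times in total, because its two occurrences on the first line collapse into one.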