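I have some code that computes the word frequencies in a given ArrayList. I have a Frequency class that basically stores a word and its frequency. Here is my code: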
public static List<Frequency> computeWordFrequencies(List<String> words) {
    List<String> wordsList = words;
    String[] wordsArray = wordsList.toArray(new String[0]);
    Arrays.sort(wordsArray);
    Set<String> noDuplicates = new LinkedHashSet<>(Arrays.asList(wordsArray));
    List<Frequency> frequencies = new ArrayList<>();
    for (String word : noDuplicates) {
        int wordFrequency = Collections.frequency(words, word);
        Frequency newFrequency = new Frequency(word, wordFrequency);
        System.out.println(newFrequency.toString());
        frequencies.add(newFrequency);
    }
    for (Frequency f : frequencies) {
        System.out.println(f.getText()+" "+f.getFrequency());
    }
    return frequencies;
}
For reference, the Frequency class:
public class Frequency {
    private final String word;
    private static int frequency;

    public Frequency(String word) {
        this.word = word;
        frequency = 0;
    }

    public Frequency(String word, int newfrequency) {
        this.word = word;
        this.frequency = newfrequency;
    }

    public String getText() {
        return word;
    }

    public int getFrequency() {
        return frequency;
    }

    public static void setFrequency(int newFrequency) {
        frequency = newFrequency;
    }

    public void incrementFrequency() {
        frequency++;
    }

    @Override
    public String toString() {
        return word + ":" + frequency;
    }
}
I inserted print statements into my code, and this is some of the output:
wrap:1
yard:3
yarn:2
year:2
yet:1
yukon:1
zero:2
abandon 2
accordion 2
acequia 2
across 2
add 2
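So when the Frequency objects are created they have the correct frequency, but somehow they all change to 2 later. Even stranger, if I change the second print statement to f.toString(), even the first print statement only shows frequencies of 2, like this: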
rise:2
river:2
river:2
river:2
road:2
walker:2
roadside:2
roast:2
Can anyone tell me why all the frequencies are being set to 2, or where the problem is?
Answer 0 (score: 1)
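Remove static from
private static int frequency
You want a separate instance variable for each word, not a single class variable shared by all words. Because the field is static, every constructor call overwrites the same shared value, so every Frequency object reports whatever was assigned last. The setFrequency method should become non-static as well.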
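As a rough sketch of that change, the class could look something like the following. The wrapper class FrequencyDemo and the sample words in main are only assumptions for illustration, not part of the original code; the point is that frequency is now a per-object field.

```java
import java.util.ArrayList;
import java.util.List;

// FrequencyDemo is a hypothetical wrapper used only to make the sketch runnable.
public class FrequencyDemo {

    // The Frequency class from the question, with "static" removed.
    public static final class Frequency {
        private final String word;
        private int frequency; // instance field: each object keeps its own count

        public Frequency(String word, int frequency) {
            this.word = word;
            this.frequency = frequency;
        }

        public String getText() {
            return word;
        }

        public int getFrequency() {
            return frequency;
        }

        // Non-static now, so it only changes this object's count.
        public void setFrequency(int newFrequency) {
            this.frequency = newFrequency;
        }

        public void incrementFrequency() {
            frequency++;
        }

        @Override
        public String toString() {
            return word + ":" + frequency;
        }
    }

    public static void main(String[] args) {
        // Sample data (assumed), mirroring two entries from the question's output.
        List<Frequency> frequencies = new ArrayList<>();
        frequencies.add(new Frequency("yard", 3));
        frequencies.add(new Frequency("zero", 2));
        for (Frequency f : frequencies) {
            System.out.println(f.getText() + " " + f.getFrequency());
        }
    }
}
```

With the instance field, creating the second object no longer changes the count stored in the first one, so the loop prints 3 for "yard" and 2 for "zero".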
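Also, I would suggest using a HashMap<String, Integer> as the frequency counter instead of creating wrapper objects. Collections.frequency scans the entire list on every call, so calling it once per distinct word gives O(n^2) runtime in the worst case, whereas a map lets you count everything in a single pass.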
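A minimal sketch of the map-based approach might look like this. The class name WordCounter, the method name countWords, and the sample input are assumptions made for the example; Map.merge requires Java 8 or later.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// WordCounter and countWords are hypothetical names used for illustration.
public class WordCounter {

    // Counts every word in a single pass over the list.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input (assumed).
        List<String> words = Arrays.asList("yard", "zero", "yard", "yarn", "yard", "zero");
        Map<String, Integer> counts = countWords(words);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

A HashMap does not iterate in any particular order. If you need the words sorted, as the Arrays.sort call in the question suggests, a TreeMap can be swapped in for the HashMap at the cost of O(log n) per update.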