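I am trying to make my code cleaner and more efficient. I am trying to implement zamzela's approach (you will find it in one of the answers), but I cannot get the comparator to work.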
public class WordCountExample {
    public static void main(String[] args) throws IOException {
        Set<WordCount> wordcount = new HashSet<WordCount>();
        File file = new File("c:\\test\\input1.txt"); // path to the file
        String str = FileUtils.readFileToString(file); // converts a file into a string
        String[] words = str.split("\\s+"); // split the line on whitespace,
                                            // would return an array of words
        for (String s : words) {
            wordcount.add(new WordCount(s));
            WordCount.incCount();
        }
        /* here WordCount is the name of the comparator class */
        Collections.sort(wordcount, new WordCount()); // getting an error here
        for (WordCount w : wordcount) {
            System.out.println(w.getValue() + " " + w.getCount());
        }
    }
}
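Answer 0 (score: 3)
Don't store the word count as the value in the map. Store an object that contains both the word and its number of occurrences.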
public class WordWithOccurrences {
    private final String word;
    private int occurrences;
    // ...
}
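So your map should be a Map<String, WordWithOccurrences>.
Then sort the list of values by their occurrences property, and iterate over the last 10 values to display their word property (or sort in reverse order and display the first ten values).
You will have to use a custom comparator to sort the WordWithOccurrences instances.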
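A minimal sketch of this approach, assuming Java 8 or later. Only the class name WordWithOccurrences and its two fields come from the answer; the accessors, the increment() helper, the TopWordsByObject class, and the sample text are hypothetical names added for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Holds a word together with the number of times it has been seen.
class WordWithOccurrences {
    private final String word;
    private int occurrences;

    WordWithOccurrences(String word) {
        this.word = word;
    }

    String getWord() { return word; }               // hypothetical accessor
    int getOccurrences() { return occurrences; }    // hypothetical accessor
    void increment() { occurrences++; }             // hypothetical helper
}

class TopWordsByObject {
    // Returns up to 'limit' entries, most frequent first.
    static List<WordWithOccurrences> top(String text, int limit) {
        Map<String, WordWithOccurrences> counts = new HashMap<>();
        for (String w : text.split("\\s+")) {
            if (w.isEmpty()) {
                continue; // split can yield an empty leading element
            }
            // Create the holder on first sight, then bump its counter.
            counts.computeIfAbsent(w, WordWithOccurrences::new).increment();
        }
        List<WordWithOccurrences> sorted = new ArrayList<>(counts.values());
        // Custom comparator: highest occurrence count first.
        sorted.sort(Comparator.comparingInt(WordWithOccurrences::getOccurrences).reversed());
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        for (WordWithOccurrences w : top("a b a c b a", 10)) {
            System.out.println(w.getWord() + " " + w.getOccurrences());
        }
    }
}
```

Sorting in descending order and taking the first ten is the second variant the answer mentions; it avoids having to index from the end of the list.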
Answer 1 (score: 2)
I think the best approach is to create a Word class:
public class Word implements Comparable<Word> {
    private String value;
    private Integer count;

    public Word(String value) {
        this.value = value;
        count = 1;
    }

    public String getValue() {
        return value;
    }

    public Integer getCount() {
        return count;
    }

    public void incCount() {
        count++;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Word)
            return value.equals(((Word) obj).getValue());
        else
            return false;
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public int compareTo(Word o) {
        return count.compareTo(o.getCount());
    }
}
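You can use a HashSet, because the count is stored in the bean itself. After you have filled everything in, copy the elements into a list, sort it with Collections.sort(array); and take the first 10 elements. Note that compareTo above orders words by ascending count, so sort in reverse order (for example with Collections.reverseOrder()) if you want the most frequent words first.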
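A rough sketch of how the Word bean might be used. One substitution is mine, not the answer's: a HashSet has no method for fetching the stored instance so that its count can be incremented, so this sketch keeps the beans in a HashMap keyed by the word. The Word class is a trimmed copy of the one above, and WordBeanDemo, topWords, and the sample text are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Trimmed copy of the Word bean (equals/hashCode omitted: the map key is the String).
class Word implements Comparable<Word> {
    private final String value;
    private Integer count = 1;

    Word(String value) { this.value = value; }
    String getValue() { return value; }
    Integer getCount() { return count; }
    void incCount() { count++; }

    @Override
    public int compareTo(Word o) {
        return count.compareTo(o.getCount()); // ascending by count
    }
}

class WordBeanDemo {
    // Returns up to 'limit' words, most frequent first.
    static List<Word> topWords(String text, int limit) {
        Map<String, Word> words = new HashMap<>();
        for (String s : text.split("\\s+")) {
            if (s.isEmpty()) {
                continue;
            }
            Word existing = words.get(s);
            if (existing == null) {
                words.put(s, new Word(s)); // first sighting, count starts at 1
            } else {
                existing.incCount();
            }
        }
        List<Word> array = new ArrayList<>(words.values());
        // compareTo is ascending, so reverse it to put the highest counts first.
        Collections.sort(array, Collections.reverseOrder());
        return array.subList(0, Math.min(limit, array.size()));
    }

    public static void main(String[] args) {
        for (Word w : topWords("x y x z x y", 10)) {
            System.out.println(w.getValue() + " " + w.getCount());
        }
    }
}
```

Collections.sort accepts only a List, which is why the values are copied into an ArrayList before sorting.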
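Answer 2 (score: 1)
I finally solved this. Here is a working program that reads a file, counts the words, and lists the 10 most frequently occurring words in descending order: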
import java.io.*;
import java.util.*;

public class Occurance {
    public static void main(String[] args) throws IOException {
        LinkedHashMap<String, Integer> wordcount =
                new LinkedHashMap<String, Integer>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new FileReader("c:\\test\\input1.txt"))) {
            String str;
            while ((str = in.readLine()) != null) {
                str = str.toLowerCase(); // convert to lower case
                String[] words = str.split("\\s+"); // split the line on whitespace, would return an array of words
                for (String word : words) {
                    if (word.length() == 0) {
                        continue;
                    }
                    Integer occurences = wordcount.get(word);
                    if (occurences == null) {
                        occurences = 1;
                    } else {
                        occurences++;
                    }
                    wordcount.put(word, occurences);
                }
            }
        } catch (Exception e) {
            System.out.println(e);
        }
        ArrayList<Integer> values = new ArrayList<Integer>();
        values.addAll(wordcount.values());
        Collections.sort(values, Collections.reverseOrder());
        int last_i = -1;
        // take the ten highest counts, or fewer if the file has fewer distinct words
        for (Integer i : values.subList(0, Math.min(10, values.size()))) {
            if (last_i == i) // without duplicates
                continue;
            last_i = i;
            for (String s : wordcount.keySet()) {
                if (wordcount.get(s).equals(i)) // equals(), because == compares Integer references
                    System.out.println(s + " " + i);
            }
        }
    }
}
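Answer 3 (score: 0)
Assuming your program doesn't actually work, here is a hint:
You are doing the comparison yourself, character by character. Without having gone through that code, I bet it is wrong: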
int idx1 = -1;
for (int i = 0; i < str.length(); i++) {
    if ((!Character.isLetter(str.charAt(i))) || (i + 1 == str.length())) {
        if (i - idx1 > 1) {
            if (Character.isLetter(str.charAt(i)))
                i++;
            String word = str.substring(idx1 + 1, i);
            if (wordcount.containsKey(word)) {
                wordcount.put(word, wordcount.get(word) + 1);
            } else {
                wordcount.put(word, 1);
            }
        }
        idx1 = i;
    }
}
Try using Java's built-in functionality instead:
String[] words = str.split("\\s+"); // split the line on whitespace, would return an array of words
for (String word : words) {
    if (word.length() == 0) {
        continue; // for empty lines, split would return at least one element which is ""; so account for that
    }
    Integer occurences = wordcount.get(word);
    if (occurences == null) {
        occurences = 1;
    } else {
        occurences++;
    }
    wordcount.put(word, occurences);
}
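Answer 4 (score: 0)
I would take a look at java.util.Comparator. You can define your own comparator and pass it to Collections.sort(). In your case, you would sort the keys of wordcount by their counts. Finally, just take the first ten items of the sorted collection.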
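A minimal sketch of sorting the keys with a custom comparator, assuming wordcount is a Map<String, Integer> as in the other answers. The class name SortKeysByCount, the method topKeys, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortKeysByCount {
    // Returns up to 'limit' keys of wordcount, ordered by descending count.
    static List<String> topKeys(final Map<String, Integer> wordcount, int limit) {
        // Collections.sort needs a List, so copy the key set first.
        List<String> keys = new ArrayList<>(wordcount.keySet());
        Collections.sort(keys, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                // Arguments swapped so that the higher count sorts first.
                return Integer.compare(wordcount.get(b), wordcount.get(a));
            }
        });
        return keys.subList(0, Math.min(limit, keys.size()));
    }

    public static void main(String[] args) {
        Map<String, Integer> wordcount = new HashMap<>();
        wordcount.put("the", 3);
        wordcount.put("a", 2);
        wordcount.put("z", 1);
        for (String key : topKeys(wordcount, 10)) {
            System.out.println(key + " " + wordcount.get(key));
        }
    }
}
```

Copying the keys into a list is what makes this work where the question's code fails: Collections.sort does not accept a Set.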
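If your wordcount map has too many items, you may need something more efficient. This can be done in linear time by maintaining an ordered array of size 10: insert each key in turn and always discard the key with the lowest count.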
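A sketch of the bounded selection idea, assuming Java 8 or later. It uses a min-heap (java.util.PriorityQueue) capped at k entries in place of the ordered array the answer describes; the principle is the same, since the entry with the lowest count is the one discarded. BoundedTopK and topK are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class BoundedTopK {
    // Returns the k entries with the highest counts, highest first.
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> wordcount, int k) {
        Comparator<Map.Entry<String, Integer>> byCount = Map.Entry.comparingByValue();
        // Min-heap: the head is always the weakest of the entries kept so far.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> e : wordcount.entrySet()) {
            heap.offer(e);
            if (heap.size() > k) {
                heap.poll(); // discard the entry with the lowest count
            }
        }
        // The heap holds at most k entries, so this final sort is cheap.
        List<Map.Entry<String, Integer>> result = new ArrayList<>(heap);
        result.sort(byCount.reversed());
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordcount = new HashMap<>();
        wordcount.put("a", 5);
        wordcount.put("b", 1);
        wordcount.put("c", 3);
        wordcount.put("d", 4);
        for (Map.Entry<String, Integer> e : topK(wordcount, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Each of the n entries costs at most O(log k) heap work, so for a fixed k of 10 the pass over the map is linear in n, and the full map is never sorted. When several words tie on the lowest retained count, which of them survives is unspecified.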