I created a console application that displays the most frequently used word in a text file. Please see the code below:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Main
{
    public static void main(String[] args)
    {
        readTextFile();
    }

    private static void readTextFile()
    {
        final String path = "C://Users//Geffrey//IdeaProjects//test.txt";
        File file = new File(path);
        Map<String, Integer> wordMap = new HashMap<>();
        // try-with-resources closes the reader even when opening or reading fails
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file)))
        {
            String inputLine;
            while ((inputLine = bufferedReader.readLine()) != null)
            {
                // split the line on punctuation and whitespace
                String[] words = inputLine.split("[.,;:!?(){}— \\s]");
                for (String word : words)
                {
                    String key = word.toLowerCase(); // remove .toLowerCase() for a case-sensitive result
                    if (key.length() > 0)
                    {
                        // store 1 for a new word, otherwise add 1 to the existing count
                        wordMap.merge(key, 1, Integer::sum);
                    }
                }
            }
            List<WordComparable> topOccurrence = findMaxOccurance(wordMap, 1);
            if (topOccurrence.isEmpty())
            {
                System.out.println("No words found");
            } else
            {
                // maximum occurrence of a word in the file
                System.out.println("Most Frequent word: " + topOccurrence.get(0).wordFromFile + " occurred " + topOccurrence.get(0).numberOfOccurrence + " times");
            }
        } catch (IOException error)
        {
            // FileNotFoundException is an IOException, so a missing file also ends up here
            System.out.println("Invalid File");
        }
    }

    // returns the n most frequent words, most frequent first
    public static List<WordComparable> findMaxOccurance(Map<String, Integer> map, int n)
    {
        List<WordComparable> list = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : map.entrySet())
        {
            list.add(new WordComparable(entry.getKey(), entry.getValue()));
        }
        Collections.sort(list);
        // keep only the first n entries of the sorted list
        return new ArrayList<>(list.subList(0, Math.min(n, list.size())));
    }
}
The WordComparable class:
public class WordComparable implements Comparable<WordComparable>
{
    public String wordFromFile;
    public int numberOfOccurrence;

    public WordComparable(String wordFromFile, int numberOfOccurrence)
    {
        super();
        this.wordFromFile = wordFromFile;
        this.numberOfOccurrence = numberOfOccurrence;
    }

    // orders by count, highest first; ties are broken alphabetically by word
    @Override
    public int compareTo(WordComparable arg0)
    {
        int wordCompare = Integer.compare(arg0.numberOfOccurrence, this.numberOfOccurrence);
        return wordCompare != 0 ? wordCompare : wordFromFile.compareTo(arg0.wordFromFile);
    }

    // combines both fields so that hashCode stays consistent with equals
    @Override
    public int hashCode()
    {
        final int uniqueNumber = 19;
        int wordResult = 9;
        wordResult = uniqueNumber * wordResult + numberOfOccurrence;
        wordResult = uniqueNumber * wordResult + ((wordFromFile == null) ? 0 : wordFromFile.hashCode());
        return wordResult;
    }

    // two instances are equal when both the word and its count match
    @Override
    public boolean equals(Object obj)
    {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        WordComparable other = (WordComparable) obj;
        if (numberOfOccurrence != other.numberOfOccurrence)
            return false;
        if (wordFromFile == null)
        {
            if (other.wordFromFile != null)
                return false;
        } else if (!wordFromFile.equals(other.wordFromFile))
            return false;
        return true;
    }
}
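My question is whether my solution is the most efficient way to solve this problem and, if it is not, what other changes I could make to improve the code.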
Answer 0 (score: 0)
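You could consider building a max-priority queue (a classic max-heap) instead of a HashMap.
With a max-heap, once it has been built, the most frequent word sits at the root and can be read in O(1) time.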
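The answer gives no code, so what follows is only a rough sketch of how the max-heap idea might look, not the answerer's own implementation. It assumes the counts are still gathered in a HashMap first, because a heap on its own cannot look up and increment the count of a word; the heap is then built from the finished counts. The class name HeapWordCount and the methods countWords, buildMaxHeap and topWords are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper class for illustration; not part of the original post.
class HeapWordCount
{
    // Counts words using the same delimiters and lower-casing as the question's code.
    // \u2014 is the long dash from the question's delimiter set.
    static Map<String, Integer> countWords(List<String> lines)
    {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines)
        {
            for (String word : line.split("[.,;:!?(){}\u2014 \\s]"))
            {
                if (!word.isEmpty())
                {
                    counts.merge(word.toLowerCase(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Builds a max-heap of the map entries: highest count first, ties broken
    // alphabetically (the same order as WordComparable.compareTo).
    // The map must not be modified while the heap is in use.
    static PriorityQueue<Map.Entry<String, Integer>> buildMaxHeap(Map<String, Integer> counts)
    {
        Comparator<Map.Entry<String, Integer>> byCountDesc = (a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCountDesc);
        heap.addAll(counts.entrySet());
        return heap;
    }

    // Polls the heap up to n times to collect the n most frequent words.
    static List<String> topWords(Map<String, Integer> counts, int n)
    {
        PriorityQueue<Map.Entry<String, Integer>> heap = buildMaxHeap(counts);
        List<String> result = new ArrayList<>();
        while (result.size() < n && !heap.isEmpty())
        {
            result.add(heap.poll().getKey());
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> lines = Arrays.asList("The cat and the dog.", "The dog barked!");
        Map<String, Integer> counts = countWords(lines);
        // peek() reads the root of the heap without removing it
        Map.Entry<String, Integer> top = buildMaxHeap(counts).peek();
        System.out.println("Most Frequent word: " + top.getKey() + " occurred " + top.getValue() + " times");
        System.out.println(topWords(counts, 2));
    }
}
```

In this sketch peek() is the O(1) step the answer refers to, while filling the heap with addAll costs O(k log k) for k distinct words and each poll() costs O(log k). If only the single most frequent word is needed, one pass over the map that tracks the largest count would avoid both the heap and the full sort.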