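I have implemented code that counts the number of characters, words, lines, and bytes in a text file. But how do I count the dictionary size, that is, the number of distinct words used in the file? Also, how can I implement an iterator that iterates over letters only (ignoring whitespace)?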
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class wc {
    public static void main(String[] args) throws IOException {
        // counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;
        File file = new File("Sample.txt");
        try (Scanner scanner = new Scanner(new BufferedReader(new FileReader(file)))) {
            while (scanner.hasNextLine()) {
                // trim first: split("\\s+") on a line with leading whitespace
                // would otherwise yield an empty first token and overcount words
                String tmpStr = scanner.nextLine().trim();
                if (!tmpStr.isEmpty()) {
                    // characters are counted with all whitespace removed
                    String replaceAll = tmpStr.replaceAll("\\s+", "");
                    charsCount += replaceAll.length();
                    wordsCount += tmpStr.split("\\s+").length;
                }
                ++linesCount;
            }
            System.out.println("# of chars: " + charsCount);
            System.out.println("# of words: " + wordsCount);
            System.out.println("# of lines: " + linesCount);
            System.out.println("# of bytes: " + file.length());
        }
    }
}
Answer 0 (score: 0)
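To get the unique words and how many of them there are:
1. Split each line read from the file into an array of strings.
2. Store the contents of that array in a HashSet.
3. Repeat steps 1 and 2 until the end of the file.
4. Read the unique words, and their count, from the HashSet.
I prefer to post the logic and pseudocode, because it helps the OP learn something by working through the posted problem themselves.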
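For readers who want something concrete to compare their own attempt against, the four steps above could be sketched roughly as follows. This is only one possible reading of the steps: the class name `UniqueWordCounter` and the method `uniqueWords` are made up for illustration, `Sample.txt` is the file name from the question, and a "word" is assumed to be any whitespace-separated token, so `The`, `the` and `the,` all count as different words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of the original question or answer.
public class UniqueWordCounter {

    // Returns every distinct whitespace-separated token found in the file.
    static Set<String> uniqueWords(Path path) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            String line;
            // Step 3: repeat until the end of the file.
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // blank lines contribute no words
                }
                // Step 1: split the line into an array of strings.
                for (String word : trimmed.split("\\s+")) {
                    // Step 2: the set silently ignores duplicates.
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Step 4: the size of the set is the dictionary size.
        Set<String> words = uniqueWords(Paths.get("Sample.txt"));
        System.out.println("# of unique words: " + words.size());
    }
}
```

If case or punctuation should not matter, each token could be normalized before it is added (for example with `toLowerCase()`); whether that is wanted depends on how "word" is defined for the task.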
Answer 1 (score: -2)
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class CountUniqueWords {
    public static void main(String[] args) throws FileNotFoundException {
        File f = new File("File Name");
        // A Set stores each word only once, so its size is the number of distinct words.
        Set<String> uniqueWords = new HashSet<>();
        // Scanner.next() returns whitespace-delimited tokens;
        // try-with-resources closes the file when done.
        try (Scanner in = new Scanner(f)) {
            while (in.hasNext()) {
                uniqueWords.add(in.next());
            }
        }
        //System.out.println(uniqueWords); // for printing the words
        System.out.println("The number of unique words: " + uniqueWords.size());
    }
}