Question

I'm a beginner in Java. I just want to count the occurrences of each word in a text file. The input format looks like this:

A B
A C
C A
B C

Here is what I have done so far:

public static void main (String[] args) throws FileNotFoundException
{
    Scanner inputFile = new Scanner(new File("test.txt"));
    while (inputFile.hasNextLine()) {
        String line = inputFile.nextLine();
        System.out.println(line);
        // above is the first part, to read the file in
        // below is the second part, try to count
        Map<String, Integer> counts = new HashMap<>();
        for (String word : line) {
            Integer count = counts.get(word);
            counts.put(word, count == null ? 1 : count + 1);
        }
        System.out.println(counts);
    }
}

The expected result is:

A 3
B 2
C 3

I found the first and second parts on Google, but I don't know how to combine them. Any suggestions would be helpful.

Answer 1

You cannot iterate over a String (the variable line) with a for-each loop. You need to split it into words first, like this:

    String[] words = line.split(" ");
    for (String word : words) {
        // do something
    }
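One caveat with splitting on a single space: if words are separated by several spaces or by tabs, the result contains empty strings. A minimal sketch of a more forgiving split, assuming the input is whitespace-separated (the class and method names `SplitDemo` and `splitWords` are made up for illustration):

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: split a line on runs of whitespace.
    public static String[] splitWords(String line) {
        String trimmed = line.trim();
        // Without this check, a blank line would yield one empty "word".
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // "\\s+" matches one or more spaces, tabs, or other whitespace.
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWords("A  B")));    // [A, B]
        System.out.println(Arrays.toString(splitWords("  C\tA "))); // [C, A]
    }
}
```

For input like the sample in the question, where words are separated by exactly one space, `line.split(" ")` behaves the same way.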

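There also seems to be another bug in the code. The Map that holds the counts needs to be declared outside the while loop; otherwise the counts are local to each line. Change the code as follows: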

public static void main (String[] args) throws FileNotFoundException
{
    Scanner inputFile = new Scanner(new File("test.txt"));
    Map<String, Integer> counts = new HashMap<>();
    while (inputFile.hasNextLine()) {
        String line = inputFile.nextLine();
        System.out.println(line);
        // above is the first part, to read the file in
        // below is the second part, try to count

        String[] words = line.split(" ");
        for (String word : words) {
            Integer count = counts.get(word);
            counts.put(word, count == null ? 1 : count + 1);
        }

    } // end of while

    System.out.println(counts);
}
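The expected output in the question lists the words in sorted order, one per line, while a HashMap makes no ordering guarantee. A possible variant, assuming Java 8 or later, uses a TreeMap together with `Map.merge`; it takes the lines as a list so that it runs without test.txt, and the names `SortedWordCount` and `countWords` are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedWordCount {
    // Hypothetical helper: count words across all lines, keys kept in sorted order.
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (!word.isEmpty()) {
                    // merge() stores 1 for a new key, otherwise adds 1 to the old value.
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A B", "A C", "C A", "B C");
        // Prints "A 3", "B 2", "C 3" on separate lines.
        for (Map.Entry<String, Integer> entry : countWords(lines).entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```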

Answer 2

You need to read words, not just lines.

Since the default delimiter of Scanner is whitespace, which separates each word correctly, you can try:

while (inputFile.hasNext()) {
    String word = inputFile.next();
    // do the same as before with word
}
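To illustrate the point about the default delimiter: Scanner treats spaces, tabs, and line breaks alike, so `next()` returns one word at a time regardless of how the lines are laid out. A small sketch that reads from a String instead of a file (the class name `ScannerTokens` and the method `tokens` are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerTokens {
    // Hypothetical helper: collect every whitespace-separated token in order.
    public static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Mixed separators: space, newline, tab.
        System.out.println(tokens("A B\nA\tC")); // [A, B, A, C]
    }
}
```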

Answer 3

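inputFile.nextLine() returns a String containing the words of the current line. What you want to do is split it into an array of strings (your words) and then iterate over them. Have a look at String.split().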

Answer 4

    Scanner inputFile = new Scanner(new File("C:/Test/test.txt"));
    Map<String, Integer> counts = new HashMap<>();
    while (inputFile.hasNextLine()) {
        String line = inputFile.nextLine();
        for (String word : line.split(" ")) {
            Integer count = counts.get(word);
            counts.put(word, count == null ? 1 : count + 1);
        }
    }
    System.out.println(counts);

With the Java 7 Files API, you can do it as follows:

public static void main(String[] args) throws IOException{
    List<String> allLines = Files.readAllLines(Paths.get("C:/Test/test.txt"), Charset.defaultCharset());
    Map<String,Integer> charCount = new HashMap<String,Integer>();
    for(String line:allLines){
        String[] characters = line.split(" ");
        for(String charac:characters){
            Integer currentCount = charCount.get(charac);
            charCount.put(charac, currentCount == null ? 1 : currentCount + 1); 
        }
    }
    System.out.println(charCount);
}
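On Java 8 or later the same idea can be written with streams. This is only a sketch and not part of the original answer: the method takes a `Stream<String>` of lines so that it can be tried without a file, and the class name `StreamWordCount` is made up. Note that the counts come back as `Long` because of `Collectors.counting()`.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamWordCount {
    // Hypothetical helper: group identical words and count each group.
    public static Map<String, Long> count(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty()) // skip blank lines
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // With a real file, the stream could come from Files.lines(Paths.get(...)).
        System.out.println(count(Stream.of("A B", "A C", "C A", "B C"))); // {A=3, B=2, C=3}
    }
}
```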

Answer 5

This will work. Note that the Scanner reads each word rather than each line.

public static void main (String[] args) throws FileNotFoundException 
{
    Scanner scanner = new Scanner("A B C D A A B C C");
    Map<String, Integer> words = new HashMap<>();
    String word;

    // Loop through each word instead of each line
    while (scanner.hasNext()) {
        word = scanner.next();

        // If the HashMap already contains the key, increment the value
        if (words.containsKey(word)){
            words.put(word, words.get(word) + 1);
        }
        // Otherwise, set the value to 1
        else {
            words.put(word, 1);
        }        
    }

    // Loop through the HashMap and print the results
    for(Entry<String, Integer> entry : words.entrySet()) {
        String key = entry.getKey();
        Integer value = entry.getValue();

        System.out.println(key + ": " + value);
    }
}

Answer 6

You can use StringTokenizer to get the individual words. It splits a string into tokens and is useful for many string-processing tasks.

    String msg = "http://192.173.15.36:8084/";
    StringTokenizer st = new StringTokenizer(msg, "://.");

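By passing a custom delimiter string to StringTokenizer, as above, we can get different kinds of tokens. Note that the delimiters are a set of individual characters, not a regular expression; for regex-based splitting, use String.split().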
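To show what the delimiter argument in the snippet above does: each character of the second argument is treated as a separate delimiter, so "://." splits on ':', '/' and '.'. A small runnable sketch (the class name `TokenizerDemo` and the method `tokens` are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collect all tokens produced for the given delimiter characters.
    public static List<String> tokens(String text, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Consecutive delimiters such as "://" do not produce empty tokens.
        System.out.println(tokens("http://192.173.15.36:8084/", "://."));
        // [http, 192, 173, 15, 36, 8084]
    }
}
```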

Here is a complete solution that counts the words in a file:

public static void main(String[] args) {

    Scanner inputFile;
    Map<String, Integer> words = new HashMap<String, Integer>();
    try {
        inputFile = new Scanner(new File("d:\\test.txt"));
        while (inputFile.hasNextLine()) {
            // StringTokenizer splits the line on whitespace by default.
            StringTokenizer tokenizer = new StringTokenizer(inputFile.nextLine());
            while (tokenizer.hasMoreTokens()) {
                String word = tokenizer.nextToken();
                // If the HashMap already contains the key, increment the value
                if (words.containsKey(word)) {
                    words.put(word, words.get(word) + 1);
                }
                // Otherwise, set the value to 1
                else {
                    words.put(word, 1);
                }
            }
        }

    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

     // Loop through the HashMap and print the results
    for(Entry<String, Integer> entry : words.entrySet()) {
        String key = entry.getKey();
        Integer value = entry.getValue();

        System.out.println(key + ": " + value);
    }
}

required: array or java.lang.Iterable
