How to count words in a HashMap

Time: 2017-04-20 02:17:18

Tags: java

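I've cobbled this code together, with some help of course. Its purpose is to take a user's message and check it against a text file (about 50 common English words) to see whether any of the words in the message also appear in the file. My problem is that I also want to count the occurrences of each word in the message. I tried adding an int counter, but I get errors because I can't count String objects with a single integer. If I just add a counter: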

for (String userInput : message.split(" ")) 
{
    if (legalEnglishWords.contains(userInput)) 
    {
        System.out.println(userInput + " is an English word ");
        counter++;
        System.out.println(userInput + " occurs " + counter + " times");
    }
}

I get one iteration for every word that passes the legal-word test, so my output looks like:

Please enter in a message: 
transformers are is is this

transformers occurs 0 times
are is an English word 
are occurs 1 times
is is an English word 
is occurs 2 times
is is an English word 
is occurs 3 times
this is an English word 
this occurs 4 times

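As you can see, the count is completely wrong, and a word is displayed as many times as it appears in the message. I only want it displayed once. I thought about adding an && to the if statement above to check whether the word has already been printed, but I don't know what to put there. I also don't know how to count individual words inside the if statement, rather than incrementing the count every time the if statement is entered. So basically I need help counting individual words while displaying each "English" word only once, together with the number of times it occurred at the end. I'm still fairly new to Java, and I understand everything that happens in my program and why. So, if possible, I'd like to just add to this code to do what's needed, rather than overhaul it with things I might not understand. Thank you all very much!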

import java.io.*;
import java.util.HashSet;
import java.util.Scanner;

public class Affine_English3 
{       
        public static void main(String[] args) throws IOException
        {
              HashSet<String> legalEnglishWords = new HashSet<String>();
              Scanner file = new Scanner(new File("example.txt"));
              int counter = 0;

              while (file.hasNextLine())
              {
                  String line = file.nextLine();

                for (String word : line.split(" "))
                {
                    {
                        legalEnglishWords.add(word);
                    }
                }
              } 

                file.close();

                Scanner scan = new Scanner(System.in);
                System.out.println("Please enter in a message: ");
                String message = scan.nextLine();
                scan.close();

                for (String userInput : message.split(" ")) 
                {
                    if (legalEnglishWords.contains(userInput)) 
                    {
                        System.out.println(userInput + " is an English word ");
                        counter++;
                    }
                    System.out.println(userInput + " occurs " + counter + " times");
                }
        }
}   

This is my "Common English Word" text file:

the he at but there  of was be not use and for can 
a on have all each to are from where which in by
is with line when do you his had your how that they
it i word said if this what an as or we she their

2 answers:

Answer 0: (score: 4)

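You're on the right track, but you need to change the collection you store your data in. If you want a count for each word, then in Java that calls for a Map. A map associates a set of keys with a set of values, for example names with ages. In your case, you want to map words to counts.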
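
To make the idea concrete, here is a minimal sketch of a map from words to counts. The class name `MapBasics` and the sample words are made up for this illustration and are not part of the original program.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasics {
    // Builds a tiny word -> count map by hand to show put, get and containsKey.
    static Map<String, Integer> sample() {
        Map<String, Integer> wordCounts = new HashMap<>();
        wordCounts.put("is", 2);   // key "is" now maps to the value 2
        wordCounts.put("are", 1);
        wordCounts.put("is", 3);   // putting an existing key replaces its old value
        return wordCounts;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordCounts = sample();
        System.out.println(wordCounts.get("is"));            // prints 3
        System.out.println(wordCounts.containsKey("this"));  // prints false
        System.out.println(wordCounts.size());               // prints 2
    }
}
```

Each key appears at most once, which is why a map also solves the "only show each word once" part of the question.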

Start by declaring a map like this:

Map<String, Integer> wordCounts = new HashMap<>();

Then whenever you encounter a word, you can do something like:

if (!wordCounts.containsKey(word))
    wordCounts.put(word, 1);
else
    wordCounts.put(word, wordCounts.get(word) + 1);

If you are using Java 8, there is (arguably) more elegant syntax:

// compute passes both the key and the current value (null if absent) to the lambda
wordCounts.compute(word, (w, n) -> n == null ? 1 : n + 1);
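
Another Java 8 option, not mentioned in the original answer, is `Map.merge`, which handles the "absent or present" case without an explicit null check. The sketch below assumes a hypothetical helper class `MergeCount` and counts every word in the message, legal or not.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCount {
    // Counts how often each space-separated word occurs in the message.
    static Map<String, Integer> count(String message) {
        Map<String, Integer> wordCounts = new HashMap<>();
        for (String word : message.split(" ")) {
            // If the word is absent, store 1; otherwise add 1 to the stored count.
            wordCounts.merge(word, 1, Integer::sum);
        }
        return wordCounts;
    }
}
```

For the sample message `transformers are is is this`, this would give a count of 2 for `is` and 1 for each of the other words.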

Once you have processed all of the text, you can print out the counts:

for (String word : wordCounts.keySet()) {
    System.out.println("Word " + word + " occurred " + wordCounts.get(word) + " times");
}
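
Putting these pieces together with the HashSet from the question might look like the sketch below. It is only one possible arrangement: the class name `LegalWordCounter`, the method `countLegalWords`, and the hard-coded word set (standing in for the contents of example.txt) are assumptions made for this example. A `LinkedHashMap` is used so that words print in the order they were first seen; a plain `HashMap` would work too, but with no guaranteed order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class LegalWordCounter {
    // Counts only those words of the message that are in the legal word set.
    static Map<String, Integer> countLegalWords(Set<String> legalEnglishWords, String message) {
        Map<String, Integer> wordCounts = new LinkedHashMap<>();
        for (String userInput : message.split(" ")) {
            if (legalEnglishWords.contains(userInput)) {
                if (!wordCounts.containsKey(userInput))
                    wordCounts.put(userInput, 1);
                else
                    wordCounts.put(userInput, wordCounts.get(userInput) + 1);
            }
        }
        return wordCounts;
    }

    public static void main(String[] args) {
        // Stand-in for the words loaded from example.txt (assumption for this sketch).
        Set<String> legalEnglishWords = new HashSet<>(Arrays.asList("are", "is", "this", "the"));
        Map<String, Integer> wordCounts =
                countLegalWords(legalEnglishWords, "transformers are is is this");

        // Printing happens after the loop, so each word is shown exactly once.
        for (String word : wordCounts.keySet()) {
            System.out.println(word + " occurs " + wordCounts.get(word) + " times");
        }
        // prints:
        // are occurs 1 times
        // is occurs 2 times
        // this occurs 1 times
    }
}
```

The key change from the question's code is that counting happens inside the loop while printing is moved after it.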

Answer 1: (score: 1)

What you want to do is keep track of a count for each legal English word (as defined in example.txt). So you need a Map rather than a Set.

In pseudocode, it would look like:

Map<String, Integer> legalEnglishWords = new HashMap<>();
for each word in "example.txt" {
    legalEnglishWords.put(word, 0);
}

// handle input
for each word in inputMessage {
    if (legalEnglishWords.containsKey(word)) {
        legalEnglishWords.put(word, legalEnglishWords.get(word) + 1);
    }
    // or simply, instead of the if block above:
    // legalEnglishWords.computeIfPresent(word, (w, count) -> count + 1);
}

// At this point, you have each legal word and the number of times
// it appears in the input message.
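
The pseudocode above could be turned into runnable Java roughly as follows. This is a sketch under a few assumptions: the class name `LegalWordTally` and method `tally` are invented for illustration, and the legal words are passed in as a string rather than read from example.txt. Splitting on `"\\s+"` instead of `" "` avoids storing an empty string when the word file contains two spaces in a row.

```java
import java.util.HashMap;
import java.util.Map;

class LegalWordTally {
    // Pre-loads every legal word with a count of 0, then bumps the count
    // for each legal word found in the message.
    static Map<String, Integer> tally(String legalWordList, String message) {
        Map<String, Integer> legalEnglishWords = new HashMap<>();
        for (String word : legalWordList.split("\\s+")) {
            if (!word.isEmpty()) {
                legalEnglishWords.put(word, 0);
            }
        }
        for (String word : message.split("\\s+")) {
            // Does nothing when the word is not a key, so illegal words are ignored.
            legalEnglishWords.computeIfPresent(word, (w, count) -> count + 1);
        }
        return legalEnglishWords;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = tally("the is are this", "transformers are is is this");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Skip legal words that never appeared in the message.
            if (entry.getValue() > 0) {
                System.out.println(entry.getKey() + " occurs " + entry.getValue() + " times");
            }
        }
    }
}
```

With this approach the map always contains every legal word, so words with a count of 0 need to be filtered out when printing.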