Question

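I am writing a program that reads a file and counts the occurrences of specific words in that file.

I have the code working up to a point. I put the words I want in a String[]. The problem is that the program either counts the occurrences of every word in the file (including words I don't want counted), or counts the words in the String[] itself.

How can I get the program to count only the words in the file that match the words in my array? I have looked at many similar questions and tried using StringTokenizer and Lists, but couldn't get those to work fully either.

My goal is that if my file contains the text "yellow red blue white black purple blue", my output should be "red: 1, blue: 2, yellow: 1".

I just want a nudge in the right direction. I know it is probably something silly that I'm missing, and as always, any constructive feedback is appreciated.

This is my code so far: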

static String[] words = { "red", "blue", "yellow", "green" };

public static void main(String[] args) throws FileNotFoundException, IOException {

    System.out.println("This program will count the occurences of the specific words from a text file.");

    System.out.println("\nThe words to be counted are; red, blue, yellow, and green.\n");

    Map map = new HashMap();

    try (BufferedReader br = new BufferedReader(new FileReader("colours.txt"))) {

        StringBuilder sb = new StringBuilder();

        String line = br.readLine();

        while (line != null) {

            words = line.split(" "); // keeping this counts all words separated by whitespace, removing it counts words in my array instead of the file, so I'll get red: 1, blue: 1, yellow: 1 etc.,

            for (int i = 0; i < words.length; i++) {

                if (map.get(words[i]) == null) {

                    map.put(words[i], 1);
                }

                else {

                    int newValue = Integer.valueOf(String.valueOf(map.get(words[i])));

                    newValue++;

                    map.put(words[i], newValue);
                }

            }

            sb.append(System.lineSeparator());

            line = br.readLine();
        }
    }

    Map<String, String> sorted = new TreeMap<String, String>(map);

    for (Object key : sorted.keySet()) {

        System.out.println(key + ": " + map.get(key));
    }
}

Answer 1

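The main problem above is that when you split the line you have just read, you overwrite the initial array 'words'.

I wrote this (changing the variable names for my own understanding):

(Updated based on comments, thanks @shmosel)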

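// imports needed by this method (place them at the top of the enclosing class file)
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public static void main(String[] args) throws FileNotFoundException, IOException {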

    String[] keywords = {"red", "blue", "yellow", "green"};
    // a typed list makes it easy to check whether a word is one of the keywords
    List<String> keywordList = Arrays.asList(keywords);

    System.out.println("This program will count the occurrences of the specific words from a text file.");
    System.out.println("\nThe words to be counted are: " + keywordList + ".\n");

    Map<String, Integer> wordMap = new HashMap<>();

    try (BufferedReader br = new BufferedReader(new FileReader("/path/to/file/colours.txt"))) {
        // read a line
        String line = br.readLine();

        while (line != null) {
            // split the line into its own local array, so the keyword array is never overwritten
            String[] words = line.split(" ");

            for(String oneWord : words ){
                if( keywordList.contains(oneWord)){
                    // thanks @ shmosel for the improvement suggested in comments
                    wordMap.merge(oneWord, 1, Integer::sum);
                }
            }

            line = br.readLine();
        }
    }

    Map<String, Integer> sorted = new TreeMap<>(wordMap);

    // TreeMap iterates in sorted key order
    for (Map.Entry<String, Integer> entry : sorted.entrySet()) {
        System.out.println(entry.getKey() + ": " + entry.getValue());
    }
}
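
As a possible variation on the code above, here is a minimal sketch that keeps the keywords in a Set, splits on runs of whitespace rather than a single space, and accumulates straight into a TreeMap so no second sorted copy is needed. The class name KeywordCounter and the method countKeywords are made up for illustration and are not part of the original answer. It takes the lines as a List so it can be tried without a file.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper class for illustration only.
final class KeywordCounter {

    // Counts how often each keyword occurs in the given lines.
    // Keywords that never occur are simply absent from the result.
    static Map<String, Integer> countKeywords(List<String> lines, Set<String> keywords) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps keys sorted
        for (String line : lines) {
            // "\\s+" treats any run of spaces or tabs as one separator
            for (String word : line.trim().split("\\s+")) {
                if (keywords.contains(word)) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("red", "blue", "yellow", "green"));
        // For a real file, the lines could come from java.nio.file.Files.readAllLines(...)
        List<String> lines = Arrays.asList("yellow red blue white black purple blue");
        System.out.println(countKeywords(lines, keywords)); // prints {blue=2, red=1, yellow=1}
    }
}
```

Matching here is case-sensitive and does not strip punctuation, so "Blue" or "blue," would not be counted. Whether that is acceptable depends on the input file.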

Answer 2

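There are possibly two problems in the code:

1. The array 'words' is initially used to list the words you are interested in, but you then use the same array to hold the words from the line [see words = line.split(" ");]. So use a different array to hold the words from the line.
2. There is no check for whether a word from the line is present in the initial list. This check needs to be added. Also, keep in mind that a word can be repeated several times on the same line.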
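
The two points above can be sketched as follows. This is only an illustration under assumed names (TwoFixesDemo, isWanted, countLine and lineWords are hypothetical and do not appear in the question). It keeps the question's get/put counting style and works on in-memory lines instead of a file.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class illustrating the two fixes.
final class TwoFixesDemo {

    // The words of interest; this array is never reassigned.
    static final String[] WORDS = { "red", "blue", "yellow", "green" };

    // Returns true if the candidate is one of the words of interest.
    static boolean isWanted(String candidate) {
        for (String w : WORDS) {
            if (w.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    // Adds the matches found in one line to the running counts.
    static void countLine(String line, Map<String, Integer> counts) {
        String[] lineWords = line.split(" "); // fix 1: a separate array for the line
        for (String token : lineWords) {
            if (isWanted(token)) {            // fix 2: only count listed words
                Integer old = counts.get(token);
                counts.put(token, old == null ? 1 : old + 1);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        countLine("blue red blue", counts); // "blue" repeats on the same line
        countLine("white yellow", counts);
        System.out.println(counts);         // prints {blue=2, red=1, yellow=1}
    }
}
```

Because every token on a line is visited, a word that appears twice on one line is counted twice, which addresses the repetition caveat in the second point.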

Count the words in a file that match the words in a String[]
