Question

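I have created a Java program in Eclipse. The program counts how many words of each length appear in the input. For example, if the user enters "I went to the shop", the program produces the output '1 1 1 2': 1 word of length 1 ('I'), 1 word of length 2 ('to'), 1 word of length 3 ('the'), and 2 words of length 4 ('went', 'shop').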

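I wrote this program to read a string entered by the user, but I would like to adapt the code so that it reads each line of a text file instead. Any help would be great.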

import java.util.Scanner;

public class WordLengthFrequency
{

    public static void main(String[] args)
    {
        Scanner scan = new Scanner(System.in);

        while (true)
        {
            System.out.println("Enter text: ");

            String s;
            s = scan.nextLine();
            String input = s;
            String strippedInput = input.replaceAll("\\W", " ");

            System.out.println("" + strippedInput);

            String[] strings = strippedInput.split(" ");
            int[] counts = new int[6];
            int total = 0;
            for (String str : strings)
                if (str.length() < counts.length)
                    counts[str.length()] += 1;
            for (String s1 : strings)
                total += s1.length();   
            for (int i = 1; i < counts.length; i++) {
                // Build a bar with one '*' per word of length i
                StringBuilder sb = new StringBuilder(i + " letter words: ");
                for (int j = 1; j <= counts[i]; j++) {
                    sb.append('*');
                }
                // Print once per length, after the bar is complete
                System.out.println(i + " letter words: " + counts[i]);
                System.out.println(sb);
            }
            System.out.println("mean length: " + ((double) total / strings.length));
       }
    }
}

Answer 1

Scanner scan = new Scanner(System.in);

This line creates a Scanner that reads from System.in, which is normally the console. You want to read from somewhere else, so you need to point the Scanner at the text you want.

This is easily done with:

Scanner scan = new Scanner(new File("filePath"));

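You will also need to change your loop, because you can no longer loop forever (a file, unlike console input, eventually ends). Scanner has a nice little method, hasNext(), that tells you whether there is more input to read; when reading line by line, hasNextLine() is the matching check. Note that the Scanner(File) constructor throws FileNotFoundException, so main must either declare or catch it.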
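As a minimal sketch of the change described above, assuming the same `\\W` stripping and the same six-slot `counts` array as the question: the class name `WordLengthFromFile`, the helper `countLengths`, and the path `input.txt` are placeholders introduced here for illustration, not part of the original code.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class WordLengthFromFile {

    // Index i of the result holds the number of words of length i.
    // Words with maxLength letters or more are ignored, as in the question's code.
    static int[] countLengths(Scanner scan, int maxLength) {
        int[] counts = new int[maxLength];
        // hasNextLine() becomes false at the end of the file, which ends the loop
        while (scan.hasNextLine()) {
            String line = scan.nextLine();
            // Replace non-word characters with spaces, then split on runs of whitespace
            String[] words = line.replaceAll("\\W", " ").trim().split("\\s+");
            for (String word : words) {
                if (!word.isEmpty() && word.length() < counts.length) {
                    counts[word.length()]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "input.txt" is a placeholder; pass a real path as the first argument
        String path = args.length > 0 ? args[0] : "input.txt";
        try (Scanner scan = new Scanner(new File(path))) {
            int[] counts = countLengths(scan, 6);
            for (int i = 1; i < counts.length; i++) {
                System.out.println(i + " letter words: " + counts[i]);
            }
        }
    }
}
```

Because `countLengths` takes a Scanner rather than a file name, the same method also works with `new Scanner(System.in)` or `new Scanner("some text")`, which makes it easy to try without a file.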

Answer 2

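First of all, a little code formatting can make a huge difference in readability. For reading files, I would suggest a BufferedReader, and for the counts I would suggest a HashMap. At the moment, because you are using an array with a fixed number of indices, you are limited in the word lengths you can track. With a map, you can track any number of word lengths. Something like the following would work well.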

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class WordLengthFrequency {
    public static void main(String[] args) {
        // Maps a word length to the number of words with that length
        Map<Integer, Integer> lengthCount = new HashMap<Integer, Integer>();
        int totalNumberWords = 0;

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("text.txt"))) {
            String currentLine;

            // readLine() returns null at the end of the file, which ends the loop
            while ((currentLine = br.readLine()) != null) {
                String[] words = currentLine.split(" ");
                totalNumberWords += words.length;

                // Iterates through the words in the line and
                // increments the map appropriately
                for (String word : words) {
                    int currentNumber = 0;
                    if (lengthCount.get(word.length()) != null)
                        currentNumber = lengthCount.get(word.length());
                    lengthCount.put(word.length(), currentNumber + 1);
                }
            }
        } catch (IOException e) {
            System.out.println("Could not read the specified file: " + e.getMessage());
            return;
        }

        // Iterates through the map and prints the number of strings
        // for each length and the percentage of words with each length
        for (Map.Entry<Integer, Integer> curEntry : lengthCount.entrySet()) {
            double percentWithThisLength = ((double) curEntry.getValue() / totalNumberWords) * 100;
            System.out.print(curEntry.getValue() + " string(s) with length " + curEntry.getKey());
            System.out.println(" (" + percentWithThisLength + "%)");
        }
    }
}

With text.txt containing:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Yay

this produces:

3 string(s) with length 2 (15.0%)
3 string(s) with length 3 (15.0%)
6 string(s) with length 5 (30.0%)
3 string(s) with length 6 (15.0%)
2 string(s) with length 7 (10.0%)
2 string(s) with length 10 (10.0%)
1 string(s) with length 11 (5.0%)
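
Note that splitting on a single space keeps punctuation attached to the words, so "amet," is counted with length 5 in the output above. A possible variation, sketched here under the assumption that the `\\W` stripping from the question is wanted, uses a TreeMap so the lengths come out sorted and `Map.merge` to increment the counts. The class name `LengthHistogram` and the method `countLengths` are hypothetical names chosen for this sketch.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

class LengthHistogram {

    // Returns a map from word length to number of words, sorted by length.
    static Map<Integer, Integer> countLengths(Iterable<String> lines) {
        Map<Integer, Integer> lengthCount = new TreeMap<>();
        for (String line : lines) {
            // Same stripping as the question: non-word characters become spaces
            for (String word : line.replaceAll("\\W", " ").split("\\s+")) {
                if (!word.isEmpty()) {
                    // merge adds 1 to the existing count, or stores 1 if the key is absent
                    lengthCount.merge(word.length(), 1, Integer::sum);
                }
            }
        }
        return lengthCount;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts =
                countLengths(Arrays.asList("Lorem ipsum dolor sit amet, consectetur"));
        System.out.println(counts); // prints {3=1, 4=1, 5=3, 11=1}
    }
}
```

To feed it a file, the lines could come from `Files.readAllLines(Paths.get("text.txt"))` in `java.nio.file`, which returns a `List<String>`.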

Word length frequency
