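I have created a Java program in Eclipse. The program counts how often each word length occurs. For example, if the user enters "I went to the shop", the program produces the output '1 1 1 2': 1 word of length 1 ('I'), 1 word of length 2 ('to'), 1 word of length 3 ('the'), and 2 words of length 4 ('went', 'shop').
I wrote this program to read a string entered by the user, but I would like to adapt the code to read every line of a text file instead. Any help would be great.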
import java.util.Scanner;

public class WordLengthFrequency
{
    public static void main(String[] args)
    {
        Scanner scan = new Scanner(System.in);
        while (true)
        {
            System.out.println("Enter text: ");
            String input = scan.nextLine();
            // Replace every non-word character with a space
            String strippedInput = input.replaceAll("\\W", " ");
            System.out.println(strippedInput);
            // Split on runs of whitespace so repeated spaces do not yield empty words
            String[] strings = strippedInput.trim().split("\\s+");
            // counts[i] holds the number of words of length i (1 to 5)
            int[] counts = new int[6];
            int total = 0;
            for (String str : strings)
                if (str.length() < counts.length)
                    counts[str.length()] += 1;
            for (String s1 : strings)
                total += s1.length();
            for (int i = 1; i < counts.length; i++)
            {
                // Build a bar with one '*' per word of length i
                StringBuilder sb = new StringBuilder(i + " letter words: ");
                for (int j = 1; j <= counts[i]; j++)
                    sb.append('*');
                System.out.println(i + " letter words: " + counts[i]);
                System.out.println(sb);
            }
            System.out.println("mean length: " + ((double) total / strings.length));
        }
    }
}
Answer 0 (score: 0)
Scanner scan = new Scanner(System.in);
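This line creates a Scanner that reads from System.in, which is usually the console. You want to read from somewhere else, so you need to point the Scanner at the text you want.
This is easily done with
Scanner scan = new Scanner(new File("filePath"));
Note that this constructor throws FileNotFoundException, so you have to catch or declare it, and you need to import java.io.File.
You will also need to change your loop, because you can no longer loop forever with while (true): a file, unlike console input, eventually ends. Scanner has a handy method, hasNextLine(), that tells you whether there is another line to read (hasNext() does the same for the next token).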
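A minimal sketch of the change described above: the Scanner is pointed at a file and the loop runs only while hasNextLine() reports more input. The class name ScannerFileExample and the helper countWordLengths are made up for illustration, "filePath" is the placeholder from the answer, and the counting logic is adapted from the question's code rather than copied exactly.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class ScannerFileExample {
    // Hypothetical helper: tallies words of length 1 to 5, line by line,
    // mirroring the int[6] array used in the question.
    static int[] countWordLengths(Scanner scan) {
        int[] counts = new int[6];
        // Unlike while (true), this loop stops when the input is exhausted
        while (scan.hasNextLine()) {
            String line = scan.nextLine();
            // Replace non-word characters with spaces, then split on whitespace
            for (String word : line.replaceAll("\\W", " ").trim().split("\\s+")) {
                if (!word.isEmpty() && word.length() < counts.length) {
                    counts[word.length()]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // "filePath" is a placeholder; substitute the path of your text file
        try (Scanner scan = new Scanner(new File("filePath"))) {
            int[] counts = countWordLengths(scan);
            for (int i = 1; i < counts.length; i++) {
                System.out.println(i + " letter words: " + counts[i]);
            }
        } catch (FileNotFoundException e) {
            System.out.println("Could not find the file: " + e.getMessage());
        }
    }
}
```

Because the helper takes a Scanner, the same method works unchanged with new Scanner(System.in) or new Scanner("some text"), which makes it easy to try out without a file.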
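Answer 1 (score: 0)
First, a little code formatting can make a huge difference in readability. For reading the file, I suggest using a BufferedReader, and for the counts I suggest a HashMap. Right now, because you use an array with a fixed number of indices, you are limited in the word lengths you can track. With a map, you can track any number of word lengths. Something like the following should work well.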
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class WordLengthFrequency {
    public static void main(String[] args) {
        HashMap<Integer, Integer> lengthCount = new HashMap<Integer, Integer>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("text.txt"))) {
            String currentLine;
            int totalNumberWords = 0;
            // Reads the next line; readLine() returns null at the end of the file
            while ((currentLine = br.readLine()) != null) {
                String[] words = currentLine.split(" ");
                totalNumberWords += words.length;
                // Iterates through the words in the line and
                // increments the map appropriately
                for (String word : words) {
                    int currentNumber = 0;
                    if (lengthCount.get(word.length()) != null)
                        currentNumber = lengthCount.get(word.length());
                    lengthCount.put(word.length(), currentNumber + 1);
                }
            }
            // Iterates through the map and prints the number of strings
            // for each length and the percentage of words with each length
            for (Map.Entry<Integer, Integer> curEntry : lengthCount.entrySet()) {
                double percentWithThisLength = ((double) curEntry.getValue() / totalNumberWords) * 100;
                System.out.print(curEntry.getValue() + " string(s) with length " + curEntry.getKey());
                System.out.println(" (" + percentWithThisLength + "%)");
            }
        } catch (IOException e) {
            System.out.println("Could not read the specified file");
        }
    }
}
With text.txt containing:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Yay
this produces:
3 string(s) with length 2 (15.0%)
3 string(s) with length 3 (15.0%)
6 string(s) with length 5 (30.0%)
3 string(s) with length 6 (15.0%)
2 string(s) with length 7 (10.0%)
2 string(s) with length 10 (10.0%)
1 string(s) with length 11 (5.0%)
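Note that the output above counts attached punctuation as part of a word ("amet," has length 5). A possible variation, sketched here under stated assumptions rather than taken from the answer: strip non-word characters the way the question's code does, swap the HashMap for a TreeMap so lengths are guaranteed to print in ascending order, and use Map.merge to shorten the increment. The names WordLengthHistogram and countLengths are hypothetical, and the sample text is read from a StringReader instead of text.txt so the sketch runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

class WordLengthHistogram {
    // Hypothetical helper: tallies word lengths from any BufferedReader.
    // TreeMap keeps the keys (word lengths) sorted in ascending order.
    static Map<Integer, Integer> countLengths(BufferedReader reader) {
        Map<Integer, Integer> lengthCount = new TreeMap<>();
        reader.lines().forEach(line -> {
            // Replace non-word characters with spaces, as the question's code does
            for (String word : line.replaceAll("\\W", " ").split("\\s+")) {
                if (!word.isEmpty()) {
                    // merge inserts 1 for a new length, otherwise adds 1 to the count
                    lengthCount.merge(word.length(), 1, Integer::sum);
                }
            }
        });
        return lengthCount;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";
        // To read a real file, wrap a FileReader here instead of the StringReader
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            countLengths(reader).forEach((length, count) ->
                System.out.println(count + " word(s) with length " + length));
        }
    }
}
```

With this sample, "amet," and "elit." are both counted as 4-letter words, so the tallies differ from a plain split on spaces.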