Question

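I'm trying to understand how to put every word of a file into a HashSet. The method I'm writing should read a file and return the words found in it as a HashSet. I also have to use the split() method, but I can't figure out how to use it. I also have a normalize() method that converts all words to lowercase. This is how far I've got: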

public static HashSet<String> extractWordsFromDocument(String filename) {
    try {
       FileReader in = new FileReader(filename);
      Scanner file = new Scanner(in);
      while(file.hasNext( )){
        try {
          String line = file.nextLine();
          line = line.normalize();
          line = line.split();
          Set<String> words = new HashSet<String>();
          hashset.add(line);
          System.out.println(words);
        }
        catch (Exception e) {
        }
      }
    }
     catch (FileNotFoundException e) {
       System.out.println("Working Directory = " + System.getProperty("user.dir"));
    }
    return null;
  }

I know there are a lot of mistakes in this code. I'm just a beginner...

Answer 1

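You are creating the HashSet inside the loop, which means you create a new set for every line of the file, and each set contains only the words from that line.

Also, you can make better use of Scanner: its next() method gives you words separated by whitespace (spaces, tabs, line endings, etc.), which is the default delimiter.

Remember to close your resources. Since Java 7 you can use the try-with-resources statement.

Also, don't swallow your exceptions.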

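import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public static Set<String> extractWordsFromDocument(String filename) throws IOException {
    // try-with-resources closes the reader even if an exception is thrown
    try (Reader in = new FileReader(filename)) {
        // Create the set once, outside the loop, so it collects words from the whole file
        Set<String> words = new HashSet<>();
        Scanner scanner = new Scanner(in);
        while (scanner.hasNext()){
            words.add(scanner.next()); // next() returns one whitespace-delimited token
        }
        return words;
    }
}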
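
To see what the default delimiter does without needing a file, here is a small sketch that runs the same loop over an in-memory string. The class name ScannerTokenDemo and the method name tokens are made up for this illustration and are not part of the question or the answer above.

```java
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class ScannerTokenDemo {
    // Hypothetical helper: collects whitespace-delimited tokens from a string,
    // using the same hasNext()/next() loop as the file-based method.
    public static Set<String> tokens(String text) {
        Set<String> words = new HashSet<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                words.add(scanner.next());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Spaces, a tab, and a newline all act as separators;
        // the repeated "the" is stored only once because a Set drops duplicates.
        System.out.println(tokens("the cat\tsat\non the mat"));
    }
}
```

Note that this loop does no normalization: "The" and "the" would be stored as two different entries, and punctuation stays attached to the token it touches.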

If you want to understand how String or split() works, read the docs...
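
If split() is a hard requirement, as the question says, one possible line-by-line variant could look like the sketch below. It is only an illustration under some assumptions: the class name SplitWordsDemo is invented, normalize() here is a stand-in for the asker's own helper (assumed to just lowercase a word), and the method takes a Scanner so it can be tried on a string as well as on a file.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

public class SplitWordsDemo {
    // Assumed stand-in for the asker's normalize(): lowercases a single word.
    public static String normalize(String word) {
        return word.toLowerCase(Locale.ROOT);
    }

    // Reads line by line and uses String.split() to break each line into words.
    public static Set<String> extractWords(Scanner file) {
        Set<String> words = new HashSet<>(); // created once, before the loop
        while (file.hasNextLine()) {
            String line = file.nextLine();
            // "\\s+" is a regex matching one or more whitespace characters
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) { // a blank line yields one empty token
                    words.add(normalize(token));
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        try (Scanner file = new Scanner("Hello  World\n\nhello\tJava\n")) {
            System.out.println(extractWords(file));
        }
    }
}
```

split() returns a String[], not a String, which is why the result cannot be assigned back to line as in the question's code. Each element of the array has to be added to the set individually.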

Extracting the words of a document into a HashSet in Java
