Question

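I am trying to read the words of a file into a stream and count how many times the word "the" appears in the file. I can't seem to find an efficient way to do this using only streams.

Example: if the file contains a sentence such as "The boy jumped over the river.", the output would be 2.

This is what I have tried so far: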

public static void main(String[] args){

    String filename = "input1";
    try (Stream<String> words = Files.lines(Paths.get(filename))){
        long count = words.filter( w -> w.equalsIgnoreCase("the"))
                .count();
        System.out.println(count);
    } catch (IOException e){

    }
}

Answer 1

You can use Java's StreamTokenizer for this purpose.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StreamTokenizer;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class Main {

    public static void main(String[] args) throws IOException {
        long theWordCount = 0;
        String input = "The boy jumped over the river.";
        // Wrap the sample text in a stream; a FileInputStream would work the same way.
        try (InputStream stream = new ByteArrayInputStream(
                input.getBytes(StandardCharsets.UTF_8))) {
            // Decode with the same charset that was used to encode the bytes.
            StreamTokenizer tokenizer = new StreamTokenizer(
                    new InputStreamReader(stream, StandardCharsets.UTF_8));
            int tokenType;
            while ((tokenType = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                // For word tokens, sval holds the word's text.
                if (tokenType == StreamTokenizer.TT_WORD
                        && "the".equalsIgnoreCase(tokenizer.sval)) {
                    theWordCount++;
                }
            }
        }
        System.out.println("The word 'the' count is: " + theWordCount);
    }
}
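
One caveat with the default StreamTokenizer settings: '.' and '-' count as number characters, and number characters may continue a word, so "river." comes back as the single token "river." and a sentence ending in "the." would not be counted. A minimal sketch of one way around that, assuming it is acceptable to treat those two characters as plain separators (the class name TokenizerWordCount and the helper countWord are made up for illustration, they are not part of the answer above):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class TokenizerWordCount {

    // Hypothetical helper: counts case-insensitive occurrences of target.
    static long countWord(Reader reader, String target) {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        // Assumption: '.' and '-' should split words instead of sticking to them.
        tokenizer.ordinaryChar('.');
        tokenizer.ordinaryChar('-');
        long count = 0;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD
                        && target.equalsIgnoreCase(tokenizer.sval)) {
                    count++;
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Both occurrences are found, including the one right before a period.
        System.out.println(countWord(new StringReader("Over the river. I saw the."), "the")); // prints 2
    }
}
```

To read from a file instead, a reader from Files.newBufferedReader(Paths.get(filename)) can be passed in. Note that the tokenizer still treats quote characters and '/' specially by default, so this is only a sketch for plain prose.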

Answer 2

As the name suggests, Files.lines returns a stream of lines, not words. If you want to iterate over words, you could use something like a Scanner:

Scanner sc = new Scanner(new File(fileLocation));
while(sc.hasNext()){
    String word = sc.next();
    //handle word
}

If you really want to use streams, you can split each line and then map your stream to those words:

try (Stream<String> lines = Files.lines(Paths.get(filename))){
    long count = lines
            .flatMap(line->Arrays.stream(line.split("\\s+"))) //add this
            .filter( w -> w.equalsIgnoreCase("the"))
            .count();
    System.out.println(count);
} catch (IOException e){
    e.printStackTrace(); // at least print the exception so you know what went wrong
}

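By the way, you shouldn't leave catch blocks empty. At least print the exception that was thrown, so that you get more information about the problem.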
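
Splitting on whitespace leaves punctuation attached, so a token like "the." or "(the" would not match. A minimal sketch of a variant that splits on runs of non-letter characters instead, assuming that a "word" means a run of letters (the class WordCounter and its methods are illustrative names, not from the question):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class WordCounter {

    // Assumption: anything that is not a letter separates words.
    private static final Pattern NON_LETTERS = Pattern.compile("[^\\p{L}]+");

    // Hypothetical helper: counts case-insensitive occurrences of target in a stream of lines.
    static long countWord(Stream<String> lines, String target) {
        return lines
                .flatMap(NON_LETTERS::splitAsStream) // one line -> many words
                .filter(word -> word.equalsIgnoreCase(target))
                .count();
    }

    // Same count for a file; try-with-resources closes the underlying file handle.
    static long countWordInFile(Path file, String target) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return countWord(lines, target);
        }
    }

    public static void main(String[] args) {
        System.out.println(countWord(Stream.of("The boy jumped over the river."), "the")); // prints 2
    }
}
```

Keeping the counting logic separate from the file handling makes it easy to try on in-memory lines first; countWordInFile(Paths.get(filename), "the") would then cover the case from the question.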

Answer 3

Use a stream reader to count the words.
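
This answer comes without code, so what follows is only one possible reading of it. Assuming Java 9 or later, a Scanner wrapped around a reader can expose its tokens directly as a stream via Scanner.tokens(). The class name ScannerWordCount, the helper countWord, and the choice of delimiter are assumptions for illustration:

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

public class ScannerWordCount {

    // Hypothetical helper: counts case-insensitive occurrences of target.
    static long countWord(Reader reader, String target) {
        // Closing the scanner also closes the reader it wraps.
        try (Scanner scanner = new Scanner(reader)) {
            return scanner
                    .useDelimiter("[^\\p{L}]+") // assumption: non-letters separate words
                    .tokens()                   // Stream<String> of words (Java 9+)
                    .filter(word -> word.equalsIgnoreCase(target))
                    .count();
        }
    }

    public static void main(String[] args) {
        System.out.println(countWord(new StringReader("The boy jumped over the river."), "the")); // prints 2
    }
}
```

For a file, the reader could come from Files.newBufferedReader(Paths.get(filename)).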
