Question

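The task I am working on first counts the number of files in a directory and then reports the word count for each file. I get the file count fine, but I am having a hard time converting some code my instructor gave me from a class that does a frequency count into a simpler word count. I also can't seem to find the right code to go through each file and count its words (I am trying to find something "generic" rather than specific, but I am testing the program with one specific text file). This is the expected output: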

Angular 4.4

However, this is what is actually output:

Count 11 files:
word length: 1 ==> 80
word length: 2 ==> 321
word length: 3 ==> 643

I am using two classes: WordCount and FileCatch8

WordCount:

primes.txt
but
are
sometimes
sense
refrigerator
make
haiku
dont
they
funny
word length: 1 ==> {but=1, are=1, sometimes=1, sense=1, refrigerator=1, make=1, haiku=1, dont=1, they=1, funny=1}

.....

Count 11 files:

And FileCatch:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Map;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

    /**
     *
     * @author 
     */
    public class WordCount {

        /**
         *
         * @param filename
         * @return
         * @throws java.io.IOException
         */
        public Map<String, Long> count(String filename) throws IOException {
            //Stream<String> lines = Files.lines(Paths.get(filename));
            Path path = Paths.get("haiku.txt");
            Map<String, Long> wordMap = Files.lines(path)
                    .parallel()
                    .flatMap(line -> Arrays.stream(line.trim().split(" ")))
                    .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
                    .filter(word -> word.length() > 0)
                    .map(word -> new SimpleEntry<>(word, 1))
                    //.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum));
                    .collect(groupingBy(SimpleEntry::getKey, counting()));

            wordMap.forEach((k, v) -> System.out.println(String.format(k,v)));
            return wordMap;
        }
    }

The program uses Java 8 streams with lambda syntax.

Answer 1

Word count example:

Files.lines(Paths.get(file))
    .flatMap(line -> Arrays.stream(line.trim().split(" ")))
    .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
    .filter(word -> !word.isEmpty())
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
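
For anyone who wants to run the pipeline above as-is, here is a minimal self-contained sketch. The class name WordFrequency and the method names frequencies and totalWords are made up for illustration and are not from the question. It assumes the file is UTF-8 text, splits on whitespace rather than a single space, and wraps Files.lines in try-with-resources so the file handle is closed. The simple word count the question asks for is then just the sum of the per-word frequencies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original question.
class WordFrequency {

    // Returns word -> number of occurrences for a single file.
    static Map<String, Long> frequencies(Path file) throws IOException {
        // try-with-resources closes the file opened by Files.lines
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                    // strip everything except ASCII letters, then normalise case
                    .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase())
                    .filter(word -> !word.isEmpty())
                    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        }
    }

    // The plain word count is the sum of all per-word frequencies.
    static long totalWords(Map<String, Long> frequencies) {
        return frequencies.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) throws IOException {
        // Assumes the path of a text file is passed as the first argument.
        Map<String, Long> freq = frequencies(Paths.get(args[0]));
        System.out.println(freq);
        System.out.println("Total words: " + totalWords(freq));
    }
}
```

If only the total is needed, the groupingBy step can be replaced by count() on the filtered stream.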

File count:

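// Files.walk also returns directories (including the start path), so keep regular files only.
Files.walk(Paths.get(file), Integer.MAX_VALUE).filter(Files::isRegularFile).count();
// Equivalent: without a depth argument, Files.walk visits all levels.
Files.walk(Paths.get(file)).filter(Files::isRegularFile).count();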

Answer 2

In my opinion, the simplest way to count the words in a file with Java 8 is:

long wordsCount = Files.lines(Paths.get(file))
    .flatMap(str -> Stream.of(str.split("[ ,.!?\r\n]")))
    .filter(s -> !s.isEmpty())
    .count();
System.out.println(wordsCount);

And to count all the files:

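// Filter out directories; Files.walk returns them too, including the start path.
long filesCount = Files.walk(Paths.get(file)).filter(Files::isRegularFile).count();
System.out.println(filesCount);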
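
Putting the two pieces together for the task in the question (count the files in a directory, then print a word count per file) could look roughly like the sketch below. The class name DirectoryWordCount and its method names are hypothetical, and the output format is only a guess at what the assignment wants. It assumes every regular file under the directory is UTF-8 text; a binary file would make Files.lines throw.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original question.
class DirectoryWordCount {

    // Counts the words in one file, splitting on spaces and basic punctuation.
    static long countWords(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Stream.of(line.split("[ ,.!?\r\n]")))
                    .filter(s -> !s.isEmpty())
                    .count();
        }
    }

    // Lists regular files only; Files.walk would otherwise include directories.
    static List<Path> regularFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    // Maps each file (path relative to dir) to its word count, sorted by name.
    static Map<String, Long> countAll(Path dir) throws IOException {
        Map<String, Long> result = new TreeMap<>();
        for (Path file : regularFiles(dir)) {
            result.put(dir.relativize(file).toString(), countWords(file));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the directory is passed as the first argument; defaults to the current one.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Long> counts = countAll(dir);
        System.out.println("Count " + counts.size() + " files:");
        counts.forEach((name, words) -> System.out.println(name + " ==> " + words));
    }
}
```

A plain for loop is used for the per-file step because countWords throws a checked IOException, which is awkward to call from inside a lambda.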

How to count words in a text file, Java 8 style
