How to count files and count the words in a set of files using Java 8 and Streams

Time: 2017-12-09 20:50:12

Tags: java file lambda java-8

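I'm stuck. I'm trying to complete an assignment that visits a directory of files and counts them, then reads each file and counts the words in it. This is a continuation of a question I posted earlier, but the "answers" there did not solve my problem at all (How to count words in a text file, java 8-style).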

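Here is the problem outline:

Write a program that uses streams to efficiently count how many words of each length occur in a set of files (files.zip). Your output should look like this (the counts are for illustration only):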

Count 11 files:
word length: 1 ==> 80
word length: 2 ==> 321
word length: 3 ==> 643

Instead, I got the following output:

primes.txt
Count: 1 files

Here is the code I wrote. I used two classes. FileReader is the main class, which reads the directory named "Files":

FileReader.java

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;

    /*
     * To change this license header, choose License Headers in Project Properties.
     * To change this template file, choose Tools | Templates
     * and open the template in the editor.
     */

    /**
     *
     * @author 
     */
    public class FileReader {

        public static void main(String args[]) {
            List<String> fileNames = new ArrayList<>();
            try {
                DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get("files"));
                int fileCounter = 0;
                **WordReader wordCnt = new WordReader();**
                for (Path path : directoryStream) {
                    System.out.println(path.getFileName());
                    fileCounter++;
                    fileNames.add(path.getFileName().toString());
                    **System.out.println("word length: " + fileCounter + " ==> "
                            + wordCnt.count(path.getFileName().toString()));**
                }
            } catch (IOException ex) {
            }
            System.out.println("Count: " + fileNames.size() + " files");

        }
    }

In theory, the WordReader class should count the words in each file in the directory. The class is written using lambda syntax:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.AbstractMap.SimpleEntry;
    import java.util.Arrays;
    import java.util.Map;
    import static java.util.stream.Collectors.counting;
    import static java.util.stream.Collectors.groupingBy;

        /**
         *
         * @author 
         */
        public class WordReader {

            /**
             *
             * @param filename
             * @return
             * @throws java.io.IOException
             */
            public Map<String, Long> count(String filename) throws IOException {
                //Stream<String> lines = Files.lines(Paths.get(filename));
                Path path = Paths.get(":");
                Map<String, Long> wordMap = Files.lines(path)
                        .parallel()
                        .flatMap(line -> Arrays.stream(line.trim().split(" ")))
                        .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
                        .filter(word -> word.length() > 0)
                        .map(word -> new SimpleEntry<>(word, 1))
                        //.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum));
                        .collect(groupingBy(SimpleEntry::getKey, counting()));

                wordMap.forEach((k, v) -> System.out.println(String.format(k,v)));
                return wordMap;
            }
        }

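I think the call to the WordReader class (highlighted in bold) stops the counter, but I don't know how to fix it. I tried moving the call out of the for loop, without success. If I comment those lines out, the file counter runs fine. Does anyone know how I can make this program "walk (count the files) and chew gum (count the words in the files)" at the same time?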

1 Answer:

Answer 0: (score: 0)

Here are some of the mistakes you made:

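  • You pass only the file name to count(). Because the files live in a directory, it is better to pass the whole path.
  • count() opens the path ":", which is not even a valid file name, instead of the file it was given!
  • You never log the exception that is raised, which hides the real problem: the empty catch block swallows the exception from the first count() call, so the loop ends after one file.
  • You are using lambdas while still struggling with much of the Java language.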
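To illustrate the first point, here is a small sketch of what is lost when only the file name is passed on. The class name PathDemo and the helper bareName are hypothetical and exist only for this illustration; the entry is built by hand to look like one that a DirectoryStream over "files" would return.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class, not part of the original code.
public class PathDemo {

    // What the question's loop handed to count(): the file name without its directory.
    public static String bareName(Path entry) {
        return entry.getFileName().toString();
    }

    public static void main(String[] args) {
        // An entry shaped like the ones DirectoryStream yields for the directory "files"
        Path entry = Paths.get("files", "primes.txt");

        // The bare name has no parent, so it would be looked up in the working directory
        Path nameOnly = Paths.get(bareName(entry));
        System.out.println(nameOnly + " has parent: " + (nameOnly.getParent() != null));

        // The entry itself still carries the directory part and can be opened directly
        System.out.println(entry + " has parent: " + (entry.getParent() != null));
    }
}
```

Passing the Path from the loop straight through avoids rebuilding the location from a string.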

This should work:

Main.java:

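import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<String> fileNames = new ArrayList<>();
        // try-with-resources closes the directory stream once the loop is done
        try (DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get("files"))) {
            int fileCounter = 0;
            WordReader wordCnt = new WordReader();
            for (Path path : directoryStream) {
                System.out.println(path.getFileName());
                fileCounter++;
                fileNames.add(path.getFileName().toString());
                // Pass the full path (directory plus file name), not just the file name
                System.out.println("word length: " + fileCounter + " ==> " + wordCnt.count(path));
            }
        } catch (IOException ex) {
            // Print the exception instead of silently swallowing it
            ex.printStackTrace();
        }
        System.out.println("Count: " + fileNames.size() + " files");
    }
}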

WordReader.java:

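import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordReader {
    // Counts how often each word occurs in the given file.
    public Map<String, Long> count(Path filePath) throws IOException {
        // try-with-resources closes the file once the stream has been consumed
        try (Stream<String> lines = Files.lines(filePath)) {
            Map<String, Long> wordMap = lines
                    .flatMap(line -> Arrays.stream(line.trim().split(" ")))
                    .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
                    .filter(word -> word.length() > 0)
                    // counting() produces Long values, so the map must be Map<String, Long>
                    .collect(Collectors.groupingBy(s -> s, Collectors.counting()));
            wordMap.forEach((k, v) -> System.out.println(k + " ==> " + v));
            return wordMap;
        }
    }
}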
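
Note that WordReader counts how often each word occurs, while the assignment asks for totals per word length across all files. Below is a minimal sketch of that variant, not a definitive solution. The class name WordLengthCounter and the helper readLines are hypothetical, and the sketch assumes the directory is named files and contains UTF-8 text files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name; not part of the original question or answer.
public class WordLengthCounter {

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Groups the words found in the given lines by their length and counts each group.
    public static Map<Integer, Long> countByLength(Stream<String> lines) {
        return lines
                .flatMap(WHITESPACE::splitAsStream)
                .map(word -> word.replaceAll("[^a-zA-Z]", ""))
                .filter(word -> !word.isEmpty())
                // TreeMap keeps the lengths in ascending order for printing
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.counting()));
    }

    // Reads all lines of one file eagerly; assumes UTF-8 text (hypothetical helper).
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) throws IOException {
        List<Path> files;
        // Files.list returns a stream backed by an open directory, so close it
        try (Stream<Path> listing = Files.list(Paths.get("files"))) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        // One combined stream of lines from every file, counted in a single pass
        Map<Integer, Long> counts =
                countByLength(files.stream().flatMap(f -> readLines(f).stream()));

        System.out.println("Count " + files.size() + " files:");
        counts.forEach((length, count) ->
                System.out.println("word length: " + length + " ==> " + count));
    }
}
```

Keeping countByLength free of file access makes it easy to try on in-memory lines first, for example countByLength(Stream.of("a bb cc")), before pointing it at the real directory.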