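Using Java 8 Stream

Time: 2016-10-10 06:57:03

Tags: java file java-8 java-stream

Java 8 has a way to create a Stream from the lines of a file. In that case, forEach steps through the lines one at a time. I have a text file with the following format: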

bunch of lines with text
$$$$
bunch of lines with text
$$$$

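I need all the lines before each $$$$ to end up in a single element of the Stream.

In other words, I need a Stream of Strings, where each String contains the content that precedes a $$$$.

What is the best way (with minimal overhead) to do this?

5 answers:

Answer 0: (score: 2)

I couldn't come up with a solution that processes the lines lazily. I'm not sure whether that is possible.

My solution produces an ArrayList. If you must have a Stream, just call stream() on it.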

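import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

public class DelimitedFile {
    public static void main(String[] args) throws IOException {
        List<String> lines = lines(Paths.get("delimited.txt"), "$$$$");
        for (int i = 0; i < lines.size(); i++) {
            System.out.printf("%d:%n%s%n", i, lines.get(i));
        }
    }

    public static List<String> lines(Path path, String delimiter) throws IOException {
        // Files.lines keeps the file open until the stream is closed
        try (Stream<String> stream = Files.lines(path)) {
            // The accumulator is stateful, so this only works on a sequential stream
            return stream.collect(ArrayList::new, new BiConsumer<ArrayList<String>, String>() {
                // true when the next non-delimiter line starts a new chunk
                boolean add = true;

                @Override
                public void accept(ArrayList<String> lines, String line) {
                    if (delimiter.equals(line)) {
                        add = true;
                    } else {
                        if (add) {
                            lines.add(line);
                            add = false;
                        } else {
                            // Append the line to the chunk that is currently being built
                            int i = lines.size() - 1;
                            lines.set(i, lines.get(i) + '\n' + line);
                        }
                    }
                }
            }, ArrayList::addAll);
        }
    }
}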

File contents:

bunch of lines with text
bunch of lines with text2
bunch of lines with text3
$$$$
2bunch of lines with text
2bunch of lines with text2
$$$$
3bunch of lines with text
3bunch of lines with text2
3bunch of lines with text3
3bunch of lines with text4
$$$$

Output:

0:
bunch of lines with text
bunch of lines with text2
bunch of lines with text3
1:
2bunch of lines with text
2bunch of lines with text2
2:
3bunch of lines with text
3bunch of lines with text2
3bunch of lines with text3
3bunch of lines with text4

Edit

I finally came up with a solution that generates the Stream lazily:

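import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public static Stream<String> lines(Path path, String delimiter) throws IOException {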
    Stream<String> lines = Files.lines(path);
    Iterator<String> iterator = lines.iterator();
    return StreamSupport.stream(Spliterators.spliteratorUnknownSize(new Iterator<String>() {
        String nextLine;

        @Override
        public boolean hasNext() {
            if (nextLine != null) {
                return true;
            }
            while (iterator.hasNext()) {
                String line = iterator.next();
                if (!delimiter.equals(line)) {
                    nextLine = line;
                    return true;
                }
            }
            lines.close();
            return false;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            StringBuilder sb = new StringBuilder(nextLine);
            nextLine = null;
            while (iterator.hasNext()) {
                String line = iterator.next();
                if (delimiter.equals(line)) {
                    break;
                }
                sb.append('\n').append(line);
            }
            return sb.toString();
        }
    }, Spliterator.ORDERED | Spliterator.NONNULL | Spliterator.IMMUTABLE), false);
}

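This is actually (and coincidentally) very similar to the implementation of BufferedReader.lines(), which Files.lines(Path) uses internally. It may reduce overhead to skip both of these methods and use Files.newBufferedReader(Path) and BufferedReader.readLine() directly.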
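The idea of reading with BufferedReader.readLine() directly could look roughly like the sketch below. It is not the answer author's code: the class name DelimitedReader and the method name chunks are hypothetical, and the sketch assumes that empty chunks (two delimiters in a row) should be skipped, as in the lazy version above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class DelimitedReader {
    // Hypothetical helper: lazily yields one String per chunk, reading with readLine().
    public static Stream<String> chunks(Path path, String delimiter) throws IOException {
        BufferedReader reader = Files.newBufferedReader(path);
        Iterator<String> it = new Iterator<String>() {
            String chunk; // the next chunk to hand out, or null if none is buffered

            @Override
            public boolean hasNext() {
                if (chunk != null) return true;
                try {
                    StringBuilder sb = null;
                    String line;
                    while ((line = reader.readLine()) != null) {
                        if (delimiter.equals(line)) {
                            if (sb != null) break; // the current chunk is complete
                            continue;              // nothing collected yet: skip the empty chunk
                        }
                        if (sb == null) sb = new StringBuilder(line);
                        else sb.append('\n').append(line);
                    }
                    if (sb == null) return false;  // end of file and nothing collected
                    chunk = sb.toString();
                    return true;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                String result = chunk;
                chunk = null;
                return result;
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED | Spliterator.NONNULL), false)
            // Closing the stream closes the underlying reader
            .onClose(() -> {
                try {
                    reader.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
    }
}
```

The returned Stream should be used in a try-with-resources block so that the reader is closed even when the stream is not consumed to the end.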

Answer 1: (score: 0)

You can try:

    List<String> list = new ArrayList<>();
    try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
        // Drops the delimiter lines; the remaining lines stay separate elements
        list = stream
                .filter(line -> !line.equals("$$$$"))
                .collect(Collectors.toList());
    } catch (IOException e) {
        e.printStackTrace();
    }

Answer 2: (score: 0)

A similar, shorter answer already exists, but the following is type-safe and needs no additional state:

    Path path = Paths.get("... .txt");
    try {
        List<StringBuilder> glist = Files.lines(path, StandardCharsets.UTF_8)
                .collect(() -> new ArrayList<StringBuilder>(),
                        (list, line) -> {
                            if (list.isEmpty() || list.get(list.size() - 1).toString().endsWith("$$$$\n")) {
                                list.add(new StringBuilder());
                            }
                            list.get(list.size() - 1).append(line).append('\n');
                        },
                        (list1, list2) -> {
                            if (!list1.isEmpty() && !list1.get(list1.size() - 1).toString().endsWith("$$$$\n")
                                    && !list2.isEmpty()) {
                                // Merge last of list1 and first of list2:
                                list1.get(list1.size() - 1).append(list2.remove(0).toString());
                            }
                            list1.addAll(list2);
                        });
        glist.forEach(sb -> System.out.printf("------------------%n%s%n", sb));
    } catch (IOException ex) {
        Logger.getLogger(App.class.getName()).log(Level.SEVERE, null, ex);
    }

Instead of .endsWith("$$$$\n"), it would be better to use:

.matches("(?s)(.*\n)?\\$\\$\\$\\$\n")
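The difference between the two checks only shows up when a line merely ends with $$$$ instead of consisting of it. The sketch below illustrates this under one assumption worth stating: String.matches and Matcher.matches must match the whole input, so the pattern has to allow for the text that precedes the delimiter line. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class DelimiterCheck {
    // (?s) lets '.' match line breaks; the optional group covers all earlier lines of the chunk.
    private static final Pattern ENDS_WITH_DELIMITER_LINE =
            Pattern.compile("(?s)(.*\n)?\\$\\$\\$\\$\n");

    // Hypothetical helper: true if the last line of the chunk is exactly "$$$$".
    public static boolean endsWithDelimiterLine(CharSequence chunk) {
        return ENDS_WITH_DELIMITER_LINE.matcher(chunk).matches();
    }

    public static void main(String[] args) {
        System.out.println(endsWithDelimiterLine("a\nb\n$$$$\n"));    // true
        System.out.println(endsWithDelimiterLine("$$$$\n"));          // true
        System.out.println(endsWithDelimiterLine("price: 5$$$$\n"));  // false
        System.out.println("price: 5$$$$\n".endsWith("$$$$\n"));      // true (false positive)
    }
}
```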

Answer 3: (score: 0)

Here is a solution based on this previous work:

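import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ChunkSpliterator extends Spliterators.AbstractSpliterator<List<String>> {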
    private final Spliterator<String> source;
    private final Predicate<String> delimiter;
    private final Consumer<String> getChunk;
    private List<String> current;

    ChunkSpliterator(Spliterator<String> lineSpliterator, Predicate<String> mark) {
        super(lineSpliterator.estimateSize(), ORDERED|NONNULL);
        source=lineSpliterator;
        delimiter=mark;
        getChunk=s -> {
            if(current==null) current=new ArrayList<>();
            current.add(s);
        };
    }
    public boolean tryAdvance(Consumer<? super List<String>> action) {
        while(current==null || !delimiter.test(current.get(current.size()-1)))
            if(!source.tryAdvance(getChunk)) return lastChunk(action);
        current.remove(current.size()-1);
        action.accept(current);
        current=null;
        return true;
    }
    private boolean lastChunk(Consumer<? super List<String>> action) {
        if(current==null) return false;
        action.accept(current);
        current=null;
        return true;
    }

    public static Stream<List<String>> toChunks(
        Stream<String> lines, Predicate<String> splitAt, boolean parallel) {
        return StreamSupport.stream(
            new ChunkSpliterator(lines.spliterator(), splitAt),
            parallel);
    }
}

You can use it like this:

try(Stream<String> lines=Files.lines(pathToYourFile)) {
    ChunkSpliterator.toChunks(
        lines,
        Pattern.compile("^\\Q$$$$\\E$").asPredicate(),
        false)
    /* chain your stream operations, e.g.
    .forEach(s -> { s.forEach(System.out::print); System.out.println(); })
     */;
}

Answer 4: (score: 0)

You can use a Scanner as an iterator and create a stream from it:

private static Stream<String> recordStreamOf(Readable source) {
    Scanner scanner = new Scanner(source);
    // useDelimiter takes a regular expression, so the literal "$$$$" must be quoted
    scanner.useDelimiter(Pattern.quote("$$$$"));
    return StreamSupport
        .stream(Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED | Spliterator.NONNULL), false)
        .onClose(scanner::close);
}

This preserves the line breaks inside each chunk for further filtering or splitting.
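Because the line breaks around the delimiter stay in the chunks, a caller will probably want to trim them and drop the empty remainder after the last $$$$. A minimal sketch of that follow-up step is shown below. It assumes the delimiter is passed through Pattern.quote, since Scanner.useDelimiter interprets its argument as a regular expression; the class name ScannerChunks and the helper trimmedRecords are hypothetical.

```java
import java.io.StringReader;
import java.util.List;
import java.util.Scanner;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ScannerChunks {
    // Scanner implements Iterator<String>, so it can back a Stream directly.
    public static Stream<String> recordStreamOf(Readable source) {
        Scanner scanner = new Scanner(source);
        scanner.useDelimiter(Pattern.quote("$$$$")); // match "$$$$" literally
        return StreamSupport
            .stream(Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED | Spliterator.NONNULL), false)
            .onClose(scanner::close);
    }

    // Hypothetical helper: strips the surrounding line breaks and drops empty chunks.
    public static List<String> trimmedRecords(Readable source) {
        try (Stream<String> records = recordStreamOf(source)) {
            return records
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        String text = "a\nb\n$$$$\nc\n$$$$\n";
        // Each element is one chunk; inner line breaks are kept
        trimmedRecords(new StringReader(text)).forEach(System.out::println);
    }
}
```

Note that String.trim also removes leading and trailing spaces of a chunk, which may or may not be wanted.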