Question

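I'm having trouble understanding the Stream interface in Java 8, especially where it relates to the Spliterator and Collector interfaces. My problem is that I simply can't understand the Spliterator and Collector interfaces yet, so the Stream interface is still somewhat obscure to me.

What exactly are a Spliterator and a Collector, and how can I use them? If I want to write my own Spliterator or Collector (and probably my own Stream in the process), what should I do and what should I avoid?

I have read some examples scattered around the web, but since everything here is still new and subject to change, examples and tutorials are still very sparse.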

Answer 1

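You should almost certainly never have to deal with Spliterator as a user; it should only be necessary if you are writing your own Collection types and also intend to optimize parallel operations on them.

For what it's worth, a Spliterator is a way of operating over the elements of a collection such that it is easy to split off part of the collection, e.g. because you are parallelizing and want one thread to work on one part of the collection, another thread to work on another part, and so on.

You should essentially never be saving values of type Stream to a variable, either. A Stream is a bit like an Iterator, in that it is a one-time-use object that you will almost always use in a fluent chain, as in the Javadoc example: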

int sum = widgets.stream()
                  .filter(w -> w.getColor() == RED)
                  .mapToInt(w -> w.getWeight())
                  .sum();
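
To illustrate the "one-time-use" point, here is a minimal sketch (the class name, method name, and data are made up for illustration). It relies on the reference JDK behavior of throwing IllegalStateException when a stream is reused; the Javadoc only says an implementation may do so.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class StreamSingleUse {
    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails() {
        List<Integer> weights = Arrays.asList(3, 5, 8);  // hypothetical data
        Stream<Integer> s = weights.stream();
        s.forEach(w -> { });          // first terminal operation consumes the stream
        try {
            s.forEach(w -> { });      // reusing the same Stream object
            return false;
        } catch (IllegalStateException e) {
            return true;              // the stream was already operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println("second use fails: " + secondUseFails());
    }
}
```

This is why the fluent chain is the usual style: each call to stream() yields a fresh pipeline that is built and consumed in one expression.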

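Collector is the most generalized, abstract possible version of a "reduce" operation in the style of map/reduce; in particular, it needs to support parallelization and a finishing step. Examples of Collectors include:

summing, e.g. Collectors.reducing(0, (x, y) -> x + y)
StringBuilder appending, e.g. Collector.of(StringBuilder::new, StringBuilder::append, StringBuilder::append, StringBuilder::toString)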
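
The two collectors above can be tried out with a small sketch like the following. The class and method names are hypothetical, and explicit lambdas are used in place of the StringBuilder::append method references purely to keep type inference unambiguous.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CollectorExamples {
    // Summing via the general-purpose reducing collector.
    static int sum(List<Integer> xs) {
        return xs.stream().collect(Collectors.reducing(0, (x, y) -> x + y));
    }

    // Concatenation via a hand-built collector backed by a StringBuilder.
    static String concat(List<String> parts) {
        Collector<String, StringBuilder, String> appending = Collector.of(
                StringBuilder::new,                    // supplier: fresh container
                (sb, s) -> sb.append(s),               // accumulator: add one element
                (left, right) -> left.append(right),   // combiner: merge two containers
                StringBuilder::toString);              // finisher: container -> result
        return parts.stream().collect(appending);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
        System.out.println(concat(Arrays.asList("a", "b", "c")));
    }
}
```

The combiner is only exercised when the stream is processed in parallel, while the finisher runs once at the end in either mode.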

Answer 2

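Spliterator basically means "splittable Iterator".

A single thread can traverse and process the entire Spliterator by itself, but the Spliterator also has a method trySplit() which will "split off" a section for someone else (typically another thread) to process, leaving the current spliterator with less work.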
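
A minimal sketch of that splitting behavior, using the spliterator of an ordinary List (the class and method names are invented for this example). Both parts are drained on one thread here just to show which elements each part covers; how large each part is depends on the spliterator implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class TrySplitDemo {
    // Splits a list's spliterator once and drains both parts into separate lists.
    static List<List<Integer>> splitOnce(List<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> prefix = rest.trySplit();  // null if it cannot be split

        List<Integer> first = new ArrayList<>();
        List<Integer> second = new ArrayList<>();
        if (prefix != null) {
            prefix.forEachRemaining(first::add);        // the part that was split off
        }
        rest.forEachRemaining(second::add);             // what the original kept
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        System.out.println(splitOnce(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)));
    }
}
```

For an ordered spliterator the split-off part is a prefix of the elements, so concatenating the two lists gives back the source in its original order.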

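A Collector combines the specification of a reduce function (of map-reduce fame) with an initial value and a function for combining two results (which is what allows the results of spliterated streams of work to be merged).

For example, the most basic Collector would have an initial value of 0, add an integer onto an existing result, and "combine" two results by adding them, thereby summing a spliterated stream of integers.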
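
The summing collector just described might be sketched as follows. The names are hypothetical, and a one-element int[] is an assumed choice of mutable result container, since a Collector accumulates into a container rather than returning new values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class SummingCollector {
    static final Collector<Integer, int[], Integer> SUM = Collector.of(
            () -> new int[1],                        // initial value: 0
            (acc, x) -> acc[0] += x,                 // add an integer onto the result
            (a, b) -> { a[0] += b[0]; return a; },   // combine two partial results
            acc -> acc[0]);                          // unwrap the final value

    static int sumSequential(List<Integer> xs) {
        return xs.stream().collect(SUM);
    }

    static int sumParallel(List<Integer> xs) {
        // In parallel mode each split gets its own int[] and the combiner merges them.
        return xs.parallelStream().collect(SUM);
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumSequential(xs) + " " + sumParallel(xs));
    }
}
```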


Answer 3

Here are examples of using the predefined collectors to perform common mutable reduction tasks:

 // Accumulate names into a List
 List<String> list = people.stream().map(Person::getName).collect(Collectors.toList());

 // Accumulate names into a TreeSet
 Set<String> set = people.stream().map(Person::getName).collect(Collectors.toCollection(TreeSet::new));

 // Convert elements to strings and concatenate them, separated by commas
 String joined = things.stream()
                       .map(Object::toString)
                       .collect(Collectors.joining(", "));

 // Compute sum of salaries of employee
 int total = employees.stream()
                      .collect(Collectors.summingInt(Employee::getSalary));

 // Group employees by department
 Map<Department, List<Employee>> byDept
     = employees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment));

 // Compute sum of salaries by department
 Map<Department, Integer> totalByDept
     = employees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment,
                                               Collectors.summingInt(Employee::getSalary)));

 // Partition students into passing and failing
 Map<Boolean, List<Student>> passingFailing =
     students.stream()
             .collect(Collectors.partitioningBy(s -> s.getGrade() >= PASS_THRESHOLD));
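
The snippets above assume types such as Person, Employee, and Department that are not shown. As a self-contained sketch, the following uses a hypothetical Employee class with a String department and made-up data to run three of the same collectors.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PredefinedCollectors {
    // Hypothetical stand-in for the Employee type used in the snippets.
    static final class Employee {
        private final String name;
        private final String department;
        private final int salary;

        Employee(String name, String department, int salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee("Ann", "Eng", 120),
            new Employee("Bob", "Eng", 100),
            new Employee("Cid", "Ops", 90));

    // Concatenate names, separated by commas.
    static String names() {
        return EMPLOYEES.stream().map(Employee::getName).collect(Collectors.joining(", "));
    }

    // Sum of salaries per department.
    static Map<String, Integer> totalByDept() {
        return EMPLOYEES.stream().collect(
                Collectors.groupingBy(Employee::getDepartment,
                                      Collectors.summingInt(Employee::getSalary)));
    }

    // Partition by an arbitrary salary threshold of 100.
    static Map<Boolean, List<Employee>> bySalaryThreshold() {
        return EMPLOYEES.stream().collect(
                Collectors.partitioningBy(e -> e.getSalary() >= 100));
    }

    public static void main(String[] args) {
        System.out.println(names());
        System.out.println(totalByDept());
        System.out.println(bySalaryThreshold().get(true).size());
    }
}
```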

Answer 4

The Spliterator interface is a core feature of Streams.

The stream() and parallelStream() default methods are provided in the Collection interface. These methods use the Spliterator by calling spliterator():

...

default Stream<E> stream() {
    return StreamSupport.stream(spliterator(), false);
}

default Stream<E> parallelStream() {
    return StreamSupport.stream(spliterator(), true);
}

...
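
The same StreamSupport.stream call can be used on anything that exposes a Spliterator, not only on a Collection. Below is a minimal sketch with a hypothetical helper that turns an arbitrary Iterable into a stream through its default spliterator() method.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class IterableToStream {
    // Builds a stream from the Iterable's spliterator, mirroring Collection.stream().
    static <T> List<T> toList(Iterable<T> source, boolean parallel) {
        return StreamSupport.stream(source.spliterator(), parallel)
                            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A bare Iterable (not a Collection), written as a lambda over iterator().
        Iterable<Integer> numbers = () -> Arrays.asList(1, 2, 3).iterator();
        System.out.println(toList(numbers, false));
    }
}
```

Note that the default Iterable.spliterator() knows neither the size nor the layout of the data, so it splits poorly; a type that wants efficient parallel streams would normally override it.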

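A Spliterator is an internal iterator that breaks the stream into smaller parts. These smaller parts can be processed in parallel.

Among its methods, two are the most important for understanding the Spliterator:

boolean tryAdvance(Consumer<? super T> action) Unlike Iterator, which has separate hasNext() and next() methods, it tries to perform the given action on the next element. If the action was performed, the method returns true. Otherwise it returns false, meaning that no elements remain (the end of the stream has been reached).
Spliterator<T> trySplit() Splits off part of the data into a new Spliterator, so that a data set can be divided into smaller groups according to some criterion (file size, number of lines, etc.). It returns null if the data cannot be split any further.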
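
To make the two methods concrete, here is a sketch of a hand-written Spliterator over a half-open range of ints. RangeSpliterator is a hypothetical class written for this example, not part of the JDK, and halving the remaining range is just one possible splitting criterion.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

public class RangeSpliterator implements Spliterator<Integer> {
    private int next;        // next value to emit
    private final int end;   // exclusive upper bound

    public RangeSpliterator(int start, int end) {
        this.next = start;
        this.end = end;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Integer> action) {
        if (next >= end) {
            return false;            // nothing left
        }
        action.accept(next++);       // process exactly one element
        return true;
    }

    @Override
    public Spliterator<Integer> trySplit() {
        int remaining = end - next;
        if (remaining < 2) {
            return null;             // too small to split
        }
        int mid = next + remaining / 2;
        Spliterator<Integer> prefix = new RangeSpliterator(next, mid);
        next = mid;                  // this spliterator keeps the second half
        return prefix;
    }

    @Override
    public long estimateSize() {
        return Math.max(0, end - next);
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | IMMUTABLE | NONNULL;
    }

    // Sums the range through a stream built on this spliterator.
    static int sum(int start, int end, boolean parallel) {
        return StreamSupport.stream(new RangeSpliterator(start, end), parallel)
                            .mapToInt(Integer::intValue)
                            .sum();
    }

    public static void main(String[] args) {
        System.out.println(sum(0, 100, false) + " " + sum(0, 100, true));
    }
}
```

When the stream is parallel, the framework calls trySplit() repeatedly to hand sub-ranges to worker threads and then calls tryAdvance (or forEachRemaining) on each piece.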

Understanding Spliterator, Collector and Stream in Java 8
