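I would like to use Java 8 streams to group by a single classifier but apply multiple Collector functions, for example to compute both the average and the sum of one field (or possibly of different fields) while grouping.
I have tried to simplify this with an example: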
public void test() {
List<Person> persons = new ArrayList<>();
persons.add(new Person("Person One", 1, 18));
persons.add(new Person("Person Two", 1, 20));
persons.add(new Person("Person Three", 1, 30));
persons.add(new Person("Person Four", 2, 30));
persons.add(new Person("Person Five", 2, 29));
persons.add(new Person("Person Six", 3, 18));
Map<Integer, Data> result = persons.stream().collect(
groupingBy(person -> person.group, multiCollector)
);
}
class Person {
String name;
int group;
int age;
// Constructor, getters and setters
}
class Data {
long average;
long sum;
public Data(long average, long sum) {
this.average = average;
this.sum = sum;
}
// Getter and setter
}
The result should be a Map that associates each group with its aggregated result, like this:
1 => Data(average(18, 20, 30), sum(18, 20, 30))
2 => Data(average(30, 29), sum(30, 29))
3 => ....
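This works fine with a single function such as Collectors.counting(), but I would like to chain more than one (ideally an arbitrary number, taken from a List):
List<Collector<Person, ?, ?>>
Is it possible to do something like this?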
Answer 0 (score: 15)
For the specific problem of summing and averaging, use collectingAndThen together with summarizingDouble:
Map<Integer, Data> result = persons.stream().collect(
groupingBy(Person::getGroup,
collectingAndThen(summarizingDouble(Person::getAge),
dss -> new Data((long)dss.getAverage(), (long)dss.getSum()))));
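As a self-contained sketch of the same idea, the class below fills in minimal versions of the question's Person and Data classes. The wrapper class GroupSumAverage and the method summarize are hypothetical names chosen for illustration, and summarizingInt is swapped in for summarizingDouble so that the sum is already a long.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupSumAverage {
    // Minimal stand-ins for the question's classes (assumed shape).
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
        @Override public String toString() {
            return "Data(average=" + average + ", sum=" + sum + ")";
        }
    }

    // Groups by Person.group and reduces each group to one Data object.
    static Map<Integer, Data> summarize(List<Person> persons) {
        return persons.stream().collect(
            Collectors.groupingBy(Person::getGroup,
                Collectors.collectingAndThen(
                    Collectors.summarizingInt(Person::getAge),
                    // The cast truncates the average toward zero.
                    stats -> new Data((long) stats.getAverage(), stats.getSum()))));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Person One", 1, 18),
            new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30),
            new Person("Person Four", 2, 30),
            new Person("Person Five", 2, 29),
            new Person("Person Six", 3, 18));
        summarize(persons).forEach((group, data) -> System.out.println(group + " => " + data));
    }
}
```

For group 1 of the question's data this yields an average of 22 (truncated from 22.67) and a sum of 68.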
For the more general problem (collecting various things about your Persons), you can create a complex collector like this:
// Individual collectors are defined here
List<Collector<Person, ?, ?>> collectors = Arrays.asList(
Collectors.averagingInt(Person::getAge),
Collectors.summingInt(Person::getAge));
@SuppressWarnings("unchecked")
Collector<Person, List<Object>, List<Object>> complexCollector = Collector.of(
() -> collectors.stream().map(Collector::supplier)
.map(Supplier::get).collect(toList()),
(list, e) -> IntStream.range(0, collectors.size()).forEach(
i -> ((BiConsumer<Object, Person>) collectors.get(i).accumulator()).accept(list.get(i), e)),
(l1, l2) -> {
IntStream.range(0, collectors.size()).forEach(
i -> l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner()).apply(l1.get(i), l2.get(i))));
return l1;
},
list -> {
IntStream.range(0, collectors.size()).forEach(
i -> list.set(i, ((Function<Object, Object>)collectors.get(i).finisher()).apply(list.get(i))));
return list;
});
Map<Integer, List<Object>> result = persons.stream().collect(
groupingBy(Person::getGroup, complexCollector));
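The map values are lists in which the first element is the result of applying the first collector, and so on. You can add a custom finisher step with Collectors.collectingAndThen(complexCollector, list -> ...) to convert this list into something more appropriate.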
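Answer 1 (score: 4)
By using a Map as the output type, you can have a potentially unbounded list of reducers, each of which produces its own statistic and adds it to the map.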
public static <K, V> Map<K, V> addMap(Map<K, V> map, K k, V v) {
Map<K, V> mapout = new HashMap<K, V>();
mapout.putAll(map);
mapout.put(k, v);
return mapout;
}
...
List<Person> persons = new ArrayList<>();
persons.add(new Person("Person One", 1, 18));
persons.add(new Person("Person Two", 1, 20));
persons.add(new Person("Person Three", 1, 30));
persons.add(new Person("Person Four", 2, 30));
persons.add(new Person("Person Five", 2, 29));
persons.add(new Person("Person Six", 3, 18));
List<BiFunction<Map<String, Integer>, Person, Map<String, Integer>>> listOfReducers = new ArrayList<>();
listOfReducers.add((m, p) -> addMap(m, "Count", Optional.ofNullable(m.get("Count")).orElse(0) + 1));
listOfReducers.add((m, p) -> addMap(m, "Sum", Optional.ofNullable(m.get("Sum")).orElse(0) + p.i1));
BiFunction<Map<String, Integer>, Person, Map<String, Integer>> applyList
= (mapin, p) -> {
Map<String, Integer> mapout = mapin;
for (BiFunction<Map<String, Integer>, Person, Map<String, Integer>> f : listOfReducers) {
mapout = f.apply(mapout, p);
}
return mapout;
};
BinaryOperator<Map<String, Integer>> combineMaps
= (map1, map2) -> {
Map<String, Integer> mapout = new HashMap<>();
mapout.putAll(map1);
mapout.putAll(map2);
return mapout;
};
Map<String, Integer> map
= persons
.stream()
.reduce(new HashMap<String, Integer>(),
applyList, combineMaps);
System.out.println("map = " + map);
This produces:
map = {Sum=10, Count=6}
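One caveat, offered as an observation rather than as part of the original answer: combineMaps uses putAll, so when both maps contain the same key, the value from map2 replaces the one from map1. That would matter if the stream were processed in parallel and partial results had to be merged. A minimal sketch of a combiner that adds the values up instead, assuming every statistic in the map is additive (as "Count" and "Sum" are); the names MergingCombiner and SUMMING_COMBINER are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

class MergingCombiner {
    // Adds up the values stored under the same key instead of overwriting them.
    // Like addMap, it returns a fresh map and leaves both inputs untouched.
    static final BinaryOperator<Map<String, Integer>> SUMMING_COMBINER = (map1, map2) -> {
        Map<String, Integer> out = new HashMap<>(map1);
        map2.forEach((key, value) -> out.merge(key, value, Integer::sum));
        return out;
    };
}
```

It could be passed as the third argument of reduce in place of combineMaps.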
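Answer 2 (score: 3)
You can chain them.
A collector can only produce one object, but that object can hold multiple values. For example, you could return a Map with one entry for each collector whose result you want.
You can use Collector.of(HashMap::new, accumulator, combiner);
Your accumulator would hold a Map of collectors, where the keys of the produced Map match the names of the collectors. The combiner needs a way to merge multiple partial results, especially when the collection runs in parallel.
In general, the built-in collectors use a dedicated data type for complex results.
From Collectors: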
public static <T>
Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper) {
return new CollectorImpl<T, DoubleSummaryStatistics, DoubleSummaryStatistics>(
DoubleSummaryStatistics::new,
(r, t) -> r.accept(mapper.applyAsDouble(t)),
(l, r) -> { l.combine(r); return l; }, CH_ID);
}
and in its own class:
public class DoubleSummaryStatistics implements DoubleConsumer {
private long count;
private double sum;
private double sumCompensation; // Low order bits of sum
private double simpleSum; // Used to compute right sum for non-finite inputs
private double min = Double.POSITIVE_INFINITY;
private double max = Double.NEGATIVE_INFINITY;
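The Map-based approach described at the start of this answer could be sketched as follows. This is a simplified illustration, not the answerer's code: instead of holding a map of nested collectors, the accumulator updates two named entries directly. The class name MapResultCollector, the method countAndSum and the keys "count" and "sum" are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Stream;

class MapResultCollector {
    /** Collects a count and a sum of Integer values into one Map, keyed by statistic name. */
    static Collector<Integer, Map<String, Long>, Map<String, Long>> countAndSum() {
        return Collector.of(
            HashMap::new,
            // Accumulator: update both named entries for every element.
            (map, value) -> {
                map.merge("count", 1L, Long::sum);
                map.merge("sum", value.longValue(), Long::sum);
            },
            // Combiner: add the partial results key by key (used in parallel runs).
            (left, right) -> {
                right.forEach((key, partial) -> left.merge(key, partial, Long::sum));
                return left;
            });
    }

    public static void main(String[] args) {
        Map<String, Long> stats = Stream.of(18, 20, 30).collect(countAndSum());
        System.out.println(stats.get("count") + " values, sum " + stats.get("sum"));
    }
}
```

Inside a grouping it could be used as groupingBy(Person::getGroup, mapping(Person::getAge, countAndSum())).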
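Answer 3 (score: 3)
Instead of chaining collectors, you should build an abstraction that aggregates them: implement the Collector interface with a class that accepts a list of collectors and delegates each method call to every one of them. Then, at the end, return a new Data() containing all the results the nested collectors produced.
You can avoid creating a custom class with all the method declarations by using Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics... characteristics). The finisher lambda calls the finisher of each nested collector and then returns the Data instance.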
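A minimal Java 8 sketch of this delegation idea, restricted to two nested collectors so that everything stays type-safe. The names PairingCollector, Holder, both and averageAndSum are hypothetical, and Person and Data are reduced to what the example needs.

```java
import java.util.function.BiFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class PairingCollector {
    // Holds the intermediate containers of the two nested collectors.
    private static final class Holder<A1, A2> {
        A1 first;
        A2 second;
        Holder(A1 first, A2 second) { this.first = first; this.second = second; }
    }

    /** Feeds every element to both collectors and merges their finished results. */
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> both(
            Collector<? super T, A1, R1> c1,
            Collector<? super T, A2, R2> c2,
            BiFunction<? super R1, ? super R2, R> merger) {
        return Collector.<T, Holder<A1, A2>, R>of(
            () -> new Holder<A1, A2>(c1.supplier().get(), c2.supplier().get()),
            (h, t) -> {
                c1.accumulator().accept(h.first, t);
                c2.accumulator().accept(h.second, t);
            },
            (h1, h2) -> {
                h1.first = c1.combiner().apply(h1.first, h2.first);
                h1.second = c2.combiner().apply(h1.second, h2.second);
                return h1;
            },
            // Finisher: run each nested finisher, then build the combined result.
            h -> merger.apply(c1.finisher().apply(h.first), c2.finisher().apply(h.second)));
    }

    // Minimal stand-ins for the question's classes (assumed shape).
    static class Person {
        final int group;
        final int age;
        Person(int group, int age) { this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) { this.average = average; this.sum = sum; }
    }

    /** Average and sum of the ages, packed into one Data instance. */
    static Collector<Person, ?, Data> averageAndSum() {
        return both(
            Collectors.averagingInt(Person::getAge),
            Collectors.summingInt(Person::getAge),
            (avg, sum) -> new Data(avg.longValue(), sum));
    }
}
```

The result can be used as the downstream collector of the grouping, for example groupingBy(Person::getGroup, averageAndSum()). Supporting an arbitrary list of collectors follows the same pattern but needs unchecked casts.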
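Answer 4 (score: 0)
In Java 12, the Collectors API was extended with a static teeing(...) function:
teeing(Collector<? super T, ?, R1> downstream1, Collector<? super T, ?, R2> downstream2, BiFunction<? super R1, ? super R2, R> merger)
This provides built-in functionality for applying two collectors to one Stream and merging their results into a single object.
Below is a small example in which a list of employees is split into age groups; for each group, two Collectors.summarizingInt() collectors, applied to age and salary, are returned as a list of IntSummaryStatistics: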
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;
public class CollectorTeeingTest {
public static void main(String... args){
NavigableSet<Integer> age_groups = new TreeSet<>();
age_groups.addAll(List.of(30,40,50,60,Integer.MAX_VALUE)); //we don't want to map to null
Function<Integer,Integer> to_age_groups = age -> age_groups.higher(age);
List<Employee> employees = List.of( new Employee("A",21,2000),
new Employee("B",24,2400),
new Employee("C",32,3000),
new Employee("D",40,4000),
new Employee("E",41,4100),
new Employee("F",61,6100)
);
Map<Integer,List<IntSummaryStatistics>> stats = employees.stream()
.collect(Collectors.groupingBy(
employee -> to_age_groups.apply(employee.getAge()),
Collectors.teeing(
Collectors.summarizingInt(Employee::getAge),
Collectors.summarizingInt(Employee::getSalary),
(stat1, stat2) -> List.of(stat1,stat2))));
stats.entrySet().stream().forEach(entry -> {
System.out.println("Age-group: <"+entry.getKey()+"\n"+entry.getValue());
});
}
public static class Employee{
private final String name;
private final int age;
private final int salary;
public Employee(String name, int age, int salary){
this.name = name;
this.age = age;
this.salary = salary;
}
public String getName(){return this.name;}
public int getAge(){return this.age;}
public int getSalary(){return this.salary;}
}
}
Output:
Age-group: <2147483647
[IntSummaryStatistics{count=1, sum=61, min=61, average=61,000000, max=61}, IntSummaryStatistics{count=1, sum=6100, min=6100, average=6100,000000, max=6100}]
Age-group: <50
[IntSummaryStatistics{count=2, sum=81, min=40, average=40,500000, max=41}, IntSummaryStatistics{count=2, sum=8100, min=4000, average=4050,000000, max=4100}]
Age-group: <40
[IntSummaryStatistics{count=1, sum=32, min=32, average=32,000000, max=32}, IntSummaryStatistics{count=1, sum=3000, min=3000, average=3000,000000, max=3000}]
Age-group: <30
[IntSummaryStatistics{count=2, sum=45, min=21, average=22,500000, max=24}, IntSummaryStatistics{count=2, sum=4400, min=2000, average=2200,000000, max=2400}]
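Applied to the original question, teeing could produce the Data object directly. The following is a sketch that assumes Java 12 or later; TeeingSumAverage and summarize are hypothetical names, and Person and Data are minimal stand-ins for the question's classes.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TeeingSumAverage {
    // Minimal stand-ins for the question's classes (assumed shape).
    static class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static class Data {
        final long average;
        final long sum;
        Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

    // Groups by Person.group; per group, two collectors run side by side
    // and their results are merged into one Data instance.
    static Map<Integer, Data> summarize(List<Person> persons) {
        return persons.stream().collect(
            Collectors.groupingBy(Person::getGroup,
                Collectors.teeing(
                    Collectors.averagingInt(Person::getAge),
                    Collectors.summingInt(Person::getAge),
                    // longValue() truncates the average toward zero.
                    (avg, sum) -> new Data(avg.longValue(), sum))));
    }

    public static void main(String[] args) {
        Map<Integer, Data> result = summarize(List.of(
            new Person("Person One", 1, 18),
            new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30)));
        System.out.println(result.get(1).average + " / " + result.get(1).sum);
    }
}
```

Since teeing takes exactly two collectors, combining more of them means nesting teeing calls.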