Question

Given the following class, I want to get the averages of foo and bar across a List<HelloWorld> helloWorldList:

@Data
public class HelloWorld {
    private Long foo;
    private Long bar;
}

Option 1: Java

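// Locals must be initialized before use; primitive long avoids repeated boxing.
long fooSum = 0, barSum = 0;
for (HelloWorld hw : helloWorldList) {
    fooSum += hw.getFoo();
    barSum += hw.getBar();
}
// Integer division: the fractional part is discarded, and an empty list throws ArithmeticException.
Long fooAvg = fooSum / helloWorldList.size();
Long barAvg = barSum / helloWorldList.size();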

Option 2: Java 8

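// average() returns an OptionalDouble, so the fallback must be a double (orElse(null) does not compile).
Double fooAvg = helloWorldList.stream().mapToLong(HelloWorld::getFoo).average().orElse(0.0);
Double barAvg = helloWorldList.stream().mapToLong(HelloWorld::getBar).average().orElse(0.0);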

Which approach is better? Is there a better way to get these values?

Answer edit: this question has been marked as a duplicate, but after reading bradimus's comment, I ended up implementing this:

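import java.util.function.Consumer;
import lombok.Getter;

public class HelloWorldSummaryStatistics implements Consumer<HelloWorld> {
    // Totals are long because foo and bar are Long; an int total would silently truncate.
    @Getter
    private long fooTotal = 0;
    @Getter
    private long barTotal = 0;
    @Getter
    private int count = 0;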

    public HelloWorldSummaryStatistics() {
    }

    @Override
    public void accept(HelloWorld helloWorld) {
        fooTotal += helloWorld.getFoo();
        barTotal += helloWorld.getBar();
        count++;
    }

    public void combine(HelloWorldSummaryStatistics other) {
        fooTotal += other.fooTotal;
        barTotal += other.barTotal;
        count += other.count;
    }

    public final double getFooAverage() {
        return getCount() > 0 ? (double) getFooTotal() / getCount() : 0.0d;
    }

    public final double getBarAverage() {
        return getCount() > 0 ? (double) getBarTotal() / getCount() : 0.0d;
    }

    @Override
    public String toString() {
        return String.format(
            "%s{count=%d, fooAverage=%f, barAverage=%f}",
            this.getClass().getSimpleName(),
            getCount(),
            getFooAverage(),
            getBarAverage());
    }
}

Main class:

    HelloWorld a = new HelloWorld(5L, 1L);
    HelloWorld b = new HelloWorld(5L, 2L);
    HelloWorld c = new HelloWorld(5L, 4L);
    List<HelloWorld> hwList = Arrays.asList(a, b, c);
    HelloWorldSummaryStatistics helloWorldSummaryStatistics = hwList.stream()
            .collect(HelloWorldSummaryStatistics::new, HelloWorldSummaryStatistics::accept, HelloWorldSummaryStatistics::combine);
    System.out.println(helloWorldSummaryStatistics);

Note: as others have suggested, if you need high precision, you can use BigInteger and the like.
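To make the BigInteger remark concrete, here is a minimal sketch, not part of the original post. The class name ExactAverageSketch, the helper average(...) and its scale parameter are hypothetical, and the nested HelloWorld is a plain stand-in for the Lombok class so the snippet compiles on its own. It sums in BigInteger so the total cannot overflow, then divides in BigDecimal.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToLongFunction;

public class ExactAverageSketch {

    // Plain stand-in for the Lombok-annotated HelloWorld (assumption: two Long fields).
    static final class HelloWorld {
        private final Long foo;
        private final Long bar;
        HelloWorld(Long foo, Long bar) { this.foo = foo; this.bar = bar; }
        Long getFoo() { return foo; }
        Long getBar() { return bar; }
    }

    // Hypothetical helper: sums in BigInteger so the total cannot overflow,
    // then divides in BigDecimal, rounding half-up to the requested scale.
    static BigDecimal average(List<HelloWorld> list, ToLongFunction<HelloWorld> getter, int scale) {
        if (list.isEmpty()) {
            return BigDecimal.ZERO; // "empty means zero", as in getFooAverage() above
        }
        BigInteger total = BigInteger.ZERO;
        for (HelloWorld hw : list) {
            total = total.add(BigInteger.valueOf(getter.applyAsLong(hw)));
        }
        return new BigDecimal(total)
                .divide(BigDecimal.valueOf(list.size()), scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        List<HelloWorld> list = Arrays.asList(
                new HelloWorld(Long.MAX_VALUE, 1L),
                new HelloWorld(Long.MAX_VALUE, 2L));
        // A long running total would overflow on foo here; BigInteger does not.
        System.out.println(average(list, HelloWorld::getFoo, 2));
        System.out.println(average(list, HelloWorld::getBar, 2));
    }
}
```

Whether the extra allocation is worth it depends on how large the values can get; for small inputs a long total is usually enough.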

Answer 1

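The answers and comments you have received so far do not mention one advantage of the stream-based solution: just by changing stream() to parallelStream(), you can turn the whole thing into a multi-threaded solution.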

Try doing that with "Option 1" and see how much work it takes.

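Of course, that means more "overhead" in the form of CPU cycles spent on things happening "under the covers"; but if you are talking about large data sets, it may actually benefit you.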

At the very least, you can easily see how switching to parallelStream() affects execution time!
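As a rough illustration of that one-word change, the sketch below computes the same average with stream() and parallelStream() and prints a naive timing for each. The class name ParallelAverageSketch, the sampleData helper and the data size are assumptions made for this example, and the nested HelloWorld is a plain stand-in for the Lombok class. A single System.nanoTime() measurement is only a crude indication, not a benchmark.

```java
import java.util.ArrayList;
import java.util.List;

public class ParallelAverageSketch {

    // Plain stand-in for the Lombok-annotated HelloWorld (assumption: two Long fields).
    static final class HelloWorld {
        private final Long foo;
        private final Long bar;
        HelloWorld(Long foo, Long bar) { this.foo = foo; this.bar = bar; }
        Long getFoo() { return foo; }
        Long getBar() { return bar; }
    }

    static double sequentialFooAverage(List<HelloWorld> list) {
        return list.stream().mapToLong(HelloWorld::getFoo).average().orElse(0.0);
    }

    // Identical pipeline; only the stream source differs.
    static double parallelFooAverage(List<HelloWorld> list) {
        return list.parallelStream().mapToLong(HelloWorld::getFoo).average().orElse(0.0);
    }

    // Hypothetical test data: foo = i, bar = 2 * i for i in [0, n).
    static List<HelloWorld> sampleData(int n) {
        List<HelloWorld> list = new ArrayList<>(n);
        for (long i = 0; i < n; i++) {
            list.add(new HelloWorld(i, 2 * i));
        }
        return list;
    }

    public static void main(String[] args) {
        List<HelloWorld> data = sampleData(1_000_000);

        long t0 = System.nanoTime();
        double sequential = sequentialFooAverage(data);
        long t1 = System.nanoTime();
        double parallel = parallelFooAverage(data);
        long t2 = System.nanoTime();

        System.out.printf("sequential: %f in %d us%n", sequential, (t1 - t0) / 1_000);
        System.out.printf("parallel:   %f in %d us%n", parallel, (t2 - t1) / 1_000);
    }
}
```

Both calls return the same value; which one is faster depends on the data size, the hardware, and JIT warm-up, so the printed times will vary from run to run.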

Answer 2

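If you want to find the average over a list of integers, it is better to iterate the classic way. Streams have some overhead, and the JVM has to load classes for stream usage. But the JVM also has a JIT compiler that does a lot of optimization.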

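Beware of incorrect benchmarking. Use JMH. Streams are good and efficient when the work per element is not as simple as summing two integers. Streams also let you parallelize your code. There is no hard criterion for when parallelization beats a single thread. As for me: if the function call takes more than 100 ms, you can parallelize it.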

So, if processing your data set takes more than 100 ms, try a parallel stream.

If not, use iteration.

P.S. Doug Lea, "When to use parallel streams"

Answer 3

Which approach is better?

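When you say "better", do you mean "closer to the sample's true average", or "more efficient", or something else? If efficiency is your goal, streams carry considerable overhead, which is often overlooked. However, they offer readability and more concise code. It depends on what you are trying to maximize, the size of your data set, and so on.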

Maybe rephrase the question?
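On the "closer to the true average" reading, the two options in the question do not return the same number. The sketch below is only an illustration: the class name TruncationSketch and both method names are made up, and it works on a bare List<Long> rather than HelloWorld to stay short. It uses the bar values 1, 2 and 4 from the question's example.

```java
import java.util.Arrays;
import java.util.List;

public class TruncationSketch {

    // Mirrors Option 1: long sum divided by the size, so the fraction is dropped.
    static long loopAverage(List<Long> values) {
        if (values.isEmpty()) {
            return 0L; // avoid division by zero
        }
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Mirrors Option 2: LongStream.average() yields a double.
    static double streamAverage(List<Long> values) {
        return values.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Long> bars = Arrays.asList(1L, 2L, 4L);
        System.out.println(loopAverage(bars));   // 7 / 3 in long arithmetic gives 2
        System.out.println(streamAverage(bars)); // roughly 2.33
    }
}
```

If the loop version should keep the fraction, casting the sum to double before dividing gives the same value as the stream version.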

Java 8: Getting the average of multiple attributes
