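Why is CompletableFuture join/get faster in separate streams than in a single stream?

Time: 2019-11-04 20:27:27

Tags: java java-8 java-stream completable-future

For the following program, I am trying to figure out why using two separate streams runs the tasks in parallel, while using a single stream and calling join/get on the CompletableFuture makes them take longer, as if they were processed sequentially.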

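import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

public class HelloConcurrency {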

    private static Integer sleepTask(int number) {
        System.out.println(String.format("Task with sleep time %d", number));
        try {
            TimeUnit.SECONDS.sleep(number);
        } catch (InterruptedException e) {
            e.printStackTrace();
            return -1;
        }
        return number;
    }

    public static void main(String[] args) {
        List<Integer> sleepTimes = Arrays.asList(1,2,3,4,5,6);
        System.out.println("WITH SEPARATE STREAMS FOR FUTURE AND JOIN");
        ExecutorService executorService = Executors.newFixedThreadPool(6);
        long start = System.currentTimeMillis();
        List<CompletableFuture<Integer>> futures = sleepTimes.stream()
                .map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
                        .exceptionally(ex -> { ex.printStackTrace(); return -1; }))
                .collect(Collectors.toList());
        executorService.shutdown();
        List<Integer> result = futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        long finish = System.currentTimeMillis();
        long timeElapsed = (finish - start)/1000;
        System.out.println(String.format("done in %d seconds.", timeElapsed));
        System.out.println(result);

        System.out.println("WITH SAME STREAM FOR FUTURE AND JOIN");
        ExecutorService executorService2 = Executors.newFixedThreadPool(6);
        start = System.currentTimeMillis();
        List<Integer> results = sleepTimes.stream()
                .map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
                        .exceptionally(ex -> { ex.printStackTrace(); return -1; }))
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        executorService2.shutdown();
        finish = System.currentTimeMillis();
        timeElapsed = (finish - start)/1000;
        System.out.println(String.format("done in %d seconds.", timeElapsed));
        System.out.println(results);
    }
}

Output

WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 6
Task with sleep time 5
Task with sleep time 1
Task with sleep time 3
Task with sleep time 2
Task with sleep time 4
done in 6 seconds.
[1, 2, 3, 4, 5, 6]
WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
Task with sleep time 2
Task with sleep time 3
Task with sleep time 4
Task with sleep time 5
Task with sleep time 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]

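3 answers:

Answer 0 (score: 6)

The two approaches are quite different; let me try to explain it clearly.

First approach: you first submit the async requests for all 6 tasks, and only then call join on each of them to get the results.

Second approach: here you call join immediately after submitting the async request for each task. For example, after starting the async thread for task 1, the join call makes the stream wait for that thread to finish the task, and only then is the second task started on an async thread.

Note: if you look closely at the output, in the first approach the task messages appear in random order because all six tasks run asynchronously. In the second approach the tasks run sequentially, one after another.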
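The difference between the two approaches can be made visible without any sleeping by logging, from the calling thread, when each task is submitted and when it is joined. The following is a minimal sketch, not code from the question: the class name `SubmitJoinOrderDemo` and the helpers `submit` and `join` are made up for illustration, and the ordering shown is what a sequential stream does in OpenJDK rather than something the Stream API guarantees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class SubmitJoinOrderDemo {

    // Hypothetical helper: logs the submission on the calling thread, then starts a trivial task.
    private static CompletableFuture<Integer> submit(int n, ExecutorService pool, List<String> log) {
        log.add("submit " + n);
        return CompletableFuture.supplyAsync(() -> n, pool);
    }

    // Hypothetical helper: blocks until the future completes, then logs the join.
    private static Integer join(CompletableFuture<Integer> future, List<String> log) {
        Integer n = future.join();
        log.add("join " + n);
        return n;
    }

    // Single pipeline: each element is submitted and joined before the next one is touched.
    static List<String> oneStream() {
        List<String> log = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            Arrays.asList(1, 2, 3).stream()
                    .map(n -> submit(n, pool, log))
                    .map(f -> join(f, log))
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
        return log;
    }

    // Two pipelines: the first collect() forces every submission before any join happens.
    static List<String> twoStreams() {
        List<String> log = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<CompletableFuture<Integer>> futures = Arrays.asList(1, 2, 3).stream()
                    .map(n -> submit(n, pool, log))
                    .collect(Collectors.toList());
            futures.stream()
                    .map(f -> join(f, log))
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("one stream:  " + oneStream());
        System.out.println("two streams: " + twoStreams());
    }
}
```

With one stream the log alternates submit and join; with two streams all submissions come first, which is what lets the tasks overlap.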

I believe you already know how a stream's map operation is performed; if not, the Stream documentation describes it:

  

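To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc.), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.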
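The laziness described in that passage can be observed directly. The sketch below is illustrative only (the class name `LazyPipelineDemo` and its `trace` method are invented here): it builds a pipeline with two map stages, records an event once the pipeline object exists, and only then runs the terminal operation. In OpenJDK's sequential implementation each element travels through both map stages before the next element is pulled from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    // Returns the events in the order they were recorded.
    static List<String> trace() {
        List<String> events = new ArrayList<>();

        // Building the pipeline runs neither lambda yet.
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .map(n -> { events.add("first map: " + n); return n; })
                .map(n -> { events.add("second map: " + n); return n; });
        events.add("pipeline built");

        // The terminal operation pulls the elements through the stages one at a time.
        pipeline.collect(Collectors.toList());
        return events;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

Replacing the first map with "start a future" and the second with "join it" gives exactly the single-stream version from the question: task 2 is not even submitted until task 1 has been joined.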

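Answer 1 (score: 2)

The stream framework does not define the order in which map operations are executed on stream elements, because it is not intended for use cases where that might be a relevant issue. As a result, the particular way your second version executes is essentially equivalent to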

List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
  results.add(CompletableFuture
     .supplyAsync(() -> sleepTask(sleepTime), executorService2)
     .exceptionally(ex -> { ex.printStackTrace(); return -1; })
     .join());
}

...which is itself essentially equivalent to

List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
  results.add(sleepTask(sleepTime));
}
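The equivalence to a plain loop also explains the two timings in the question. As a rough sketch (the class `ExpectedTimings` and its method names are hypothetical, and scheduling overhead is ignored), the sequential case costs the sum of the sleep times, while the concurrent case costs only the longest one, assuming the pool has at least as many threads as there are tasks:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExpectedTimings {

    // One task after another: the sleeps add up.
    static int sequentialSeconds(List<Integer> sleepTimes) {
        return sleepTimes.stream().mapToInt(Integer::intValue).sum();
    }

    // All tasks started up front on enough threads: the slowest task dominates.
    static int concurrentSeconds(List<Integer> sleepTimes) {
        return Collections.max(sleepTimes);
    }

    public static void main(String[] args) {
        List<Integer> sleepTimes = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println("sequential: " + sequentialSeconds(sleepTimes) + " seconds");
        System.out.println("concurrent: " + concurrentSeconds(sleepTimes) + " seconds");
    }
}
```

For the sleep times 1 through 6 this gives 21 and 6, matching the "done in 21 seconds" and "done in 6 seconds" lines in the question's output.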

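Answer 2 (score: 1)

@Deadpool answered this well; I am just adding my answer in case it helps someone understand it better.

I was able to find the answer by adding more print statements to both approaches.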

TLDR

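  • 2-stream approach: we start all 6 tasks asynchronously, and then call join on each of them in a separate stream to get the results.

  • 1-stream approach: we call join immediately after starting each task. For example, after starting the thread for task 1, the join call makes the stream wait for task 1 to complete, and only then is the second task started on an async thread.

Note: if we look closely at the output, in the 1-stream approach the output appears in order because all six tasks are executed one after another. In the 2-stream approach all tasks run in parallel, so the order is random.

Note 2: if we replace stream() with parallelStream() in the 1-stream approach, it behaves like the 2-stream approach, provided the parallel stream has enough worker threads to start all six tasks at once.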
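A minimal sketch of the parallelStream() variant follows, with the sleeps scaled down to milliseconds so it finishes quickly. The class name `ParallelStreamJoinDemo` and the `run` method are made up for this illustration. Parallel streams run on the common fork-join pool, whose default parallelism depends on the number of available processors, so the speed-up on a given machine is an assumption rather than a guarantee; the collected list still comes back in encounter order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

public class ParallelStreamJoinDemo {

    // Same idea as sleepTask in the question, scaled down to 100 ms units.
    private static Integer sleepTask(int number) {
        try {
            TimeUnit.MILLISECONDS.sleep(number * 100L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return number;
    }

    // One pipeline, but several stream worker threads can each submit and join a task at the same time.
    static List<Integer> run() {
        ExecutorService pool = Executors.newFixedThreadPool(6);
        try {
            return Arrays.asList(1, 2, 3, 4, 5, 6).parallelStream()
                    .map(n -> CompletableFuture.supplyAsync(() -> sleepTask(n), pool))
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<Integer> results = run();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(results + " in about " + elapsedMs + " ms");
    }
}
```

Each join() here blocks a stream worker thread, so on a machine with few cores some tasks may still wait their turn; collecting the futures first and joining afterwards does not have that dependency.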

More evidence

I added more print statements to the streams, which produced the following output and confirmed the notes above:

1 stream:

List<Integer> results = sleepTimes.stream()
                .map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
                        .exceptionally(ex -> { ex.printStackTrace(); return -1; }))
                .map(f  -> {
                    int num = f.join();
                    System.out.println(String.format("doing join on task %d", num));
                    return num;
                })
                .collect(Collectors.toList());



WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
doing join on task 1
Task with sleep time 2
doing join on task 2
Task with sleep time 3
doing join on task 3
Task with sleep time 4
doing join on task 4
Task with sleep time 5
doing join on task 5
Task with sleep time 6
doing join on task 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]

2 streams:

List<CompletableFuture<Integer>> futures = sleepTimes.stream()
          .map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
                  .exceptionally(ex -> { ex.printStackTrace(); return -1; }))
          .collect(Collectors.toList());

List<Integer> result = futures.stream()
            .map(f  -> {
                int num = f.join();
                System.out.println(String.format("doing join on task %d", num));
                return num;
            })
            .collect(Collectors.toList());



WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 2
Task with sleep time 5
Task with sleep time 3
Task with sleep time 1
Task with sleep time 4
Task with sleep time 6
doing join on task 1
doing join on task 2
doing join on task 3
doing join on task 4
doing join on task 5
doing join on task 6
done in 6 seconds.
[1, 2, 3, 4, 5, 6]