Executing multiple downloads and waiting for all of them to complete

Time: 2016-05-07 20:36:52

Tags: amazon-s3 java-8 concurrent-programming

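I'm currently working on an API service that lets one or more users download one or more items from an S3 bucket and returns the contents to the user. The downloads themselves work fine, but downloading several files takes roughly 100-150 ms * the number of files.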

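I've tried a few approaches to speed this up: parallelStream() instead of stream() (which, given the number of simultaneous downloads, carries a serious risk of running out of threads), CompletableFuture, and even creating an ExecutorService, running the downloads, and then shutting the pool down. Typically I only want a few concurrent tasks per request, e.g. 5 at a time, to keep the number of active threads down.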

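I tried integrating Spring @Cacheable to store the downloaded files in Redis (the files are read-only). This certainly reduced response times (a few milliseconds to retrieve a file, compared with 100-150 ms), but the benefit only appears once the file has already been retrieved.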

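What is the best way to wait for multiple asynchronous tasks to finish and then get their results, bearing in mind that I don't want (or don't think I can have) hundreds of threads opening HTTP connections and downloading everything at once?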

2 answers:

Answer 0: (score: 3)

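You're right to be concerned about tying up the common fork/join pool that parallel streams use by default, since I believe it is also used for other things, such as sort operations outside the Stream API. Rather than saturating the common fork/join pool with an I/O-bound parallel stream, you can create your own fork/join pool for the stream. See this question to find out how to create an ad hoc ForkJoinPool of the size you want and run a parallel stream in it.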
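As a rough illustration of that idea, here is a minimal sketch that runs a parallel stream inside a dedicated ForkJoinPool. The `download` method is a hypothetical stand-in for the real S3 call, and the class and method names are made up for this example. Note that a parallel stream started from inside a ForkJoinPool task running in that pool is a consequence of how the current stream implementation schedules its work, not a documented guarantee, so treat it as an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

public class CustomPoolParallelStream {

    // Hypothetical stand-in for the real S3 download; sleeps to simulate network latency.
    static String download(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while downloading " + key, e);
        }
        return "content-of-" + key;
    }

    // Runs the parallel stream from inside a dedicated pool instead of the common pool.
    static List<String> downloadAll(List<String> keys, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<List<String>> work = () -> keys.parallelStream()
                    .map(CustomPoolParallelStream::download)
                    .collect(Collectors.toList());
            // join() blocks until the whole stream has been processed
            return pool.submit(work).join();
        } finally {
            pool.shutdown(); // release the pool's threads once the work is done
        }
    }

    public static void main(String[] args) {
        System.out.println(downloadAll(Arrays.asList("a", "b", "c"), 5));
    }
}
```

The pool size passed to `downloadAll` is what bounds the number of simultaneous downloads here, which is the per-request limit the question asks about.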

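You could also create an ExecutorService backed by a fixed-size thread pool. It is likewise independent of the common fork/join pool, and it throttles the requests because only the threads in the pool are used. It also lets you specify how many threads to dedicate to the work: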

import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

ExecutorService executor = Executors.newFixedThreadPool(MAX_THREADS_FOR_DOWNLOADS);
try {
    List<CompletableFuture<Path>> downloadTasks = s3Paths
            .stream()
            .map(s3Path -> CompletableFuture.supplyAsync(() -> mys3Downloader.downloadAndGetPath(s3Path), executor))
            .collect(Collectors.toList());

    // at this point, all requests are enqueued, and threads will be assigned as they become available

    executor.shutdown();    // stops accepting requests, does not interrupt threads,
                            // items in queue will still get threads when available

    // wait for all downloads to complete
    CompletableFuture.allOf(downloadTasks.toArray(new CompletableFuture[0])).join();

    // at this point, all downloads are finished,
    // so it's safe to shut down executor completely

} catch (CompletionException e) {
    // join() throws no checked exceptions; a failed download surfaces as CompletionException
    e.printStackTrace();
} finally {
    executor.shutdownNow(); // important to call this when you're done with the executor.
}
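The snippet above waits for the downloads but does not show how to read their results, which the question also asks about. One possible way, sketched here under assumptions, is to chain a `thenApply` onto `allOf` and join each future once they are all complete. The `download` method and the class name are hypothetical stand-ins for the real S3 call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class BoundedDownloads {

    // Hypothetical stand-in for the real S3 download; sleeps to simulate network latency.
    static String download(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while downloading " + key, e);
        }
        return "content-of-" + key;
    }

    static List<String> downloadAll(List<String> keys, int maxThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(maxThreads);
        try {
            // Submit everything first so that up to maxThreads downloads run concurrently
            List<CompletableFuture<String>> tasks = keys.stream()
                    .map(key -> CompletableFuture.supplyAsync(() -> download(key), executor))
                    .collect(Collectors.toList());

            // Once every future is complete, join() on each one returns immediately
            return CompletableFuture
                    .allOf(tasks.toArray(new CompletableFuture[0]))
                    .thenApply(ignored -> tasks.stream()
                            .map(CompletableFuture::join)
                            .collect(Collectors.toList()))
                    .join();
        } finally {
            executor.shutdownNow(); // all tasks have finished (or failed) by this point
        }
    }

    public static void main(String[] args) {
        System.out.println(downloadAll(Arrays.asList("a", "b", "c"), 2));
    }
}
```

The results come back in the same order as the input keys, because the futures are collected and joined in list order regardless of which download finishes first.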

Answer 1: (score: 1)

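Following @Hank D's lead, you can encapsulate the creation of the executor service to make sure that ExecutorService::shutdownNow really is called once the executor has been used: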

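import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

private static <VALUE> VALUE execute(
  final int nThreads,
  final Function<ExecutorService, VALUE> function
) {
  final ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
  try {
    // the function must finish all of its work before returning,
    // because the pool is shut down as soon as it does
    return function.apply(executorService);
  } finally {
    executorService.shutdownNow(); // important to call this when you're done with the executor service.
  }
}

public static void main(final String... arguments) {
  // define variables
  final List<Path> downloadedPaths = execute(
    MAX_THREADS_FOR_DOWNLOADS,
    executor -> {
      // submit every download first so that they can run concurrently
      final List<CompletableFuture<Path>> downloadTasks = s3Paths
        .stream()
        .map(s3Path -> CompletableFuture.supplyAsync(
          () -> mys3Downloader.downloadAndGetPath(s3Path),
          executor
        ))
        .collect(Collectors.toList());
      // wait here, inside the function; join() rethrows a failed download as CompletionException
      return downloadTasks
        .stream()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());
    }
  );
  // use downloadedPaths
}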
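One caveat applies to any helper that calls shutdownNow in a finally block: tasks still waiting in the queue at that moment are handed back unexecuted, so their futures never complete. The futures therefore need to be joined before the helper returns. The following minimal sketch illustrates the effect with a single-thread pool and three tasks that block on a latch. The class name and the latch-based "slow download" are assumptions made for this demonstration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ShutdownNowCaveat {

    // Returns how many submitted tasks never started because the pool was shut down early.
    static int countDroppedTasks() {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        // Never counted down: simulates a download that is still in progress
        CountDownLatch stillDownloading = new CountDownLatch(1);

        List<CompletableFuture<String>> tasks = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            tasks.add(CompletableFuture.supplyAsync(() -> {
                try {
                    stillDownloading.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow interrupts the running task
                }
                return "file-" + id;
            }, executor));
        }

        // Shutting down without joining first: the single worker holds one task,
        // and the tasks still queued are returned without ever being run.
        // Calling join() on their futures afterwards would block forever.
        List<Runnable> dropped = executor.shutdownNow();
        return dropped.size();
    }

    public static void main(String[] args) {
        System.out.println("tasks that never ran: " + countDroppedTasks());
    }
}
```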