Question

I want to replicate and parallelize the following behavior with Java 8 streams:

for (Animal animal : animalList) {
        // find all other animals with the same breed
        Collection<Animal> queryResult = queryDatabase(animal.getBreed());

        if (animal.getSpecie() == cat) {
            catList.addAll(queryResult);
        } else {
            dogList.addAll(queryResult);
        }
}

This is what I have so far:

final Executor queryExecutor =
        Executors.newFixedThreadPool(Math.min(animalList.size(), 10),
                new ThreadFactory(){
                    public Thread newThread(Runnable r){
                        Thread t = new Thread(r);
                        t.setDaemon(true);
                        return t;
                    }
                });

List<CompletableFuture<Collection<Animal>>> listFutureResult =  animalList.stream()
        .map(animal -> CompletableFuture.supplyAsync(
                () -> queryDatabase(animal.getBreed()), queryExecutor))
        .collect(Collectors.toList());

List<Animal> result = listFutureResult.stream()
        .map(CompletableFuture::join)
        .flatMap(subList -> subList.stream())
        .collect(Collectors.toList());

1 - I don't know how to split the stream so that I get two different lists of animals, one for cats and one for dogs.

2 - Does this solution look reasonable?

Answer 1

First, consider just using

List<Animal> result = animalList.parallelStream()
    .flatMap(animal -> queryDatabase(animal.getBreed()).stream())
    .collect(Collectors.toList());

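even if it doesn't give you the desired concurrency of up to ten. The simplicity might make up for it. As for the other part, it is as simple as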

Map<Boolean,List<Animal>> result = animalList.parallelStream()
    .flatMap(animal -> queryDatabase(animal.getBreed()).stream())
    .collect(Collectors.partitioningBy(animal -> animal.getSpecie() == cat));
List<Animal> catList = result.get(true), dogList = result.get(false);

If you have more species than just cats and dogs, you can use Collectors.groupingBy(Animal::getSpecie) to get a map from species to list of animals.
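As a minimal sketch of the groupingBy variant, the following groups an in-memory list instead of query results. The Animal class, the Specie enum and the bySpecie helper are hypothetical stand-ins and not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupBySpeciesSketch {
    // Hypothetical species type; the question only mentions cats and dogs.
    enum Specie { CAT, DOG, BIRD }

    // Hypothetical stand-in for the Animal type from the question.
    static final class Animal {
        private final String name;
        private final Specie specie;

        Animal(String name, Specie specie) {
            this.name = name;
            this.specie = specie;
        }

        String getName() { return name; }
        Specie getSpecie() { return specie; }
    }

    // Groups animals by species; the map holds one entry per species that occurs.
    static Map<Specie, List<Animal>> bySpecie(List<Animal> animals) {
        return animals.stream()
            .collect(Collectors.groupingBy(Animal::getSpecie));
    }

    public static void main(String[] args) {
        List<Animal> animals = Arrays.asList(
            new Animal("Tom", Specie.CAT),
            new Animal("Rex", Specie.DOG),
            new Animal("Felix", Specie.CAT));
        Map<Specie, List<Animal>> grouped = bySpecie(animals);
        System.out.println("cats: " + grouped.get(Specie.CAT).size());
    }
}
```

Note one difference from partitioningBy: the map returned by groupingBy has no entry for a species that never occurs, so get may return null there, whereas partitioningBy always contains both the true and the false key.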

If you insist on using your own thread pool, there are a few things that can be improved:

Executor queryExecutor = Executors.newFixedThreadPool(Math.min(animalList.size(), 10),
    r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
List<Animal> result =  animalList.stream()
    .map(animal -> CompletableFuture.completedFuture(animal.getBreed())
        .thenApplyAsync(breed -> queryDatabase(breed), queryExecutor))
    .collect(Collectors.toList()).stream()
    .flatMap(cf -> cf.join().stream())
    .collect(Collectors.toList());

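Your supplyAsync variant has to capture the actual Animal instance, creating a new Supplier for each animal. In contrast, the function passed to thenApplyAsync is invariant and performs the same operation for each argument value. The code above assumes that getBreed is a cheap operation; otherwise, it would not be hard to pass the Animal instance to completedFuture and perform getBreed() within the asynchronous function as well.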
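If getBreed were expensive, the variant just described might look roughly like the sketch below. The Animal class and the queryDatabase stub are hypothetical placeholders for the real types, and queryAll is a name made up for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class AsyncBreedSketch {
    // Hypothetical stand-in for the Animal type from the question.
    static final class Animal {
        private final String breed;

        Animal(String breed) { this.breed = breed; }

        String getBreed() { return breed; }
    }

    // Hypothetical stub for the database query: two animals per breed.
    static Collection<Animal> queryDatabase(String breed) {
        return Arrays.asList(new Animal(breed), new Animal(breed));
    }

    static List<Animal> queryAll(List<Animal> animalList, Executor queryExecutor) {
        return animalList.stream()
            // Pass the Animal itself, so getBreed() also runs on the executor.
            .map(animal -> CompletableFuture.completedFuture(animal)
                .thenApplyAsync(a -> queryDatabase(a.getBreed()), queryExecutor))
            // Collect first so that all queries are submitted before any join.
            .collect(Collectors.toList()).stream()
            .flatMap(cf -> cf.join().stream())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService queryExecutor = Executors.newFixedThreadPool(2);
        try {
            List<Animal> result = queryAll(
                Arrays.asList(new Animal("siamese"), new Animal("beagle")),
                queryExecutor);
            System.out.println(result.size());
        } finally {
            queryExecutor.shutdown();
        }
    }
}
```

The function passed to thenApplyAsync still captures nothing from the individual animal; it only receives a different argument each time.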

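The .map(CompletableFuture::join) step can be replaced by a simple chained .join() inside the flatMap function. Otherwise, if you prefer method references, you should use them consistently, i.e. .map(CompletableFuture::join).flatMap(Collection::stream).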
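The two styles can be compared side by side in a small sketch. It uses already completed futures of strings instead of database results, purely for illustration; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class JoinStyleSketch {
    // Style 1: a single lambda that joins and flattens in one step.
    static List<String> withLambda(List<CompletableFuture<Collection<String>>> futures) {
        return futures.stream()
            .flatMap(cf -> cf.join().stream())
            .collect(Collectors.toList());
    }

    // Style 2: method references used consistently for both steps.
    static List<String> withMethodRefs(List<CompletableFuture<Collection<String>>> futures) {
        return futures.stream()
            .map(CompletableFuture::join)
            .flatMap(Collection::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<CompletableFuture<Collection<String>>> futures = new ArrayList<>();
        futures.add(CompletableFuture.<Collection<String>>completedFuture(Arrays.asList("a", "b")));
        futures.add(CompletableFuture.<Collection<String>>completedFuture(Arrays.asList("c")));
        // Both styles produce the same list.
        System.out.println(withLambda(futures).equals(withMethodRefs(futures)));
    }
}
```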

Of course, this variant also allows using partitioningBy instead of toList.

Finally, note that if you call shutdown on the executor service after use, there is no need to mark the threads as daemon threads:

ExecutorService queryExecutor = Executors.newFixedThreadPool(Math.min(animalList.size(), 10));
Map<Boolean,List<Animal>> result =  animalList.stream()
    .map(animal -> CompletableFuture.completedFuture(animal.getBreed())
        .thenApplyAsync(breed -> queryDatabase(breed), queryExecutor))
    .collect(Collectors.toList()).stream()
    .flatMap(cf -> cf.join().stream())
    .collect(Collectors.partitioningBy(animal -> animal.getSpecie() == cat));
List<Animal> catList = result.get(true), dogList = result.get(false);
queryExecutor.shutdown();
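One possible refinement, sketched here under assumptions: if a query fails, join throws and a trailing shutdown call is never reached, which leaves the non-daemon pool threads alive. Wrapping the pipeline in try/finally avoids that. The sketch also guards against an empty list, because newFixedThreadPool rejects a size of zero. Animal, Specie, queryDatabase and queryPartitioned are hypothetical stand-ins, and the stub decides the species from the breed name only to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ShutdownSketch {
    enum Specie { CAT, DOG }

    // Hypothetical stand-in for the Animal type from the question.
    static final class Animal {
        private final String breed;
        private final Specie specie;

        Animal(String breed, Specie specie) {
            this.breed = breed;
            this.specie = specie;
        }

        String getBreed() { return breed; }
        Specie getSpecie() { return specie; }
    }

    // Hypothetical stub for the database query: two animals per breed.
    static Collection<Animal> queryDatabase(String breed) {
        Specie specie = breed.equals("siamese") ? Specie.CAT : Specie.DOG;
        return Arrays.asList(new Animal(breed, specie), new Animal(breed, specie));
    }

    static Map<Boolean, List<Animal>> queryPartitioned(List<Animal> animalList) {
        // newFixedThreadPool throws IllegalArgumentException for a size of 0.
        int poolSize = Math.max(1, Math.min(animalList.size(), 10));
        ExecutorService queryExecutor = Executors.newFixedThreadPool(poolSize);
        try {
            return animalList.stream()
                .map(animal -> CompletableFuture.completedFuture(animal.getBreed())
                    .thenApplyAsync(breed -> queryDatabase(breed), queryExecutor))
                .collect(Collectors.toList()).stream()
                .flatMap(cf -> cf.join().stream())
                .collect(Collectors.partitioningBy(a -> a.getSpecie() == Specie.CAT));
        } finally {
            // Runs even if join() throws, so the pool threads can terminate.
            queryExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Boolean, List<Animal>> result = queryPartitioned(Arrays.asList(
            new Animal("siamese", Specie.CAT), new Animal("beagle", Specie.DOG)));
        List<Animal> catList = result.get(true), dogList = result.get(false);
        System.out.println(catList.size() + " cats, " + dogList.size() + " dogs");
    }
}
```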

Parallel database calls with Java 8 streams and CompletableFuture
