Filtering out duplicates from CompletableFutures

Asked: 2018-05-22 09:19:48

Tags: java java-stream distinct completable-future

I want to filter out duplicates after the first CompletableFuture stage and then invoke a second stage using another CompletableFuture. What I tried:

@FunctionalInterface
public interface FunctionWithExceptions<T, R, E extends Exception> {
    R process(T t) throws E;
}


public static <T> Predicate<T> distinctByKey(FunctionWithExceptions<? super T, ?, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> {
        String key = "";
        try {
            key = (String) keyExtractor.process(t);
        } catch (Exception e) {
            log.info("Get instanceIp failed!");
        }
        return seen.add(key);
    };
}

List<CompletableFuture<InstanceDo>> instanceFutures = podNames.stream()
            .map(podName -> CompletableFuture.supplyAsync(RethrowExceptionUtil.rethrowSupplier(() -> {
                PodDo podDo = getPodRetriever().getPod(envId, podName);
                podDoList.add(podDo);
                return podDo;
            }), executor))
            .map(future -> future.thenApply(podDo -> podDo.getInstanceName()))
            .filter(distinctByKey(CompletableFuture::get))
            .map(future -> future.thenCompose(instanceName ->
                    CompletableFuture.supplyAsync(() -> get(envId, instanceName), executor)))
            .collect(Collectors.toList());

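As you can see, distinctByKey calls get, which blocks inside the stream pipeline and so turns the concurrent execution into a sequential one.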
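To make the effect concrete, here is a minimal sketch of why blocking inside filter serializes the work. A sequential stream is lazy and pushes one element through the whole pipeline before it touches the next, so the next future is not even created until the previous one has been joined. The class name LazyStreamBlocking and the trace() helper are made up for this illustration, and the trivial supplier stands in for the real pod lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class LazyStreamBlocking {
    // Records the order in which futures are submitted and joined.
    static List<String> trace() {
        // Only the calling thread writes to this list, so a plain ArrayList is enough.
        List<String> events = new ArrayList<>();
        IntStream.range(0, 3)
                .mapToObj(i -> {
                    events.add("submit " + i);
                    // Hypothetical stand-in for the first-stage task.
                    return CompletableFuture.supplyAsync(() -> i % 2);
                })
                .filter(future -> {
                    events.add("join");
                    future.join(); // blocks before the next element is mapped
                    return true;
                })
                .collect(Collectors.toList());
        return events;
    }

    public static void main(String[] args) {
        // prints [submit 0, join, submit 1, join, submit 2, join]
        System.out.println(trace());
    }
}
```

Each "submit" is immediately followed by a "join", so at most one first-stage task is in flight at any time.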

What should I do to make it concurrent again while keeping the distinct behavior?

Or do I have only one option?

Wait for the whole first stage to complete, and then start the second stage?
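For reference, that "wait for the whole first stage" option could look roughly like the sketch below. It is only an illustration: stage1 and stage2 are hypothetical stand-ins for getPod(...).getInstanceName() and get(envId, instanceName), and the default common pool is used instead of the executor from the question.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TwoPhaseDistinct {
    // Counts second-stage invocations so the effect of deduplication is visible.
    static final AtomicInteger stage2Calls = new AtomicInteger();

    // Hypothetical first stage: maps an input to a key that contains duplicates.
    static int stage1(int i) {
        return i % 3;
    }

    // Hypothetical expensive second stage.
    static int stage2(int key) {
        stage2Calls.incrementAndGet();
        return key * 10;
    }

    static List<Integer> run() {
        // Collecting first forces every first-stage task to be submitted
        // before any of them is joined.
        List<CompletableFuture<Integer>> first = IntStream.range(0, 5)
                .mapToObj(i -> CompletableFuture.supplyAsync(() -> stage1(i)))
                .collect(Collectors.toList());

        // Wait for all first-stage results, then drop duplicates.
        List<Integer> keys = first.stream()
                .map(CompletableFuture::join)
                .distinct()
                .collect(Collectors.toList());

        // Submit the second stage once per distinct key, again all at once.
        List<CompletableFuture<Integer>> second = keys.stream()
                .map(key -> CompletableFuture.supplyAsync(() -> stage2(key)))
                .collect(Collectors.toList());

        return second.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run());             // prints [0, 10, 20]
        System.out.println(stage2Calls.get()); // prints 3
    }
}
```

Both stages run concurrently within themselves, but no second-stage task can start until the slowest first-stage task has finished, which is the cost of this option.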

2 answers:

Answer 0: (score: 0)

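I just wrote a simple demo to solve this kind of problem, but I don't really know whether it is reliable. At least it makes sure that the second stage can be sped up by using Set<Object> seen = ConcurrentHashMap.newKeySet();.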

public static void main(String... args) throws ExecutionException, InterruptedException {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        List<CompletableFuture<Integer>> intFutures = Stream.iterate(0, i -> i+1)
                .limit(5)
                .map(i -> CompletableFuture.supplyAsync(() -> {
                    int a = runStage1(i);
                    if (seen.add(a)) {
                        return a;
                    } else {
                        return -1;
                    }}))
                .map(future -> future.thenCompose(i -> CompletableFuture.supplyAsync(() -> {
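                    // -1 marks a duplicate. Note that i > 0 also skips stage 2 for a
                    // valid stage-1 result of 0; i >= 0 would include it.
                    if (i > 0) {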
                        return runStage2(i);
                    } else {
                        return i;
                    }})))
                .collect(Collectors.toList());
        List<Integer> resultList = new ArrayList<>();
        try {
            for (CompletableFuture<Integer> future: intFutures) {
                resultList.add(future.join());
            }
        } catch (Exception ignored) {
            ignored.printStackTrace();
            out.println("Future failed!");
        }
        resultList.stream().forEach(out::println);
    }

    private static Integer runStage1(int a) {
        out.println("stage - 1: " + a);
        try {
            // nextInt(1000) is always in [0, 1000); Math.abs(nextInt()) can be negative for Integer.MIN_VALUE
            Thread.sleep(500 + new Random().nextInt(1000));
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return Integer.valueOf(a % 3);
    }

    private static Integer runStage2(int b) {
        out.println("stage - 2: " + b);
        try {
            // nextInt(1000) is always in [0, 1000); Math.abs(nextInt()) can be negative for Integer.MIN_VALUE
            Thread.sleep(200 + new Random().nextInt(1000));
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        return Integer.valueOf(b);
    }

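By returning a special value (-1) from the first stage whenever the value is a duplicate, and then checking for that special value in the second stage, I can skip the time-consuming second-stage computation.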

The output does indeed filter out some of the redundant second-stage computations.

stage - 1: 0
stage - 1: 1
stage - 1: 2
stage - 1: 3
stage - 2: 2
stage - 2: 1
stage - 1: 4
0
1
2
-1
-1

I don't think this is a good solution, though. What could I optimize to make it better?

Answer 1: (score: 0)

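A small improvement over your submitted answer could be to use a ConcurrentHashMap as a kind of cache, so that your final list contains the same results regardless of the order in which the first-stage results arrive: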

Map<Integer, CompletableFuture<Integer>> seen = new ConcurrentHashMap<>();
List<CompletableFuture<Integer>> intFutures = Stream.iterate(0, i -> i + 1)
        .limit(5)
        .map(i -> CompletableFuture.supplyAsync(() -> runStage1(i)))
        .map(cf -> cf.thenCompose(result ->
                seen.computeIfAbsent(
                        result, res -> CompletableFuture.supplyAsync(() -> runStage2(res))
                )
        ))
        .collect(Collectors.toList());

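Note that the function passed to computeIfAbsent() must return immediately (for example by using supplyAsync()), because it holds a lock inside the map while it executes. In addition, this function must not try to modify the seen map, since that could cause issues.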

With this change, the output could be, for example:

stage - 1: 1
stage - 1: 0
stage - 1: 2
stage - 2: 1
stage - 2: 2
stage - 1: 3
stage - 2: 0
stage - 1: 4
0
1
2
0
1

Additionally, this makes it possible to inspect the seen map after all futures have completed, in order to get the unique results.
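A minimal sketch of that last step, assuming simplified stand-ins for the two stages (stage1, stage2 and the class name CachedSecondStage are hypothetical, and the sleeps and printing from the demo are left out):

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CachedSecondStage {
    // Counts second-stage invocations to show that each key is computed once.
    static final AtomicInteger stage2Calls = new AtomicInteger();

    // Hypothetical first stage producing duplicate keys.
    static int stage1(int i) {
        return i % 3;
    }

    // Hypothetical expensive second stage.
    static int stage2(int key) {
        stage2Calls.incrementAndGet();
        return key * 10;
    }

    static Map<Integer, Integer> uniqueResults() {
        Map<Integer, CompletableFuture<Integer>> seen = new ConcurrentHashMap<>();
        List<CompletableFuture<Integer>> all = IntStream.range(0, 5)
                .mapToObj(i -> CompletableFuture.supplyAsync(() -> stage1(i)))
                // The mapping function only schedules work and returns at once.
                .map(cf -> cf.thenCompose(key -> seen.computeIfAbsent(
                        key, k -> CompletableFuture.supplyAsync(() -> stage2(k)))))
                .collect(Collectors.toList());

        // Wait until every composed future, and so every cached one, is done.
        CompletableFuture.allOf(all.toArray(new CompletableFuture[0])).join();

        // Copy into a sorted map so the keys come back in a predictable order.
        Map<Integer, Integer> unique = new TreeMap<>();
        seen.forEach((key, future) -> unique.put(key, future.join()));
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(uniqueResults());   // prints {0=0, 1=10, 2=20}
        System.out.println(stage2Calls.get()); // prints 3
    }
}
```

Every future stored in seen is the one that some composed future waits on, so once allOf has completed, the join calls on the cached futures return without blocking.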