Combining groupBy and flatMap(maxConcurrent, ...) in RxJava / RxScala

Time: 2017-01-17 09:10:26

Tags: scala rx-java reactive-programming rx-scala

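I have incoming processing requests, and I don't want too many of them processed at the same time because they would exhaust a shared resource. I also want requests that share the same unique key never to execute concurrently: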

def process(request: Request): Observable[Answer] = ???

requestsStream
  .groupBy(request => request.key)
  .flatMap(maxConcurrentProcessing, { case (key, requestsForKey) => 
      requestsForKey
         .flatMap(1, process)
  })

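However, the above does not work because the per-key observables never complete: each group keeps its flatMap slot for as long as the source is running, so once maxConcurrentProcessing distinct keys have been seen, requests for any new key are never processed. What is the correct way to achieve this?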

What doesn't work:

  .flatMap(maxConcurrentProcessing, { case (key, requestsForKey) => 
      // Take(1) unsubscribes after the first, causing groupBy to create a new observable, causing the next request to execute concurrently
      requestsForKey.take(1)
         .flatMap(1, process)
  })

  .flatMap(maxConcurrentProcessing, { case (key, requestsForKey) =>
      // The idea was to unsubscribe after 100 milliseconds to "free up" maxConcurrentProcessing
      // This discards all requests after the first if processing takes more than 100 milliseconds
      requestsForKey.timeout(100.millis, Observable.empty)
         .flatMap(1, process)
  })

1 answer:

Answer 0 (score: 1)

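Here is how I achieved this. For each unique key I assign a dedicated single-threaded scheduler, so that messages with the same key are processed sequentially: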

@Test
public void groupBy() throws InterruptedException {
    final int NUM_GROUPS = 10;
    Observable.interval(1, TimeUnit.MILLISECONDS)
            .map(v -> {
                logger.info("received {}", v);
                return v;
            })
            .groupBy(v -> v % NUM_GROUPS)
            .flatMap(grouped -> {
                long key = grouped.getKey();
                logger.info("selecting scheduler for key {}", key);
                return grouped
                        .observeOn(assignScheduler(key))
                        .map(v -> {
                            String threadName = Thread.currentThread().getName();
                            Assert.assertEquals("proc-" + key, threadName);
                            logger.info("processing {} on {}", v, threadName);
                            return v;
                        })
                        .observeOn(Schedulers.single()); // move results off the per-key thread onto a shared scheduler
            })
            .subscribe(v -> logger.info("got {}", v));

    Thread.sleep(1000);
}

In my case the number of keys (NUM_GROUPS) is small, so I created a dedicated scheduler for each key:

Scheduler assignScheduler(long key) {
    return Schedulers.from(Executors.newSingleThreadExecutor(
        r -> new Thread(r, "proc-" + key)));
}

If the number of keys is unbounded, or too large to dedicate a thread to each key, you can create a pool of schedulers and reuse them:

Scheduler assignScheduler(long key) {
    // assign randomly
    return poolOfSchedulers[random.nextInt(SIZE_OF_POOL)];
}
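
As a variation on the pooled approach, the slot can be derived from the key itself instead of chosen at random, so a given key maps to the same single-threaded executor even if its group is recreated. The following is only a minimal sketch using plain java.util.concurrent; the class name KeyedExecutorPool and its methods are hypothetical and not part of the original answer. Each executor could presumably be wrapped with Schedulers.from(...) in the same way as the assignScheduler above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a fixed pool of single-threaded executors, selected by key.
public class KeyedExecutorPool {
    private final ExecutorService[] pool;

    public KeyedExecutorPool(int size) {
        pool = new ExecutorService[size];
        for (int i = 0; i < size; i++) {
            final String name = "proc-" + i;
            // One thread per slot, so tasks submitted to a slot run one at a time, in order.
            pool[i] = Executors.newSingleThreadExecutor(r -> new Thread(r, name));
        }
    }

    // Deterministic mapping: the same key always lands in the same slot.
    public int slotFor(long key) {
        return Math.floorMod(Long.hashCode(key), pool.length);
    }

    public ExecutorService executorFor(long key) {
        return pool[slotFor(key)];
    }

    public void shutdown() throws InterruptedException {
        for (ExecutorService e : pool) {
            e.shutdown();
        }
        for (ExecutorService e : pool) {
            e.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        KeyedExecutorPool pool = new KeyedExecutorPool(4);
        Map<Long, String> threadByKey = new ConcurrentHashMap<>();
        for (long v = 0; v < 100; v++) {
            final long key = v % 10;
            pool.executorFor(key).execute(() -> {
                // Record which thread handled each key; it should never change.
                String thread = Thread.currentThread().getName();
                String previous = threadByKey.putIfAbsent(key, thread);
                if (previous != null && !previous.equals(thread)) {
                    throw new IllegalStateException("key " + key + " moved threads");
                }
            });
        }
        pool.shutdown();
        System.out.println(threadByKey);
    }
}
```

With a fixed pool, at most as many tasks run at once as there are slots, which also bounds overall concurrency as long as the processing itself runs synchronously on those threads. The trade-off is that unrelated keys sharing a slot are serialized behind each other.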