Question

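I have a list of 500,000 elements and a queue with 20 consumers. Messages are processed at very different speeds (1, 15, 30, 60 seconds; 3, 50 minutes; 3, 16 hours or longer; 24 hours is the timeout). I need the consumers' responses in order to do some processing on the data. I plan to use Scala Futures for this, with the event-based onComplete.

To avoid flooding the queue, I want to send 30 messages to the queue first: the consumers will pick up 20 of them, and 10 will wait in the queue. Whenever one of the Futures completes, I want to send another message to the queue. Can you tell me how to achieve this? Can it be done with Akka Streams?

This code is wrong; it is only meant to show what I want: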

private def sendMessage(ids: List[String]): Unit = {
  val id = ids.head

  val futureResult = Future {
    //send id among some message to the queue
  }.map { result =>
    //process the response
  }

  futureResult.onComplete { _ =>
    sendMessage(ids.tail)
  }
}

def migrateAll(): Unit = {
  val ids: List[String] = //get IDs from the DB

  sendMessage(ids)
}

Answer 1

Here is the code I use for this kind of task:

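import java.util.concurrent.Semaphore
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.control.NonFatal

class RateLimiter(semaphore: Semaphore) {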
  def runBlocking[T](action: => Future[T]): Future[T] = {
    semaphore.acquire()
    val started = try {
      action
    }
    catch {
      case NonFatal(th) => {
        semaphore.release()
        throw th
      }
    }

    started.andThen {
      case _ => semaphore.release()
    }(ExecutionContext.Implicits.global)
  }
}

val rateLimiter = new RateLimiter(new Semaphore(20))
val tasks = (1 to 100)
val futures: Seq[Future[Int]] = tasks.map(i => rateLimiter.runBlocking(Future{
    i * 2
  }(ExecutionContext.Implicits.global)))
futures.foreach(f => Await.result(f, Duration.Inf))

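It is not perfect: it blocks in two places (on the semaphore and in Await), and it keeps all the futures in memory (which can be avoided).

But it works in production :)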
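The two blocking points mentioned above can be avoided with plain Futures. The following is only a minimal sketch, assuming Scala 2.13 (for scala.jdk.CollectionConverters); BoundedFutures and runBounded are hypothetical names, not part of any library. It starts a fixed number of "lanes", and each lane picks up the next item only after its previous Future has completed:

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{ExecutionContext, Future}
import scala.jdk.CollectionConverters._

object BoundedFutures {
  // Runs f over all items with at most `parallelism` Futures in flight.
  // No thread is blocked: each lane chains its next item with flatMap.
  // Results are collected in completion order, not input order.
  // A failed Future stops its lane and fails the combined result.
  // Note: an exception thrown synchronously by f on a lane's first item
  // escapes runBounded instead of becoming a failed Future.
  def runBounded[A, B](items: Seq[A], parallelism: Int)(f: A => Future[B])(
      implicit ec: ExecutionContext): Future[Seq[B]] = {
    val pending = new ConcurrentLinkedQueue[A](items.asJava)
    val results = new ConcurrentLinkedQueue[B]()

    // One lane: take the next item, run it, then recurse asynchronously.
    def lane(): Future[Unit] = Option(pending.poll()) match {
      case None       => Future.unit // nothing left to do
      case Some(item) => f(item).flatMap { r => results.add(r); lane() }
    }

    Future
      .sequence(Seq.fill(parallelism)(lane()))
      .map(_ => results.asScala.toSeq)
  }
}
```

For the numbers in the question, this would be called as BoundedFutures.runBounded(ids, 30)(sendAndAwaitReply), where sendAndAwaitReply stands for whatever function sends one id to the queue and returns a Future of the response. Collecting 500,000 results in memory may not be desirable; in that case the results queue could be replaced by a side effect inside the lane.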

Answer 2

Below is a simple Akka Streams example that models your use case.

Let's define the processing as a method that takes a String and returns a Future[String]:

def process(id: String): Future[String] = ???
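How process obtains its Future depends on the queue client, which the question does not show. One possible shape, purely as a sketch: keep a Promise per outstanding id and complete it when the consumer's reply arrives. Everything here is an assumption for illustration: PendingReplies, the send callback, and onReply are hypothetical names, and the 24-hour timeout from the question is not handled.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Future, Promise}
import scala.util.control.NonFatal

object PendingReplies {
  // Outstanding requests, keyed by id (assumes ids are unique).
  private val pending = TrieMap.empty[String, Promise[String]]

  // `send` is a hypothetical function that publishes the id to the queue.
  // The returned Future completes when onReply is called for the same id.
  def process(id: String)(send: String => Unit): Future[String] = {
    val promise = Promise[String]()
    pending.put(id, promise)
    try send(id)
    catch {
      case NonFatal(e) =>
        // Publishing failed: forget the request and fail its Future.
        pending.remove(id)
        promise.tryFailure(e)
        ()
    }
    promise.future
  }

  // To be called by the (hypothetical) reply listener of the queue client.
  // Returns false if no request with this id is outstanding.
  def onReply(id: String, body: String): Boolean =
    pending.remove(id).exists(_.trySuccess(body))
}
```

With a helper like this, the process method used in the stream would be a partial application such as id => PendingReplies.process(id)(publishToQueue), where publishToQueue is again a placeholder for the real client call.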

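Then we create a Source from a List of 500,000 String elements and use mapAsync to feed the elements to the processing method. The parallelism is set to 20, which means that no more than 20 Futures are in flight at any point in time. As each Future completes, we perform additional processing and print the result: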

Source((1 to 500000).map(_.toString).toList)
  .mapAsync(parallelism = 20)(process)
  // do something with the result of the Future; here we create a new string
  //   that begins with "Processed: "
  .map(s => s"Processed: $s")
  .runForeach(println)

You can read more about mapAsync in the documentation.
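One caveat for this particular workload: mapAsync emits results in input order, so a single element that takes hours holds back the results of faster elements behind it, and once 20 elements are started or waiting to be emitted, no new ones are pulled. Since processing times here range from seconds to hours, mapAsyncUnordered may be the better fit. A self-contained sketch, assuming Akka 2.6 or later (where an implicit ActorSystem is enough to run a stream); the body of process is a stand-in, and MigrateIds is a hypothetical name:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import scala.concurrent.Future

object MigrateIds extends App {
  implicit val system: ActorSystem = ActorSystem("migration")
  import system.dispatcher // ExecutionContext for the Futures below

  // Stand-in only: the real method would send the id to the queue
  // and complete when the consumer responds.
  def process(id: String): Future[String] = Future(id)

  val ids: List[String] = (1 to 100).map(_.toString).toList

  val done = Source(ids)
    // Up to 30 Futures in flight (20 consumers + 10 queued, as in the
    // question); results are emitted as they complete, not in input order.
    .mapAsyncUnordered(parallelism = 30)(process)
    .map(s => s"Processed: $s")
    .runForeach(println)

  // Shut the actor system down once the stream has finished or failed.
  done.onComplete(_ => system.terminate())
}
```

If the order of results matters, mapAsync can be kept, at the cost of the head-of-line waiting described above.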

Wait for a Scala Future to complete and proceed to the next one
