Question

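I am using Akka actors in Scala to download resources from an external service (HTTP GET requests). The service responds with JSON, and I have to use paging because the provider is very slow. I want to download all paged results concurrently in 10 threads. I download each chunk with a URL like this: http://service.com/items?limit=50&offset=1000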

I have created the following pipeline:

ScatterActor => RoundRobinPool[10](LoadChunkActor) => Aggregator

ScatterActor gets the total number of items to download and splits it into chunks. I created 10 LoadChunkActors to process the tasks concurrently.

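  override def receive: Receive = {
    case LoadMessage(limit) =>
      // One offset per chunk: 0, chunkSize, 2 * chunkSize, ... below limit
      val offsets: IndexedSeq[Int] = 0 until limit by chunkSize
      // Send one LoadMessage per chunk to the worker pool
      offsets.foreach(offset => context.system.actorSelection(pipe) ! LoadMessage(chunkSize, offset))
  }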
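
To make the chunking step concrete, here is a minimal sketch using only the Scala standard library. `chunks` is a hypothetical helper, not part of the code above; unlike the snippet above, it also shrinks the last chunk so that the final request does not ask for more items than remain.

```scala
// Hypothetical helper: split `total` items into (offset, limit) pairs
// of at most `chunkSize` items each. Assumes chunkSize > 0.
def chunks(total: Int, chunkSize: Int): IndexedSeq[(Int, Int)] =
  (0 until total by chunkSize).map { offset =>
    (offset, math.min(chunkSize, total - offset))
  }

// 230 items in chunks of 50:
// (0,50), (50,50), (100,50), (150,50), (200,30)
val example = chunks(230, 50)
```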

LoadChunkActor uses Spray to send the request. The actor looks like this:

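// spray-client pipeline: send the request, then unmarshal the JSON body
val pipeline = sendReceive ~> unmarshal[List[Items]]

override def receive: Receive = {
  case LoadMessage(limit, offset) =>
    val uri: String = s"http://service.com/items?limit=50&offset=$offset"
    // Non-blocking: returns a Future immediately
    val responseFuture = pipeline { Get(uri) }
    // Note: Failure is not matched here, so a failed request never reaches the aggregator
    responseFuture onComplete {
      case Success(items) => aggregator ! Loaded(items)
    }
}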

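As you can see, LoadChunkActor requests a chunk from the external service and registers a callback to run on onComplete. The actor is then immediately ready to receive another message, so it requests another chunk right away. Spray uses a non-blocking API to download the chunks. As a result, the external service is flooded with my requests and I get timeouts.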

How can I schedule the full list of tasks while processing at most 10 of them at the same time?

Answer 1

I created the following solution (similar to the work pulling pattern, http://www.michaelpollmeier.com/akka-work-pulling-pattern/):

ScatterActor (10000x messages) => 
  ThrottleActor => LoadChunkActor => ThrottleMonitorActor => Aggregator
         ^                                    |
         |<--------WorkDoneMessage------------|

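ThrottleActor collects incoming messages in a ListBuffer and forwards them to LoadChunkActor, keeping at most N messages in flight.
When LoadChunkActor is done, it sends its result to Aggregator through ThrottleMonitorActor.
ThrottleMonitorActor then sends an acknowledgement (WorkDoneMessage) to ThrottleActor.
On each acknowledgement, ThrottleActor sends the next buffered message to LoadChunkActor.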
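
The bookkeeping inside ThrottleActor can be sketched without any Akka code, as a small immutable state machine. This is only an illustration under assumptions: the names `Throttle`, `offer`, `workDone` and `maxInFlight` are hypothetical, and an immutable `Queue` stands in for the ListBuffer mentioned above. In a real actor, `offer` would run when a work message arrives, `workDone` when a WorkDoneMessage arrives, and the returned items would be sent to LoadChunkActor.

```scala
import scala.collection.immutable.Queue

// Hypothetical pure model of the throttle state: buffered items plus
// the number of items currently being processed by workers.
final case class Throttle[A](
    maxInFlight: Int,
    pending: Queue[A] = Queue.empty[A],
    inFlight: Int = 0
) {

  // New work arrives: dispatch it if a slot is free, otherwise buffer it.
  // Returns the next state and the items to forward to the worker now.
  def offer(item: A): (Throttle[A], List[A]) =
    if (inFlight < maxInFlight) (copy(inFlight = inFlight + 1), List(item))
    else (copy(pending = pending.enqueue(item)), Nil)

  // A worker reports completion: reuse the freed slot for the next
  // buffered item, or release the slot if nothing is waiting.
  def workDone: (Throttle[A], List[A]) =
    pending.dequeueOption match {
      case Some((next, rest)) => (copy(pending = rest), List(next))
      case None               => (copy(inFlight = math.max(0, inFlight - 1)), Nil)
    }
}
```

With `maxInFlight = 10`, the first 10 offered chunks are dispatched at once and every later chunk waits until a WorkDoneMessage frees a slot, so the external service should never see more than 10 concurrent requests.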

Answer 2

The project adhoclabs/akka-http-contrib now (July 2016, two years later) provides the scala.co.adhoclabs.akka.http.contrib.throttle package, by Yeghishe Piruzyan.

See "Akka Http Request Throttling":

implicit val throttleSettings = MetricThrottleSettings.fromConfig

Http().bindAndHandle(
  throttle.apply(routes),
  httpInterface,
  httpPort
)

Throttling HTTP requests on Akka / Spray
