Question

I have a script that makes a large number of web requests (~300,000). It looks something like this:

// Setup a new wsClient
val config = new NingAsyncHttpClientConfigBuilder(DefaultWSClientConfig()).build
val builder = new AsyncHttpClientConfig.Builder(config)
val wsClient = new NingWSClient(builder.build)

// Each of these use the wsClient
def getAs: Future[Seq[A]] = { ... }
def getBs: Future[Seq[B]] = { ... }
def getCs: Future[Seq[C]] = { ... }
def getDs: Future[Seq[D]] = { ... }

(for {
    as <- getAs
    bs <- getBs
    cs <- getCs
    ds <- getDs
} yield (as, bs, cs, ds)).map(tuple => println("done"))

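The problem is that I run into a Too many open files exception, because each function asynchronously fires off thousands of requests, and each request uses a file descriptor.

I tried reorganizing my functions so that each one works in batches, with a separate client for every batch: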

def getAs: Future[Seq[A]] = {
    someCollection.group(1000).map(batch => {
        val client = new NingWSClient(builder.build) // Make a new client for every batch
        Future.sequence(batch.map(thing => {
            wsClient.url(...).map(...)
        })).map(things => {
            wsClient.close // Close the client
            things
        })
    })
}

But this causes the for-comprehension to end early (without any error message or exception):

(for {
    as <- getAs
    bs <- getBs // This doesn't happen
    cs <- getCs // Or any of the following ones
    ds <- getDs
} yield (as, bs, cs, ds)).map(tuple => println("done"))

I'm just looking for the right way to make a large number of HTTP requests without opening too many file descriptors.

Answer 1

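I had a similar problem: too many requests to one web service (~500+). Your code example with grouping is almost correct; however, you'll get an Iterator[Future[List[Int]]] or, if you Future.sequence it, a Future[Iterator[List[Int]]]. Either way, I think all of those futures will still start running right away, so nothing is actually throttled. You need to fire the first batch, then flatMap it (which waits until it finishes), and only then fire the next batch. This is what I managed to write, following this answer: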

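import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// `list: List[Int]` and `getFuture: Int => Future[Int]` are assumed to be defined elsewhere.
val futureIterator = list.grouped(50).foldLeft(Future.successful[List[Int]](Nil)) {
  (fItems, items) =>
    // Wait for everything processed so far before spawning the next batch.
    fItems flatMap { processed =>
      println("PROCESSED: " + processed); println("SPAWNED: " + items)
      Future.traverse(items)(getFuture) map (res => processed ::: res)
    }
}
println(Await.result(futureIterator, Duration.Inf))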
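
The batching fold above can be pulled out into a small reusable helper. What follows is only a minimal sketch using the standard library: the names `BatchedRequests`, `inBatches`, and `fakeRequest` are made up for illustration, and `fakeRequest` is a stand-in for a real call such as `wsClient.url(...).get()`.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object BatchedRequests {
  // Applies `f` to `items` in batches of `batchSize`.
  // A batch is started only after the previous batch has fully completed,
  // so at most `batchSize` futures created by `f` are in flight at once.
  def inBatches[A, B](items: Seq[A], batchSize: Int)(f: A => Future[B]): Future[Vector[B]] =
    items.grouped(batchSize).foldLeft(Future.successful(Vector.empty[B])) { (acc, batch) =>
      acc.flatMap { done =>
        // Fire the whole batch concurrently, then append its results.
        Future.traverse(batch)(f).map(done ++ _)
      }
    }

  // Hypothetical stand-in for a real HTTP call (an assumption, not part of play-ws).
  def fakeRequest(id: Int): Future[Int] = Future(id * 2)
}
```

For example, `BatchedRequests.inBatches(1 to 10, 3)(BatchedRequests.fakeRequest)` would run four batches (3, 3, 3 and 1 items) one after another and keep the results in input order. Note that if any request fails, the resulting future fails and later batches are never started.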

Hope this helps!

Answer 2

You can use Octoparts:

https://m3dev.github.io/octoparts/

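But it sounds like you really want to reverse the pattern, so that a single wsClient sits on the outside making the calls, and you then flatMap over the Future[WSResponse] that comes back. This limits the number of futures to the internal Netty thread pool that AsyncHttpClient uses, and you can change the configuration settings to increase or decrease the number of threads in the Netty channel pool.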
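
One possible reading of this advice, sketched with plain Scala futures rather than the actual play-ws API: keep one shared client, and chain calls with flatMap so that the next call is issued only once the previous response has arrived. Running a few such chains side by side gives a fixed upper bound on in-flight requests. The names `BoundedRequests`, `oneAtATime`, `bounded`, and `lanes` are hypothetical, and `call` stands in for something like `a => wsClient.url(...).get()`.

```scala
import scala.concurrent.{ExecutionContext, Future}

object BoundedRequests {
  // Chains calls with flatMap: each call starts only after the
  // previous call's future has completed.
  def oneAtATime[A, B](items: Seq[A])(call: A => Future[B])
                      (implicit ec: ExecutionContext): Future[Vector[B]] =
    items.foldLeft(Future.successful(Vector.empty[B])) { (acc, item) =>
      acc.flatMap(done => call(item).map(done :+ _))
    }

  // Runs `lanes` such chains concurrently (lanes must be >= 1),
  // so at most `lanes` calls are in flight at any time.
  // Results are returned in the original input order.
  def bounded[A, B](items: Seq[A], lanes: Int)(call: A => Future[B])
                   (implicit ec: ExecutionContext): Future[Vector[B]] = {
    val indexed = items.zipWithIndex
    // Item i goes to lane (i % lanes).
    val slices = (0 until lanes).map(l => indexed.filter(_._2 % lanes == l))
    Future.traverse(slices) { slice =>
      oneAtATime(slice) { case (a, i) => call(a).map(i -> _) }
    }.map(_.flatten.sortBy(_._1).map(_._2).toVector)
  }
}
```

Unlike fixed batches, a lane moves on to its next item as soon as its own previous call finishes, so one slow response does not stall the other lanes.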
