Retaining information about the requested URL when using curl::curl_fetch_multi

Time: 2018-02-20 14:22:42

Tags: r curl

I use the following code to perform multiple simultaneous requests.

urls <- c("https://httpbin.org/status/301", "https://httpbin.org/status/302", "https://httpbin.org/status/200")

result <- list()

p <- curl::new_pool(total_con = 10, host_con = 5, multiplex = T)

cb <- function(res) {
  result <<- append(result, list(res))
  cat("requested URL: ", url, "last URL: ", res$url, "\n\n")
}

for (url in urls) {
  curl::curl_fetch_multi(url, done = cb, handle = curl::new_handle(failonerror = F, nobody = F, followlocation = T, ssl_verifypeer = 0), pool = p)
}

curl::multi_run(pool = p)

As you can see, I want to print to the console both the requested URL and the URL that finally answered with 200 OK.

The following is printed to the console:

requested URL:  https://httpbin.org/status/200 last URL:  https://httpbin.org/status/200 

requested URL:  https://httpbin.org/status/200 last URL:  https://httpbin.org/get 

requested URL:  https://httpbin.org/status/200 last URL:  https://httpbin.org/get 

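The requested URL in the console output is always https://httpbin.org/status/200, because that is the last URL used in the for loop: the callbacks only run inside multi_run, after the loop has finished, so they all see the final value of url. So this is the wrong approach.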
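The same effect can be reproduced offline, without curl. This is only a minimal sketch: `pending` and the short strings "a", "b", "c" are made-up stand-ins for the pool and the URLs, and the callbacks are simply stored in the loop and called afterwards, which is roughly what happens when multi_run fires the done callbacks.

```r
# Stand-in for deferred callbacks: queue functions now, call them later.
pending <- list()

# `url` is looked up when cb() is called, not when it is queued.
cb <- function() url

for (url in c("a", "b", "c")) {
  pending[[length(pending) + 1]] <- cb
}

# By the time the callbacks run, the loop is over and `url` holds its last value.
seen <- vapply(pending, function(f) f(), character(1))
print(seen)
```

Every queued callback reports the last loop value, which matches the console output above.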

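How can I retain information about the initially requested URL when the callbacks registered with curl_fetch_multi only run later, inside multi_run? Ideally, the requested URL would be added to the res list so that it could be queried with something like cat("requested URL: ", res$requested_url, "last URL: ", res$url, "\n\n").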

1 Answer:

Answer 0: (score: 0)

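I had a similar problem: I wanted to make asynchronous POST requests with curl_fetch_multi and check which requests succeeded and which failed. However, because of the structure of a POST request (all fields are in the request body), the response object contains no identifying information. My solution was to generate custom callback functions, each carrying an identifier: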

urls <- c("https://httpbin.org/status/301", "https://httpbin.org/status/302", "https://httpbin.org/status/200")

result <- list()

# create an identifier for each url
url.ids = paste0("request_", seq_along(urls))

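# custom callback function generator: generate a unique function for each url
cb = function(id){
  force(id)  # evaluate id now so each closure keeps its own identifier
  function(res){
    result[[id]] <<- res  # store the response under its identifier
  }
}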
# create the list of callback functions
cbfuns = lapply(url.ids, cb)

p <- curl::new_pool(total_con = 10, host_con = 5, multiplex = TRUE)

for (i in seq_along(urls)) {
  h <- curl::new_handle(failonerror = FALSE, nobody = FALSE, followlocation = TRUE, ssl_verifypeer = 0)
  curl::curl_fetch_multi(urls[i], done = cbfuns[[i]], handle = h, pool = p)
}

curl::multi_run(pool = p)

In this example, the custom callback functions are used only to name the elements of result:

names(result)
## [1] "request_3" "request_1" "request_2"

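These names can then be used to tie each response back to its original request. Note that the elements appear in the order in which the requests completed, not the order in which they were queued.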
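The same closure idea can also attach the requested URL directly to each response, which is closer to what the question asked for. The following is only a sketch under assumptions: `make_cb`, `fetch_all` and the `requested_url` field are names invented here (they are not part of the curl package), and the `fake` response is a hand-built stand-in so the callback can be tried without network access.

```r
result <- list()

# Hypothetical helper: returns a callback that remembers `requested_url`.
make_cb <- function(requested_url) {
  force(requested_url)  # fix the value now, not when the callback fires
  function(res) {
    res$requested_url <- requested_url        # annotate the response
    result[[length(result) + 1]] <<- res      # append to the global list
  }
}

# Hypothetical wrapper around the real curl calls; needs network access.
fetch_all <- function(urls) {
  pool <- curl::new_pool(total_con = 10, host_con = 5)
  for (u in urls) {
    h <- curl::new_handle(followlocation = TRUE)
    curl::curl_fetch_multi(u, done = make_cb(u), handle = h, pool = pool)
  }
  curl::multi_run(pool = pool)
  invisible(result)
}

# Offline check with a hand-built response (assumed shape: url, status_code).
fake <- list(url = "https://httpbin.org/get", status_code = 200L)
make_cb("https://httpbin.org/status/301")(fake)
cat("requested URL:", result[[1]]$requested_url,
    "last URL:", result[[1]]$url, "\n")
```

With network access, calling fetch_all(urls) should fill result with one annotated response per URL, each carrying both res$requested_url and res$url.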