I know there are several posts about this, but based on the ones I've found, this should just work.
I want to make an HTTP request in a loop, and I don't want the loop to iterate until the request callback has fired. I'm using the async library:
const async = require("async");
const request = require("request");

let data = [
    "Larry",
    "Curly",
    "Moe"
];

async.forEachOf(data, (result, idx, callback) => {
    console.log("Loop iterated", idx);
    let fullUri = "https://jsonplaceholder.typicode.com/posts";
    request({
        url: fullUri
    },
    (err, res, body) => {
        console.log("Request callback fired...");
        if (err || res.statusCode !== 200) return callback(err);
        console.log(result);
        callback();
    });
});
What I'm seeing is:
Loop iterated 0
Loop iterated 1
Loop iterated 2
Request callback fired...
Curly
Request callback fired...
Larry
Request callback fired...
Moe
What I need to see is:
Loop iterated 0
Request callback fired...
Curly
Loop iterated 1
Request callback fired...
Larry
Loop iterated 2
Request callback fired...
Moe
Also, if there is a built-in way to do the same thing (async/await? Promises?) that would let me drop the async library, that would be even better.
I've seen some clever recursion examples, but when I apply them to more complicated situations (e.g. multiple request calls per loop iteration, etc.) I find that approach hard to follow and not very readable.
Answer 0 (score: 3)
You can drop async entirely and move to async/await quite easily. Just turn request into a Promise so you can await it, for example with a small wrapper like the one sketched below.
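For example, a minimal hand-rolled wrapper might look like this (a sketch; requestAsPromise is just an illustrative name, and it only treats a 200 status as success):

const request = require('request')

// Wrap the callback-style request in a Promise so it can be awaited.
const requestAsPromise = uri =>
    new Promise((resolve, reject) => {
        request({ url: uri }, (err, res, body) => {
            if (err) return reject(err)
            if (res.statusCode !== 200) return reject(new Error('Unexpected status ' + res.statusCode))
            resolve(body)
        })
    })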
Better yet, just use request-promise-native, which already wraps request with native Promises.
From there, it's a slam dunk with async/await:
const rp = require('request-promise-native')

const users = [1, 2, 3, 4]

// `await` is only valid inside an async function,
// so the loop needs to live in one.
async function fetchSerially () {
    const results = []
    for (const idUser of users) {
        const result = await rp('http://foo.com/users/' + idUser)
        results.push(result)
    }
    return results
}
Now, the problem with the above solution is that it's slow - the requests run serially. That's not ideal most of the time.
If you don't need the result of the previous request for the next one, just go with Promise.all to fire parallel requests.
const users = [1, 2, 3, 4]

async function fetchInParallel () {
    const pendingPromises = []
    for (const idUser of users) {
        // Here we won't `await` on *each and every* request.
        // We'll just prepare it and push it into an Array
        pendingPromises.push(rp('http://foo.com/users/' + idUser))
    }
    // Then we `await` on a `Promise.all` of those requests,
    // which will fire all the prepared promises *simultaneously*
    // and resolve when all have been completed
    const results = await Promise.all(pendingPromises)
    return results
}
Error handling in async/await is provided by plain try..catch blocks, which I've omitted above for brevity.
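Around the serial version it might look something like this (a sketch; fetchSeriallySafe is an illustrative name):

async function fetchSeriallySafe () {
    try {
        const results = []
        for (const idUser of users) {
            results.push(await rp('http://foo.com/users/' + idUser))
        }
        return results
    } catch (err) {
        // Any rejected request lands here
        console.error('Request failed:', err.message)
        return []
    }
}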
Answer 1 (score: 0)
If you are processing many (thousands of) URLs, it's best to define a batch size and recursively call the processing function to handle one batch at a time.
It's also best to limit the number of active connections; you can use a small helper (the lib linked in the code below) to cap the number of active connections, or to throttle them within a certain time window (e.g. only 5 per second).
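That helper isn't reproduced in this answer; as a rough idea of the technique, a minimal max-concurrency limiter might look like this (a sketch under my own assumptions, not the linked library's actual code):

// Returns a function that wraps an async fn so that at most
// `max` calls are in flight at any one time; extra calls queue up.
const throttle = max => fn => {
    let active = 0
    const queue = []
    const next = () => {
        if (active >= max || queue.length === 0) return
        active++
        const { args, resolve, reject } = queue.shift()
        Promise.resolve(fn(...args))
            .then(resolve, reject)
            .finally(() => { active--; next() })
    }
    return (...args) =>
        new Promise((resolve, reject) => {
            queue.push({ args, resolve, reject })
            next()
        })
}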
Last but not least: if you use Promise.all, you want to make sure that when one promise rejects, all the successes aren't lost. You can catch rejected requests and resolve with a Fail-type object instead, so the batch resolves with this failure type in place of the result.
The code would look something like this:
//lib comes from: https://github.com/amsterdamharu/lib/blob/master/src/index.js
const lib = require("lib");
const request = require("request");

const Fail = function(reason){this.reason=reason;};
const isFail = o=>(o&&o.constructor)===Fail;

const requestAsPromise = fullUri =>
  new Promise(
    (resolve,reject)=>
      request({
        url: fullUri
      },
      (err, res, body) => {
        console.log("Request callback fired...");
        if (err || res.statusCode !== 200)
          return reject(err || new Error("Status: "+res.statusCode));
        console.log("Success:",fullUri);
        resolve([res,body]);
      })
  );

const process =
  handleBatchResult =>
  batchSize =>
  maxFunction =>
  urls =>
    (urls.length===0)
    ? Promise.resolve([])//no urls left, stop recursing
    : Promise.all(
        urls.slice(0,batchSize)
        .map(
          url=>
            maxFunction(requestAsPromise)(url)
            .catch(err=>new Fail([err,url]))//catch reject and resolve with fail object
        )
      )
      .then(handleBatchResult)//use the passed-in handler
      .catch(panic=>console.error(panic))
      .then(//recursively call itself with next batch
        _=>
          process(handleBatchResult)(batchSize)(maxFunction)(urls.slice(batchSize))
      );

const handleBatch = results =>{//this will handle results of a batch
  //maybe write successes to file but certainly write failed
  // so you can retry later
  const successes = results.filter(result=>!isFail(result));
  //failed are the requests that failed
  const failed = results.filter(isFail);
  //To get the failed urls you can do
  const failedUrls = failed.map(fail=>fail.reason[1]);//a Fail's reason is [error,url]
};

const per_batch_1000_max_10_active =
  process (handleBatch) (1000) (lib.throttle(10));

//start the process (largeArrayOfUrls is an array of url strings defined elsewhere)
per_batch_1000_max_10_active(largeArrayOfUrls)
.then(
  result=>console.log("Process done")
  ,err=>console.error("This should not happen:",err)
);
In your handleBatchResult you can store the failed requests to a file so you can retry them later (a failure's reason is [error, url], so const [error, uri] = failedResultItem.reason;). If a large number of requests fail, you should probably give up.
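A minimal sketch of that idea, assuming failures should simply be appended to a retry file (storeFailures and failed.txt are illustrative names):

const fs = require("fs");

// Append the urls of failed requests to a file so a later run can retry them.
const storeFailures = failed => {
  const lines = failed
    .map(fail=>fail.reason[1])//a Fail's reason is [error,url]
    .join("\n");
  if (lines) fs.appendFileSync("failed.txt", lines + "\n");
};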
After handleBatchResult there is a .catch - that is your panic mode. It should never fail, so I'd suggest piping errors to a file (on linux, for example by redirecting stderr to a file).