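Node async loop - how do I make this code run in sequence?

Time: 2017-12-27 22:51:27

Tags: javascript node.js asynchronous

I know there are several posts about this, but according to the ones I found, this should work.

I want to make an HTTP request inside a loop, and I don't want the loop to move on to the next iteration until the request callback has fired. I'm using the async library: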

const async = require("async");
const request = require("request");

let data = [
    "Larry",
    "Curly",
    "Moe"
];

async.forEachOf(data, (result, idx, callback) => {
    console.log("Loop iterated", idx);
    let fullUri = "https://jsonplaceholder.typicode.com/posts";
    request({
        url: fullUri
    }, 
    (err, res, body) => {
        console.log("Request callback fired...");
        if (err || res.statusCode !== 200) return callback(err);
        console.log(result);
        callback();
    });
});

What I see is:

Loop iterated 0
Loop iterated 1
Loop iterated 2
Request callback fired...
Curly
Request callback fired...
Larry
Request callback fired...
Moe

What I need to see is:

Loop iterated 0
Request callback fired...
Curly
Loop iterated 1
Request callback fired...
Larry
Loop iterated 2
Request callback fired...
Moe

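Also, if there is a built-in way to do the same thing (async/await? Promise?) so that the async library can be removed, even better.

I've seen some clever recursion examples, but when I apply that approach to more complex cases (e.g. multiple request calls per loop iteration), I find it hard to follow and not very readable.

2 Answers:

Answer 0: (score: 3)

You can drop the async library entirely and easily move to async/await.

Promisify your request and use async/await

Just convert request into a Promise so that you can await it.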
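A minimal sketch of that conversion, assuming a callback-style function with the usual Node `(err, result)` signature. `fakeRequest` is a hypothetical stand-in for `request` so the snippet runs without network access; `util.promisify` is Node's built-in helper for the same job.

```javascript
const { promisify } = require('util')

// Hypothetical stand-in for `request`: calls back asynchronously with (err, result)
function fakeRequest (url, callback) {
  setTimeout(() => {
    if (!url) return callback(new Error('missing url'))
    callback(null, 'response for ' + url)
  }, 10)
}

// Option 1: wrap the callback API by hand
const requestAsPromise = url =>
  new Promise((resolve, reject) => {
    fakeRequest(url, (err, result) => (err ? reject(err) : resolve(result)))
  })

// Option 2: let Node build an equivalent wrapper
const requestPromisified = promisify(fakeRequest)

async function main () {
  const a = await requestAsPromise('/posts/1')
  const b = await requestPromisified('/posts/2')
  return [a, b]
}

main().then(console.log)
```

Either wrapper returns a promise, so the call can be awaited inside any async function.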

Better yet, use request-promise-native, which already wraps request with native Promises.

Serial example

From there, it's a slam dunk for async/await:

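const rp = require('request-promise-native')

const users = [1, 2, 3, 4]

// `await` is only valid inside an async function in a CommonJS script
async function fetchUsersInSeries () {
  const results = []

  for (const idUser of users) {
    const result = await rp('http://foo.com/users/' + idUser)

    results.push(result)
  }

  return results
}

fetchUsersInSeries().then(console.log, console.error)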

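Parallel example

Now, the problem with the solution above is that it's slow - the requests run serially. That is not ideal most of the time.

If you don't need the result of the previous request to make the next one, go ahead and use Promise.all to fire off the requests in parallel.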

const rp = require('request-promise-native')

const users = [1, 2, 3, 4]

async function fetchUsersInParallel () {
  const pendingPromises = []
  for (const idUser of users) {
    // Here we won't `await` on *each and every* request.
    // Calling `rp` starts the request right away; we just collect the promise
    pendingPromises.push(rp('http://foo.com/users/' + idUser))
  }

  // Then we `await` a `Promise.all` of those requests, which are already
  // running concurrently; it resolves when all have completed
  // (or rejects as soon as one of them fails)
  return Promise.all(pendingPromises)
}

Error handling

Error handling with async/await is done with plain try..catch blocks, which I've omitted for brevity.
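As a rough sketch of what that could look like in the serial loop: `fakeRp` below is a hypothetical stand-in for `rp` that fails for one user, so the snippet runs without network access. Whether to keep going after a failure or to stop is a design choice; this version records the error and continues.

```javascript
// Hypothetical stand-in for `rp`: rejects for user 3, resolves for everyone else
const fakeRp = idUser =>
  new Promise((resolve, reject) => {
    setTimeout(() => {
      if (idUser === 3) return reject(new Error('user ' + idUser + ' failed'))
      resolve('user ' + idUser)
    }, 10)
  })

async function fetchUsersInOrder (users) {
  const results = []
  const errors = []

  for (const idUser of users) {
    try {
      // The loop does not advance until this request settles
      results.push(await fakeRp(idUser))
    } catch (err) {
      // A rejected promise surfaces here as a thrown error; record it and keep going
      errors.push(err.message)
    }
  }

  return { results, errors }
}

fetchUsersInOrder([1, 2, 3, 4]).then(console.log)
```

To abort on the first failure instead, move the try..catch outside the for loop.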

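Answer 1: (score: 0)

If you have to process many (thousands of) URLs, it's best to define a batch size and recursively call the processing function so that it handles one batch at a time.

It's also best to limit the number of active connections; you can use this to throttle either the number of active connections or the number of connections made within a certain period (for example, only 5 per second).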
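To illustrate the idea of capping active connections, here is a minimal sketch of a concurrency limiter. `limitConcurrency` is a hypothetical helper written for this example, not the API of the lib used in this answer, and `fakeRequest` stands in for a real request. It only limits how many calls are in flight at once; it does not implement the per-second variant.

```javascript
// Hypothetical helper: wraps an async function so that at most `max` calls are in flight at once
const limitConcurrency = max => fn => {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= max || queue.length === 0) return;
    active++;
    const { args, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(() => fn(...args))
      .then(resolve, reject)
      .then(() => {
        active--;
        next(); // a slot was freed, start the next queued call
      });
  };

  return (...args) =>
    new Promise((resolve, reject) => {
      queue.push({ args, resolve, reject });
      next();
    });
};

// Demo with a fake request that tracks how many calls overlap
let inFlight = 0;
let peak = 0;
const fakeRequest = url =>
  new Promise(resolve => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    setTimeout(() => {
      inFlight--;
      resolve("done " + url);
    }, 10);
  });

const limited = limitConcurrency(2)(fakeRequest);

const demo = () =>
  Promise.all(["a", "b", "c", "d", "e"].map(url => limited(url)))
    .then(results => ({ results, peak }));

demo().then(console.log);
```

The curried shape (`max => fn => ...`) mirrors how `lib.throttle(10)` is passed around in the code further down, but the internals here are only an assumption about how such a helper might work.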

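Last but not least: if you use Promise.all, you want to make sure that the successful results are not all lost when a single promise rejects. You can catch each rejected request and return a Fail-type object instead, so that the promise resolves with that Fail object.

The code would look something like this: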

const async = require("async");
//lib comes from: https://github.com/amsterdamharu/lib/blob/master/src/index.js
const lib = require("lib");
const request = require("request");

const Fail = function(reason){this.reason=reason;};
const isFail = o=>(o&&o.constructor)===Fail;
const requestAsPromise = fullUri =>
  new Promise(
    (resolve,reject)=>
      request({
        url: fullUri
      }, 
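      (err, res, body) => {
        console.log("Request callback fired...");
        //return so we don't fall through and also resolve after rejecting
        if (err || res.statusCode !== 200)
          return reject(err || new Error("Unexpected status code: " + res.statusCode));
        console.log("Success:",fullUri);
        resolve([res,body]);
      })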
  )
const process = 
  handleBatchResult =>
  batchSize =>
  maxFunction =>
  urls =>
    urls.length === 0
      ? Promise.resolve()//no urls left: stop recursing
      : Promise.all(
          urls.slice(0,batchSize)
          .map(
            url=>
              maxFunction(requestAsPromise)(url)
              .catch(err=>new Fail([err,url]))//catch reject and resolve with fail object
          )
        )
        .then(handleBatchResult)
        .catch(panic=>console.error(panic))
        .then(//recursively call itself with next batch
          _=>
            process(handleBatchResult)(batchSize)(maxFunction)(urls.slice(batchSize))
        );

const handleBatch = results =>{//this will handle results of a batch
  //maybe write successes to file but certainly write failed
  //  you can retry later     
  const successes = results.filter(result=>!isFail(result));
  //failed are the requests that failed
  const failed = results.filter(isFail);
  //To get the failed urls you can do
  const failedUrls = failed.map(({reason:[error,url]})=>url);//a Fail keeps [err,url] in .reason
};

const per_batch_1000_max_10_active = 
  process (handleBatch) (1000) (lib.throttle(10));

//start the process
per_batch_1000_max_10_active(largeArrayOfUrls)
.then(
  result=>console.log("Process done")
  ,err=>console.error("This should not happen:",err)
);

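In your handleBatchResult you can store the failed requests in a file so you can retry them later (const [error,uri] = failedResultItem.reason;). If a large number of requests fail, you should give up.

There is a .catch after handleBatchResult; this is your panic mode. Nothing should ever fail there, so I'd suggest that you pipe errors to a file (linux).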
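On newer Node versions (12.9 and later), the built-in Promise.allSettled gives a similar "don't lose the successes" guarantee as the Fail-object approach above without a custom wrapper type. A minimal sketch, where `fakeRequest` and `runBatch` are hypothetical names used only for illustration:

```javascript
// Hypothetical stand-in for a request: rejects for any url containing "bad"
const fakeRequest = url =>
  new Promise((resolve, reject) => {
    setTimeout(() => {
      if (url.includes("bad")) return reject(new Error("failed: " + url));
      resolve("ok: " + url);
    }, 10);
  });

async function runBatch (urls) {
  // allSettled never rejects; each entry is
  // {status:"fulfilled", value} or {status:"rejected", reason}
  const settled = await Promise.allSettled(urls.map(url => fakeRequest(url)));

  const successes = [];
  const failedUrls = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") successes.push(outcome.value);
    else failedUrls.push(urls[i]); // results keep the input order, so index i maps back to the url
  });
  return { successes, failedUrls };
}

runBatch(["/a", "/bad", "/c"]).then(console.log);
```

The failed URLs collected this way can be written to a file and retried later, just as with the Fail objects.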