Question

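I know there are several posts about this, but according to the ones I found, this should work.

I want to make an HTTP request inside a loop, and I don't want the loop to iterate until the request callback has fired. I'm using the async library: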

const async = require("async");
const request = require("request");

let data = [
    "Larry",
    "Curly",
    "Moe"
];

async.forEachOf(data, (result, idx, callback) => {
    console.log("Loop iterated", idx);
    let fullUri = "https://jsonplaceholder.typicode.com/posts";
    request({
        url: fullUri
    }, 
    (err, res, body) => {
        console.log("Request callback fired...");
        if (err || res.statusCode !== 200) return callback(err);
        console.log(result);
        callback();
    });
});

What I see is:

Loop iterated 0
Loop iterated 1
Loop iterated 2
Request callback fired...
Curly
Request callback fired...
Larry
Request callback fired...
Moe

What I need to see is:

Loop iterated 0
Request callback fired...
Curly
Loop iterated 1
Request callback fired...
Larry
Loop iterated 2
Request callback fired...
Moe

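Also, if there is a built-in way to do the same thing (async/await? Promise?) so that the async library can be removed, that would be even better.

I've seen some clever recursion examples, but when I apply them to more complex situations (e.g. multiple request calls per loop iteration), the approach becomes hard to follow and is not very readable.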

Answer 1

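You can drop the async library entirely and move to async/await with little effort.

Promisify your request and use `async/await`

Just convert request into a Promise so that you can await it.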
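A minimal sketch of that conversion. To keep it runnable without the network, `fetchWithCallback` is a hypothetical stand-in for a callback-style client such as request, and `fetchAsPromise` is an illustrative name; the same wrapper shape should apply to any Node-style `(err, result)` callback.

```javascript
// Hypothetical stand-in for a callback-style HTTP client such as `request`.
// It "responds" asynchronously with a fake body, or with an error for an empty URL.
function fetchWithCallback(url, callback) {
  setTimeout(() => {
    if (!url) return callback(new Error("missing url"));
    callback(null, "body of " + url);
  }, 10);
}

// Wrap the callback API in a Promise so that callers can `await` it.
function fetchAsPromise(url) {
  return new Promise((resolve, reject) => {
    fetchWithCallback(url, (err, body) => {
      if (err) return reject(err); // return so resolve is not reached
      resolve(body);
    });
  });
}
```

For functions that follow the `(err, value)` callback convention, Node's built-in `util.promisify` produces an equivalent wrapper.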

Better yet, use request-promise-native, which already wraps request with native Promises.

Serial example

From then on, it's an async/await slam dunk:

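const rp = require('request-promise-native')

// `await` is only valid inside an async function, so wrap the loop in one
const fetchUsersInSeries = async () => {
  const users = [1, 2, 3, 4]
  const results = []

  for (const idUser of users) {
    // The loop does not advance until this request has resolved
    const result = await rp('http://foo.com/users/' + idUser)

    results.push(result)
  }

  return results
}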
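To check that a `for...of` loop with `await` really waits for each request before iterating, here is a sketch that reuses the data and log messages from the question. `fakeRequest` is a hypothetical stub standing in for the HTTP call, and `runSequentially` is an illustrative name.

```javascript
// Hypothetical stub: resolves after a short random delay instead of doing HTTP.
const fakeRequest = name =>
  new Promise(resolve => setTimeout(() => resolve(name), Math.random() * 20));

async function runSequentially(data) {
  const log = [];
  for (const [idx, name] of data.entries()) {
    log.push("Loop iterated " + idx);
    const result = await fakeRequest(name); // the loop pauses here
    log.push("Request callback fired...");
    log.push(result);
  }
  return log;
}

runSequentially(["Larry", "Curly", "Moe"]).then(log => console.log(log.join("\n")));
```

Each "Loop iterated" line is recorded only after the previous request has resolved, no matter how long the individual requests take.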

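Parallel example

Now, the problem with the solution above is that it's slow: the requests run serially. That is not ideal most of the time.

If you don't need the result of the previous request in order to make the next one, go ahead and use Promise.all to fire the requests in parallel.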

// `rp` is request-promise-native, required as in the serial example
const fetchUsersInParallel = async () => {
  const users = [1, 2, 3, 4]

  const pendingPromises = []
  for (const idUser of users) {
    // Here we won't `await` on *each and every* request.
    // Calling `rp` starts the request right away; we just keep the
    // pending promise in an Array
    pendingPromises.push(rp('http://foo.com/users/' + idUser))
  }

  // Then we `await` a `Promise.all` of those requests, which resolves
  // when all of them have completed (or rejects as soon as one fails)
  const results = await Promise.all(pendingPromises)
  return results
}
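One detail that matters when switching from the serial loop: `Promise.all` resolves with the results in the order of the input array, not in the order in which the requests finish. A small sketch, where `delayedValue` and `fetchAllInParallel` are hypothetical helpers that simulate requests of different durations:

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delayedValue = (value, ms) =>
  new Promise(resolve => setTimeout(() => resolve(value), ms));

async function fetchAllInParallel() {
  // All three timers start right away; the first one takes the longest.
  const pending = [
    delayedValue("user 1", 30),
    delayedValue("user 2", 10),
    delayedValue("user 3", 20)
  ];
  // Results come back in input order, even though "user 2" finished first.
  return Promise.all(pending);
}
```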

Error handling

Error handling in async/await is done with plain try..catch blocks, which I've omitted for brevity.
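As a rough illustration of what those omitted blocks could look like in the serial loop, here is a sketch in which `fetchUser` is a hypothetical stub (it rejects for id 3) and `fetchUsersSafely` is an illustrative name. Whether to continue after a failure or stop the loop is a design choice; this version records the error and carries on.

```javascript
// Hypothetical stub: rejects for id 3, resolves for every other id.
const fetchUser = id =>
  id === 3
    ? Promise.reject(new Error("user " + id + " not found"))
    : Promise.resolve({ id });

async function fetchUsersSafely(ids) {
  const results = [];
  const errors = [];
  for (const id of ids) {
    try {
      results.push(await fetchUser(id));
    } catch (err) {
      // A rejected promise surfaces here as a thrown error;
      // record it and carry on with the next id.
      errors.push(err.message);
    }
  }
  return { results, errors };
}
```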

Answer 2

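If you have to process many (thousands of) URLs, it's best to define a batch size and recursively call the processing function to handle one batch at a time.

It's also best to limit the number of active connections; a throttling helper such as the lib used in the code below can limit either the number of active connections or the number of connections within a certain time period (for example, only 5 per second).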
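To make the idea of capping active connections concrete, here is a minimal sketch of a concurrency limit written from scratch. It is not the implementation of the lib this answer relies on, `mapWithLimit` is a hypothetical name, and it only limits how many calls are in flight at once, not how many start per second.

```javascript
// Runs `worker(item)` for every item, with at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  // Each "lane" pulls the next unprocessed index until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

If any `worker` call rejects, the returned promise rejects as well, so a caller that wants to keep partial results would need to catch errors inside `worker`.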

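Last but not least: if you use Promise.all, you want to make sure that not all successes are lost when one promise rejects. You can catch rejected requests and return a Fail-type object, so that the promise resolves with that Fail object instead of rejecting.

The code looks something like this: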

//lib comes from: https://github.com/amsterdamharu/lib/blob/master/src/index.js
const lib = require("lib");
const request = require("request");

const Fail = function(reason){this.reason=reason;};
const isFail = o=>(o&&o.constructor)===Fail;
const requestAsPromise = fullUri =>
  new Promise(
    (resolve,reject)=>
      request({
        url: fullUri
      }, 
      (err, res, body) => {
        console.log("Request callback fired...");
        //return after rejecting so the success path below is skipped
        if (err || res.statusCode !== 200)
          return reject(err || new Error("Unexpected status code: "+res.statusCode));
        console.log("Success:",fullUri);
        resolve([res,body]);
      })
  )
//named processUrls so it does not shadow Node's global process object
const processUrls = 
  handleBatchResult =>
  batchSize =>
  maxFunction =>
  urls =>
    urls.length === 0
      ? Promise.resolve()//nothing left to do, stop the recursion
      : Promise.all(
          urls.slice(0,batchSize)
          .map(
            url=>
              maxFunction(requestAsPromise)(url)
              .catch(err=>new Fail([err,url]))//catch reject and resolve with fail object
          )
        )
        .then(handleBatchResult)
        .catch(panic=>console.error(panic))
        .then(//recursively call itself with next batch
          _=>
            processUrls(handleBatchResult)(batchSize)(maxFunction)(urls.slice(batchSize))
        );

const handleBatch = results =>{//this will handle results of a batch
  //maybe write successes to file but certainly write failed
  //  you can retry later     
  const successes = results.filter(result=>!isFail(result));
  //failed are the requests that failed
  const failed = results.filter(isFail);
  //To get the failed urls you can do (reason is the [error,url] pair)
  const failedUrls = failed.map(({reason:[error,url]})=>url);
};

const per_batch_1000_max_10_active = 
  processUrls (handleBatch) (1000) (lib.throttle(10));

//start the process; largeArrayOfUrls is assumed to be defined elsewhere
per_batch_1000_max_10_active(largeArrayOfUrls)
.then(
  result=>console.log("Process done")
  ,err=>console.error("This should not happen:",err)
);

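In your handleBatchResult you can store the failed requests in a file so you can retry them later (const [error,uri] = failedResultItem.reason;). If a large number of requests fail, you should give up.

There is a .catch after handleBatchResult; that is your panic mode. It should never end up failing there, so I would advise piping errors to a file (Linux).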
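As a built-in alternative to the Fail wrapper (not what this answer uses), `Promise.allSettled` reports every outcome without rejecting. A sketch under that assumption, where `fetchUrl` is a hypothetical stub that fails for URLs containing "bad" and `fetchBatch` is an illustrative name:

```javascript
// Hypothetical stub: rejects for URLs containing "bad", resolves otherwise.
const fetchUrl = url =>
  url.includes("bad")
    ? Promise.reject(new Error("failed: " + url))
    : Promise.resolve("ok: " + url);

async function fetchBatch(urls) {
  // allSettled never rejects; each entry is either
  // { status: "fulfilled", value } or { status: "rejected", reason }.
  const settled = await Promise.allSettled(urls.map(fetchUrl));
  const successes = [];
  const failedUrls = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") successes.push(outcome.value);
    else failedUrls.push(urls[i]); // outcomes keep the order of `urls`
  });
  return { successes, failedUrls };
}
```

The failed URLs collected this way can be written to a file and retried later, in the same spirit as the Fail objects above.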

Node async loop - how do I make this code run sequentially?
