Asynchronous parallel HTTP requests

Date: 2015-08-01 12:04:59

Tags: node.js asynchronous

I'm having a control flow problem in an application that loads a large number of URLs. I'm using Caolan's Async and the NPM request module.

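My problem is that the HTTP requests seem to start as soon as a function is added to the queue. Ideally, I want to build my queue and only start making HTTP requests when the queue is started. Otherwise, callbacks begin firing before the queue starts, which causes the queue to finish prematurely.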

var request = require('request') // https://www.npmjs.com/package/request
    , async = require('async'); // https://www.npmjs.com/package/async

var myLoaderQueue = []; // passed to async.parallel
var myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

for(var i = 0; i < myUrls.length; i++){
    myLoaderQueue.push(function(callback){

        // Async http request
        request(myUrls[i], function(error, response, html) {

            // Some processing is happening here before the callback is invoked
            callback(error, html);
        });
    });
}

// The loader queue has been made, now start to process the queue
async.parallel(myLoaderQueue, function(err, results){
    // Done
});

Is there a better way of attacking this?

3 answers:

Answer 0: (score: 24)

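Using a for loop combined with asynchronous calls is problematic (in ES5) and can yield unexpected results (in your case, the wrong URL being retrieved).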
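The effect can be reproduced without any HTTP at all. The following is a minimal sketch, assuming a made-up `urls` array in place of the real list: every queued function shares the same `var i`, and by the time any of them runs the loop has already finished.

```javascript
// Hypothetical stand-in data; not the URLs from the question.
var urls = ['a', 'b', 'c'];

// Build one task per URL with a shared `var i`, as in the question.
var tasks = [];
for (var i = 0; i < urls.length; i++) {
  tasks.push(function () {
    // `i` is read when the task runs, not when the task was created.
    return urls[i];
  });
}

// The loop is over, so i === urls.length and urls[i] is undefined for every task.
var seen = tasks.map(function (task) { return task(); });
console.log(seen); // [ undefined, undefined, undefined ]
```

Note that nothing ran while the tasks were being pushed; the lookups only happened once the tasks were invoked.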

Instead, consider using async.map():

async.map(myUrls, function(url, callback) {
  request(url, function(error, response, html) {
    // Some processing is happening here before the callback is invoked
    callback(error, html);
  });
}, function(err, results) {
  ...
});

Given that you're retrieving 1000+ URLs, async.mapLimit() may also be worth considering.
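With the library, the call would look like async.mapLimit(myUrls, 10, iteratee, callback), where 10 is an arbitrary limit picked for illustration. To show what limiting concurrency means, here is a hand-rolled sketch of the same idea using only plain Node.js. It is not the async library's implementation, and `mapWithLimit` and `fakeFetch` are hypothetical names standing in for the real call and for `request`.

```javascript
// Hand-rolled sketch of a concurrency-limited map; NOT the async library's code.
function mapWithLimit(items, limit, iteratee, done) {
  var results = new Array(items.length);
  var next = 0;         // index of the next item to start
  var running = 0;      // iteratee calls currently in flight
  var finished = false; // guards against calling `done` twice

  if (items.length === 0) { return done(null, results); }

  function launch() {
    // Start new work until the limit is reached or no items remain.
    while (!finished && running < limit && next < items.length) {
      start(next++);
    }
  }

  function start(index) {
    running++;
    iteratee(items[index], function (err, value) {
      running--;
      if (finished) { return; }
      if (err) { finished = true; return done(err); }
      results[index] = value; // keep results in input order
      if (next === items.length && running === 0) {
        finished = true;
        return done(null, results);
      }
      launch();
    });
  }

  launch();
}

// Hypothetical stand-in for request(): answers after a short delay.
var active = 0;
var peak = 0;
function fakeFetch(url, callback) {
  active++;
  peak = Math.max(peak, active);
  setTimeout(function () {
    active--;
    callback(null, 'body of ' + url);
  }, 10);
}

mapWithLimit(['u1', 'u2', 'u3', 'u4', 'u5'], 2, fakeFetch, function (err, results) {
  console.log(err, results, 'peak concurrency:', peak);
});
```

The point of the limit is that at most two fake requests are in flight at any moment, while the results still come back in the order of the input array.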

Answer 1: (score: 7)

If you're willing to start using Bluebird and Babel to take advantage of promises and ES7 async/await, you can do the following:

let Promise = require('bluebird');
let request = Promise.promisify(require('request'));

let myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

async function load() {
  try {
    // map myUrls array into array of request promises
    // wait until all request promises in the array resolve
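    // (wrap request so that map's extra index/array arguments are not forwarded to it)
    let results = await Promise.all(myUrls.map(url => request(url)));
    // don't know if Babel await supports syntax below
    // let results = await* myUrls.map(url => request(url));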
    // print array of results or use forEach 
    // to process / collect them in any other way
    console.log(results)
  } catch (e) {
    console.log(e);
  }
}

// Start the downloads
load();

Answer 2: (score: 0)

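I'm fairly confident you are seeing the results of a different error. By the time your queued functions are evaluated, i has already changed, which could make it look as though you missed the first URLs. Try a little closure when queuing up the functions.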

var request = require('request') // https://www.npmjs.com/package/request
    , async = require('async'); // https://www.npmjs.com/package/async

var myLoaderQueue = []; // passed to async.parallel
var myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

for(var i = 0; i < myUrls.length; i++){
    (function(URLIndex){
       myLoaderQueue.push(function(callback){

           // Async http request
           request(myUrls[URLIndex], function(error, response, html) {

               // Some processing is happening here before the callback is invoked
               callback(error, html);
           });
       });
    })(i);
}

// The loader queue has been made, now start to process the queue
async.parallel(myLoaderQueue, function(err, results){
    // Done
});
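As a quick check of the diagnosis above, here is a minimal sketch with made-up names (`taskQueue`, `started`) and no real HTTP. It suggests that pushing a function onto an array does not run it, and that giving each task its own scope (here via forEach rather than an IIFE) lets every task see its own value.

```javascript
var started = [];   // records which tasks have actually begun
var taskQueue = []; // hypothetical stand-in for myLoaderQueue

['a', 'b', 'c'].forEach(function (name) {
  // Each forEach callback has its own `name`, so nothing is shared between tasks.
  taskQueue.push(function (callback) {
    started.push(name);
    callback(null, name.toUpperCase());
  });
});

// Nothing has run yet: queuing a function does not invoke it.
var startedBeforeRun = started.length;

// Only now are the tasks invoked (async.parallel would do this step for you).
var results = [];
taskQueue.forEach(function (task) {
  task(function (err, value) { results.push(value); });
});

console.log(startedBeforeRun, results);
```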