How do I combine the separate JSON objects output by a for loop into an array?

Time: 2019-05-20 22:34:12

Tags: javascript json loops casperjs

I am scraping a website with CasperJS, and one of the tasks involves crawling URLs built from a for loop counter. The URLs look like this:

www.example.com/page/no=

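where no is any number from 0 to 10, set by the for loop counter. The scraper then goes through every page, scrapes the data into a JSON object, and repeats until no = 10.

The data I want is stored in discrete groups on each page. What I would like to end up with is a single JSON object that joins the scraped output from all of the pages.

Imagine page 1 has expense 1 and the object I get is {Expense1}, and page 2 has expense 2 and the object I get is {Expense2}. At the end of the scrape I would like a single JSON like this: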

    scrapedData = {
       "expense1": expense1,
       "expense2": expense2,
     }

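What I am having trouble with is joining all of the JSON objects into one array.

I initialized an empty array and then pushed each object onto it. I tried checking whether the iterator i in the for loop equals 10 and printing the JSON objects if so, but that did not seem to work. I looked it up and it seems I could use object spread, but I am not sure how to use it in this case.

Any pointers would be helpful. Should I be using an array function such as map?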

casper.then(function(){    

   var url = "https://example.net/secure/SaFinShow?url=";
    //We create a for loop to go open the urls

    for (i=0; i<11; i++){

      this.thenOpen(url+ i, function(response){

          expense_amount = this.fetchText("td[headers='amount']");

          Date = this.fetchText("td[headers='Date']");

          Location = this.fetchText("td[headers='zipcode']");

          id = this.fetchText("td[headers='id']");


          singleExpense = {

              "Expense_Amount": expense_amount,
              "Date": Date,
              "Location": Location,
              "id": id
            };

          if (i ===10){
            expenseArray.push(JSON.stringify(singleExpense, null, 2))
            this.echo(expenseArray);
          }
      });

    };
});

1 Answer:

Answer 0 (score: 0)

Taking your example and expanding on it, you should be able to do something like this:

// Initialize empty object to hold all of the expenses
var scrapedData = {};

casper.then(function(){    

   var url = "https://example.net/secure/SaFinShow?url=";
    //We create a for loop to go open the urls

    for (i=0; i<11; i++){

      this.thenOpen(url+ i, function(response){

          expense_amount = this.fetchText("td[headers='amount']");

          Date = this.fetchText("td[headers='Date']");

          Location = this.fetchText("td[headers='zipcode']");

          id = this.fetchText("td[headers='id']");


          singleExpense = {

              "Expense_Amount": expense_amount,
              "Date": Date,
              "Location": Location,
              "id": id
            };
          // As we loop over each of the expenses add them to the object containing all of them
          scrapedData['expense'+i] = singleExpense;
      });

    };
});

After this code runs, scrapedData should have the form:

scrapedData = {
  "expense1": expense1,
  "expense2": expense2
}
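If an actual array is wanted rather than a keyed object, one option is to push each object (not its stringified form) into an array and serialize once at the end. The following is a minimal sketch that runs outside CasperJS; the `pages` sample data is hypothetical and only stands in for the per-page scrape results.

```javascript
// Hypothetical sample data standing in for what each page would yield.
var pages = [
  { Expense_Amount: "12.50", Date: "2019-05-01", Location: "10001", id: "a1" },
  { Expense_Amount: "7.00", Date: "2019-05-02", Location: "94105", id: "b2" }
];

var expenseArray = []; // array form
var scrapedData = {};  // keyed-object form, as in the answer above

pages.forEach(function (singleExpense, index) {
  // Push the object itself; stringifying here would give an array of strings.
  expenseArray.push(singleExpense);
  scrapedData["expense" + index] = singleExpense;
});

// A keyed object can be turned back into an array when needed.
var fromObject = Object.keys(scrapedData).map(function (key) {
  return scrapedData[key];
});

// Serialize once, after all pages have been collected.
var json = JSON.stringify(expenseArray, null, 2);
console.log(json);
```

Either shape serializes to valid JSON; the array is usually easier to iterate over, while the keyed object matches the format asked for in the question.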

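Updated code

One problem with the code above is that the variables inside the for loop should be local as you loop over the expenses. The variables also should not be named Date or Location, because those are built-in names in JavaScript, and assigning to them without var overwrites the globals.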

// Initialize empty object to hold all of the expenses
var scrapedData = {};

casper.then(function(){

    var url = "https://example.net/secure/SaFinShow?url=";

    // Loop over the page numbers and queue one thenOpen step per URL
    for (var i = 0; i < 11; i++){

      // Pass i into a function so that each queued callback keeps its own
      // page number; the callbacks run only after the loop has finished
      (function(pageNo){

        casper.thenOpen(url + pageNo, function(response){
          // Create our local variables to store data for this particular
          // expense data
          var expense_amount = this.fetchText("td[headers='amount']");

          // Don't use `Date` it is a JS built-in name
          var date = this.fetchText("td[headers='Date']");
          // Don't use `Location` it is a JS built-in name
          var location = this.fetchText("td[headers='zipcode']");

          var id = this.fetchText("td[headers='id']");

          var singleExpense = {
              "Expense_Amount": expense_amount,
              "Date": date,
              "Location": location,
              "id": id
          };

          // As we loop over each of the expenses add them to the object containing all of them
          scrapedData['expense' + pageNo] = singleExpense;
        });

      })(i);
    }
});
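
One more thing to watch for in loops like this: the callbacks passed to thenOpen are queued and run only after the for loop has finished, so a loop counter read inside a callback holds its final value unless it is captured per iteration. The sketch below simulates that behaviour without CasperJS; the `schedule` helper and `queue` array are hypothetical stand-ins for the step queue, not part of the CasperJS API.

```javascript
// Hypothetical stand-in for a step queue: callbacks are stored now, run later.
var queue = [];
function schedule(callback) {
  queue.push(callback);
}

// Without capturing: every callback reads the same variable i.
var shared = {};
for (var i = 0; i < 3; i++) {
  schedule(function () {
    shared["expense" + i] = i; // i is read when the callback runs, after the loop
  });
}

// With capturing: an immediately invoked function gives each callback its own copy.
var captured = {};
for (var j = 0; j < 3; j++) {
  (function (pageNo) {
    schedule(function () {
      captured["expense" + pageNo] = pageNo;
    });
  })(j);
}

// Run the queued callbacks only after both loops have completed.
queue.forEach(function (callback) {
  callback();
});

console.log(Object.keys(shared));   // a single key, "expense3"
console.log(Object.keys(captured)); // "expense0", "expense1", "expense2"
```

In the uncaptured version all three callbacks write to the same key, so earlier results are overwritten; the captured version keeps one entry per page.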