Question

I have the script shown below. It scans a web page for certain text and notifies me if the text is found on the page.

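The only problem I currently have is that viewing the content behind the link requires authentication. I have an account on the site, but I'm not sure how to use it from Node.js.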

Here is the link to the site: http://www.roblox.com/Trade/inventoryhandler.ashx?filter=0&userid=261&page=1&itemsPerPage=14 - it comes back empty, but after logging in to Roblox.com it shows this: http://prntscr.com/98u83j

This is the current script:

// Import the scraping libraries
var request = require("request");
var cheerio = require("cheerio");

// Array for the user IDs which match the query
var matches = [];

// Do this for all possible users
function makeRequest(i){
    var location = "http://www.roblox.com/Trade/inventoryhandler.ashx?filter=0&userid=" + i + "&page=1&itemsPerPage=14";

    request(location, function (error, response, body) {

        console.log('request made for id '+ i);
        if (!error) {

            // Load the website content
            var $ = cheerio.load(body);
            var bodyText = $("body").text();

            // Search the website content for bluesteel
            if (bodyText.indexOf("bluesteel") > -1) {

                console.log("Found bluesteel in inventory of user ", i);
                // Save the user ID, if bluesteel was found
                matches.push(i);
            }

        // Something goes wrong
        } else {

            console.log(error.message);
        }

        if(i==33){
            console.log("All users with bluesteel in inventory: ", matches);

            return;
        }

        makeRequest(i+1); 
    });
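}

// Start the scan (the starting ID of 1 is an assumption; the original never calls makeRequest)
makeRequest(1);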

I'd appreciate any help with using my Roblox.com credentials (username and password) in the Node.js script.

Answer 1

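Basically, you have to replicate their login request, which may or may not be possible (I've never heard of this site). A good starting point is to watch the network traffic in Chrome's developer tools while you log in to the site. Then try to reproduce that request from something like Advanced REST Client (a Chrome extension). If that works, make the same request from your code.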
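As a rough illustration of that last step, here is a minimal sketch that posts a login form and reuses the returned session cookies on a later request. It uses only Node's built-in `http`/`https` modules rather than `request`. The login URL, the form field names, and the helper names (`toCookieHeader`, `login`, `fetchWithCookies`) are all hypothetical; the real values have to come from the traffic you observe in the developer tools, and the site may also require extra tokens that this sketch does not handle.

```javascript
// Sketch only: helper names, the login URL and the form fields are assumptions.
var http = require("http");
var https = require("https");
var querystring = require("querystring");

// Turn the Set-Cookie response headers into a single Cookie request header,
// keeping only the name=value part of each cookie.
function toCookieHeader(setCookieHeaders) {
    return (setCookieHeaders || [])
        .map(function (line) { return line.split(";")[0].trim(); })
        .join("; ");
}

// Pick the core module that matches the URL's protocol.
function clientFor(target) {
    return target.protocol === "https:" ? https : http;
}

// POST the login form and hand the session cookies to the callback.
function login(loginUrl, credentials, callback) {
    var target = new URL(loginUrl);
    var body = querystring.stringify(credentials);
    var req = clientFor(target).request(target, {
        method: "POST",
        headers: {
            "Content-Type": "application/x-www-form-urlencoded",
            "Content-Length": Buffer.byteLength(body)
        }
    }, function (res) {
        res.resume(); // discard the body; only the cookies matter here
        res.on("end", function () {
            callback(null, toCookieHeader(res.headers["set-cookie"]));
        });
    });
    req.on("error", callback);
    req.end(body);
}

// GET a page with the session cookies attached and return its body as text.
function fetchWithCookies(pageUrl, cookieHeader, callback) {
    var target = new URL(pageUrl);
    clientFor(target).get(target, { headers: { Cookie: cookieHeader } }, function (res) {
        var chunks = [];
        res.on("data", function (chunk) { chunks.push(chunk); });
        res.on("end", function () {
            callback(null, Buffer.concat(chunks).toString("utf8"));
        });
    }).on("error", callback);
}

// Hypothetical usage (replace the URL and field names with the observed ones):
// login("https://example.com/login", { username: "me", password: "secret" },
//     function (error, cookies) {
//         if (error) { return console.log(error.message); }
//         fetchWithCookies("https://example.com/protected", cookies,
//             function (err, body) { console.log(err ? err.message : body); });
//     });
```

In the question's script, the body passed to `cheerio.load` would then come from something like `fetchWithCookies` instead of a bare `request` call. If you would rather stay with the `request` library, creating it with `request.defaults({jar: true})` makes it remember cookies between calls, which serves the same purpose.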

Node.js - Scraping a web page that requires authentication
