How do I configure JSDOM to fully render a React page (e.g. Reddit)?

Time: 2019-05-28 02:57:05

Tags: javascript node.js web-scraping jsdom

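I'm trying to render a Reddit page with JSDOM, but it doesn't execute the JS that populates the rest of the HTML, so I only get a small part of the page. My goal is to be able to access every comment in a thread.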

I set runScripts: "dangerously" and url: "https://www.reddit.com" in the config object, to no avail.

const jsdom = require("jsdom");
const axios = require('axios');
const { JSDOM } = jsdom;
const config = {
    headers: {
        "Host": "www.reddit.com",
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate, br",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": 1,
        "Cache-Control": "max-age=0"
    }
}

axios('https://www.reddit.com/r/learnprogramming/comments/bqnhyu/another_self_taught_success_story_i_just_landed/', config)
    .then((res) => {
        const dom = new JSDOM(res.data, { runScripts: "dangerously", pretendToBeVisual: true, url: "https://www.reddit.com"});
        setTimeout(() => {
            const p_tags = dom.window.document.getElementsByClassName('himKiy');
            for (let tag of p_tags) {
                console.log(tag.innerHTML);
            }
        }, 3000);
    });
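
Two things may be worth checking in the attempt above. First, as far as the jsdom README describes it, runScripts: "dangerously" only allows scripts to execute; external script files are fetched only when resources: "usable" is also passed, and the options above do not set it. Second, the fixed 3-second setTimeout is a guess about when rendering finishes. The sketch below replaces that guess with polling. waitForElements is a hypothetical helper, not part of jsdom, and its timeout and interval defaults are arbitrary assumptions. Even with both changes, Reddit's scripts may still fail, since jsdom does not implement every browser API.

```javascript
// Hypothetical helper (not a jsdom API): calls `query` repeatedly until it
// returns a non-empty collection, or rejects once `timeoutMs` has elapsed.
function waitForElements(query, { timeoutMs = 10000, intervalMs = 100 } = {}) {
    return new Promise((resolve, reject) => {
        const deadline = Date.now() + timeoutMs;
        const poll = () => {
            let found;
            try {
                found = query();
            } catch (err) {
                reject(err);
                return;
            }
            if (found && found.length > 0) {
                // Copy the (possibly live) collection into a plain array.
                resolve(Array.from(found));
            } else if (Date.now() >= deadline) {
                reject(new Error(`No elements found within ${timeoutMs} ms`));
            } else {
                setTimeout(poll, intervalMs);
            }
        };
        poll();
    });
}

// Possible usage with the `dom` object from the snippet above, assuming the
// JSDOM was also created with `resources: "usable"`:
//
// waitForElements(() => dom.window.document.getElementsByClassName("himKiy"))
//     .then((tags) => tags.forEach((tag) => console.log(tag.innerHTML)))
//     .catch((err) => console.error(err.message));
```

Polling returns as soon as the elements appear and gives an explicit error when they never do, which makes it easier to tell "scripts never ran" apart from "scripts ran too slowly".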

0 answers:

No answers