Chromeless - Get the src of every image on a web page

Time: 2018-06-11 09:02:20

Tags: image html-parsing chromeless

I am trying to use Chromeless to get the src value of every img tag in an HTML page. My current implementation looks like this:

async function run() {
    const chromeless = new Chromeless();
    let url = 'http://someurl/somepath.html';

    var allImgUrls = await chromeless
        .goto(url)
        .evaluate(() => document.getElementsByTagName('img'));

    var htmlContent = await chromeless
        .goto(url)
        .evaluate(() => document.documentElement.outerHTML );

    console.log(allImgUrls);

    await chromeless.end();
}

The problem is that I do not get any values for the img objects in allImgUrls.

1 Answer:

Answer 0: (score: 0)

After some research, I found that the following approach works:

var imgSrcs = await chromeless
        .goto(url)
        .evaluate(() => {
            // document.querySelectorAll returns a NodeList (array-like), not an Array,
            // so borrow Array.prototype.map by calling [].map.call() on it
            const srcs = [].map.call(document.querySelectorAll('img'), img => img.src);
            return JSON.stringify(srcs);
        });
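
The likely reason the original attempt came back empty is that whatever evaluate returns has to be serialized on its way out of the browser, and DOM nodes are not plain data, whereas an array of strings is. The sketch below runs the same extraction logic in plain Node against a stand-in for a NodeList, so it can be tried without Chrome. The names collectSrcs and fakeNodeList, and the two image URLs, are hypothetical and used only for illustration; Array.from is swapped in here as an alternative to [].map.call.

```javascript
// Sketch only: collectSrcs and fakeNodeList are hypothetical names, not part of Chromeless.

// Collect the `src` of every element in an array-like collection.
function collectSrcs(nodeListLike) {
    // Array.from accepts array-likes (such as a NodeList) plus a mapping function.
    return Array.from(nodeListLike, img => img.src);
}

// Stand-in for document.querySelectorAll('img'): indexed, has a length, but no map().
const fakeNodeList = {
    0: { src: 'http://someurl/a.png' },
    1: { src: 'http://someurl/b.png' },
    length: 2
};

const srcs = collectSrcs(fakeNodeList);

// A plain array of strings survives a JSON round trip unchanged.
const roundTripped = JSON.parse(JSON.stringify(srcs));
console.log(roundTripped);
```

With Chromeless, the mapping has to live inside the function passed to evaluate, because that function runs in the page rather than in Node. If the result is returned as a JSON string, as in the answer above, JSON.parse(imgSrcs) turns it back into an array on the Node side.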