Scrapy-Splash: JavaScript page is not rendered

Time: 2019-11-09 23:15:01

Tags: python scrapy-splash

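I'm teaching myself Python and have run into an interesting problem that I can't solve on my own, so I'm asking the experts.

I'm trying to use Scrapy-Splash to render this page, including its JavaScript: https://apps.gsccca.org/login.asp. The .html file I save is not the JavaScript-rendered version, but the .png file is. I'd like the .html file to be rendered as well.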

import base64

import scrapy
from scrapy_splash import SplashRequest

class TestSpider(scrapy.Spider):
    name = "TestSpider 1"

    def start_requests(self):
        url = 'https://apps.gsccca.org/login.asp'
        splash_args = {
            'wait': 0.5,
            'html': 1,
            'png': 1,
            'width': 600,
            'render_all': 1,
        }
        yield SplashRequest(url=url, callback=self.save_page, endpoint='render.json', args=splash_args)

    def save_page(self, response):
        # Save the response body as the HTML page.
        filename = 'html_page.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

        # The screenshot is base64-encoded under the 'png' key of the JSON result.
        png_bytes = base64.b64decode(response.data['png'])
        filename = 'some_image.png'
        with open(filename, 'wb') as f:
            f.write(png_bytes)

I expected the saved html_page.html to contain the JavaScript-rendered page.
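To make it explicit which part of the result ends up in each file, here is a minimal sketch that writes both outputs from the decoded render.json result (the dict available as response.data in the callback). The helper name save_render_json and its parameters are hypothetical. It assumes the result has an 'html' key holding the markup as a string and a 'png' key holding a base64-encoded screenshot, which is what html=1 and png=1 request. It is not a confirmed fix for the problem described here.

```python
import base64
from pathlib import Path


def save_render_json(data, html_path="html_page.html", png_path="some_image.png"):
    """Write the 'html' and 'png' fields of a decoded render.json result to disk.

    `data` is assumed to be a dict like the one scrapy-splash exposes as
    response.data. This helper is illustrative, not part of any library.
    Returns a (html_written, png_written) tuple of booleans.
    """
    # 'html' holds the page markup as a string (requested with html=1).
    html = data.get("html")
    if html is not None:
        Path(html_path).write_text(html, encoding="utf-8")

    # 'png' holds a base64-encoded screenshot (requested with png=1).
    png_b64 = data.get("png")
    if png_b64 is not None:
        Path(png_path).write_bytes(base64.b64decode(png_b64))

    return html is not None, png_b64 is not None
```

Inside the spider, the callback could then call save_render_json(response.data). If the file written from the 'html' key still lacks the JavaScript-generated content, the snapshot itself was probably taken before the scripts finished, and a larger 'wait' value may be worth trying. That is an assumption, not something verified against this page.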

0 answers:

No answers