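I'm teaching myself Python and have run into an interesting problem that I can't solve on my own, so I'm asking the experts.
I'm trying to use Scrapy-Splash to render this web page with JavaScript: https://apps.gsccca.org/login.asp. The .html file I save is not JavaScript-rendered, but the .png file is. I'd like the .html file to be rendered as well.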
import base64

import scrapy
from scrapy_splash import SplashRequest


class TestSpider(scrapy.Spider):
    name = "TestSpider 1"

    def start_requests(self):
        url = 'https://apps.gsccca.org/login.asp'
        # Ask Splash for both the HTML and a full-page PNG screenshot.
        splash_args = {
            'wait': 0.5,
            'html': 1,
            'png': 1,
            'width': 600,
            'render_all': 1,
        }
        yield SplashRequest(url=url, callback=self.save_page,
                            endpoint='render.json', args=splash_args)

    def save_page(self, response):
        # Save the response body as the HTML page.
        filename = 'html_page.html'
        with open(filename, 'wb') as f:
            f.write(response.body)
        # The screenshot comes back base64-encoded in the JSON payload.
        png_bytes = base64.b64decode(response.data['png'])
        filename = 'some_image.png'
        with open(filename, 'wb') as f:
            f.write(png_bytes)
I expected the saved html_page.html to be the JavaScript-rendered page.
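One thing I could do to narrow this down is read the HTML from the `html` key of the render.json payload instead of from `response.body`, so that the HTML and the PNG are known to come from the same Splash result. Below is a minimal sketch of that saving step, not a confirmed fix. The helper name `save_render_json` is hypothetical, and it assumes the payload is the dict that scrapy-splash exposes as `response.data` for a request made with `html=1` and `png=1`.

```python
import base64
from pathlib import Path


def save_render_json(payload, html_path="html_page.html", png_path="some_image.png"):
    """Write the 'html' and 'png' fields of a decoded render.json payload to disk.

    `payload` is assumed to be the dict decoded from a Splash render.json
    response requested with html=1 and png=1 (hypothetical helper).
    """
    html_file = Path(html_path)
    png_file = Path(png_path)
    # 'html' should hold the DOM serialized after Splash ran the page's JavaScript.
    html_file.write_text(payload["html"], encoding="utf-8")
    # 'png' is a base64-encoded screenshot, decoded the same way as in the spider above.
    png_file.write_bytes(base64.b64decode(payload["png"]))
    return str(html_file), str(png_file)
```

Inside the spider this would be called as `save_render_json(response.data)` from `save_page`. If the file written this way matches the one written from `response.body`, then the saved HTML is already what Splash rendered, and the difference I see is probably in how the saved file is displayed rather than in what was saved.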