Question

Is it possible to save a screenshot of the Scrapy response for a web page, i.e. after running

scrapy shell "https://google.com" 
view(response)

I know I can save the output as HTML and view it later, but is there a way to save the output as an image?

I checked the question "Scrapy Splash Screenshots?" (the most relevant one I found), but when I tried its approach I got

png_bytes = base64.b64decode(response.data['png'])
Traceback (most recent call last):
  File "/usr/lib/python3.6/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
AttributeError: 'HtmlResponse' object has no attribute 'data'

I think this error occurs because that question uses a Splash request, whereas in my case it is a normal Request.
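For context, the decoding step quoted above only makes sense on a response that carries a JSON payload. As far as I understand, the scrapy-splash plugin exposes that payload as response.data when the request goes through Splash's render.json endpoint with png=1; the plain HtmlResponse returned by scrapy shell has no such attribute. Below is a minimal sketch of the decoding step on its own, assuming such a payload is already available as a dict. The helper name save_png_from_splash_data is hypothetical, not part of Scrapy or Splash.

```python
import base64


def save_png_from_splash_data(data, path="screenshot.png"):
    """Decode the base64-encoded 'png' field of a Splash JSON result and save it.

    `data` is assumed to be a dict shaped like the one in the linked question,
    i.e. with a 'png' key holding base64 text. This helper is hypothetical.
    """
    png_bytes = base64.b64decode(data["png"])
    with open(path, "wb") as f:
        f.write(png_bytes)
    # Return the number of bytes written so the caller can sanity-check it.
    return len(png_bytes)
```

With a plain Request there is no 'png' field to decode, so the page has to be rendered by something (Splash or a browser) before any image exists.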

Answer 1

Do you mean a screenshot of the web page, or of the output in the command line?

The other question you linked seems to be about taking screenshots of the page being scraped.

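The Scrapy documentation has some information on this at https://docs.scrapy.org/en/latest/topics/item-pipeline.html?highlight=screenshot#take-screenshot-of-item, though I'm not sure whether anything needs to be started for it to work.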

Answer 2

Splash is the most common way to do this.

Once you have a rough understanding of Splash, see https://splash.readthedocs.io/en/stable/api.html#render-png.
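As a rough illustration of the render.png endpoint linked above, the sketch below asks a Splash instance for a PNG of a page and writes it to disk using only the standard library. It assumes Splash is already running and reachable at http://localhost:8050 (its default port); the helper names build_render_png_url and save_screenshot, and the chosen wait and width values, are illustrative assumptions rather than anything prescribed by Splash.

```python
import urllib.parse
import urllib.request

# Assumption: a Splash instance is listening at this address.
SPLASH_URL = "http://localhost:8050"


def build_render_png_url(page_url, splash_url=SPLASH_URL, wait=0.5, width=1024):
    """Build a render.png request URL for page_url (hypothetical helper)."""
    # url, wait and width are arguments documented for the render.png endpoint.
    query = urllib.parse.urlencode({"url": page_url, "wait": wait, "width": width})
    return f"{splash_url}/render.png?{query}"


def save_screenshot(page_url, path="screenshot.png"):
    """Ask Splash to render page_url and write the returned PNG bytes to path."""
    with urllib.request.urlopen(build_render_png_url(page_url)) as resp:
        png_bytes = resp.read()
    with open(path, "wb") as f:
        f.write(png_bytes)
    return path
```

Calling save_screenshot("https://google.com") should then leave a screenshot.png in the working directory, provided Splash is up. Because this goes straight to Splash over HTTP, it sidesteps the missing response.data attribute entirely; inside a spider, the scrapy-splash plugin would be the more usual route.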

Saving a screenshot of a Scrapy response
