Question

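I am scraping this site: http://www.germandeli.com/Meats/Sausages, which contains some dynamic content.

I am using scrapy shell with Splash to render the JavaScript, but the XPath query returns an empty list []. My system is Ubuntu 14.04 LTS.

Here is the code I am using: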

$ scrapy shell 'http://localhost:8050/render.html?url=http://www.germandeli.com/Meats/Sausages'
>>> response.xpath('*//h2[@class="item-cell-name"]/a/@href').extract()

Any hints would be much appreciated!

Answer 1

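I figured it out. I forgot to add '&timeout=10&wait=5' to the end of the URL! Without a wait, Splash returns the HTML before the page's JavaScript has filled in the product list, so the XPath matches nothing.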

scrapy shell 'http://localhost:8050/render.html?url=http://www.germandeli.com/Meats/Sausages&timeout=10&wait=5'
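If you end up building these URLs in a script rather than typing them, here is a minimal sketch of one way to do it. The helper name splash_render_url and its default values are my own invention for illustration, not part of Scrapy or Splash; only the render.html endpoint and the url, wait and timeout arguments come from the command above. It percent-encodes the target URL so that wait and timeout are read by Splash and cannot be mistaken for part of the target's own query string. As far as I know, Splash expects timeout to be larger than wait.

```python
from urllib.parse import urlencode


def splash_render_url(target_url, wait=5, timeout=10,
                      splash_base="http://localhost:8050"):
    """Build a Splash render.html URL for target_url.

    Hypothetical helper: the name and defaults are assumptions for this
    sketch. splash_base assumes a local Splash instance on port 8050,
    as in the question.
    """
    # urlencode percent-encodes the target URL, so any '&' or '?' inside
    # it stays part of the 'url' argument instead of leaking into the
    # Splash query string.
    query = urlencode({"url": target_url, "timeout": timeout, "wait": wait})
    return f"{splash_base}/render.html?{query}"


if __name__ == "__main__":
    # Print a URL that can be pasted after `scrapy shell`.
    print(splash_render_url("http://www.germandeli.com/Meats/Sausages"))
```

Pass the printed URL to scrapy shell in single quotes, the same way as in the command above.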

Using scrapy shell with Splash returns an empty list
