I'm having trouble scraping a website with Scrapy. This should be simple, but I can't get it to work:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "bookstore"
    start_urls = [
        'https://example.com/materias/?novedades=LC&p',
    ]

    def parse(self, response):
        for quote in response.css('div#results'):
            yield {
                'book_url': quote.css('ul.resultList li div.content h4 a::attr(href)').extract(),
            }

        next_page = response.css('div#paginat ul li.next a::attr(href)').extract_first()
        if next_page is not None:
            next_page = response.urljoin(next_page)
            yield scrapy.Request(next_page, callback=self.parse)
I run it with:
scrapy crawl bookstore -o bookstore.json
It crawls the site, but it does not save any results to the JSON file.
Any idea why this happens?
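
One thing that may be worth ruling out: the spider only yields items if `div#results` matches something, and only follows pagination if `div#paginat` matches, so an empty output file could simply mean those ids are not in the HTML that Scrapy receives. Below is a minimal sketch of such a check using only the Python standard library. The helper names (`IdCollector`, `missing_ids`) and the sample HTML are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every id attribute found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def missing_ids(html, wanted=("results", "paginat")):
    """Return the ids from `wanted` that do not occur in `html`."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return [i for i in wanted if i not in collector.ids]


# Hypothetical page fragment, invented for this example only.
sample_html = """
<div id="results">
  <ul class="resultList">
    <li><div class="content"><h4><a href="/book/1">Book 1</a></h4></div></li>
  </ul>
</div>
"""

# Lists the ids the selectors rely on that are absent from the markup.
print(missing_ids(sample_html))
```

In the spider itself, the same check could presumably be run on `response.text` at the top of `parse`; if either id comes back as missing, the selectors would match nothing and no items would reach the feed.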