Question

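I'm new to Scrapy and have just written my first spider, but I'm getting a TypeError.

The spider simply scrapes the quotes from the first page of Goodreads: 30 quotes, each with its tags and author name.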

import scrapy

class Goodreadspider(scrapy.Spider):

    name = 'goodreads'

    def start_requests(self):
        url = ['https://www.goodreads.com/quotes?page=1']
        yield scrapy.Request(url=url, callback=self.parse)
    def parse(self, parse):
        for quote in response.selector.xpath("//div[@class='quote']"):
            yield{
            'text': quote.xpath("//div[@class='quoteText']/text()[1]").extract_first,
            'author': quote.xpath("//div[@class='quoteText']/child::a/text()").extract_first,
            'tags': quote.xpath("//div[@class='greyText smallText left']/a/text()").extract()
            }

TypeError: Request url must be str or unicode, got %s:

Answer 1

I think you get this error because you are passing a list, whereas `scrapy.Request` requires the URL to be a str or unicode.

Try this:

    def start_requests(self):
        # Pass a single URL string, not a list containing the string
        url = 'https://www.goodreads.com/quotes?page=1'
        yield scrapy.Request(url=url, callback=self.parse)

That should work.

Answer 2

Have you tried removing the []?

url = 'https://www.goodreads.com/quotes?page=1'
