Question

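I'm using Scrapy and want to collect all the matching tags (with their text) into a list, then loop over that list to pull the fields I need out of each div, one div at a time.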

In my code below, storing all the tags I want in divs works fine, but the loop after it fails with the error: AttributeError: 'str' object has no attribute 'css'

def parse_0(self, response):
    divs = response.css('div.resultList.mB15.hiddenOverflow.listing').extract()

    for div in divs:
        yield {
            'prix': str(div.css('div.fieldPrice ::text').extract_first()).replace("\\xa0", "").replace("\u20ac", ""),
            'lien': div.xpath('.//a/@href').extract_first(),
            'date_scrap': time.strftime("%d/%m/%Y"),
        }

Here is an image showing the nested divs: (image not included). Thanks

Answer 1

Don't call extract() on the selector. extract() returns a list of strings, and a string has no css() or xpath() method.
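A minimal sketch of that fix, assuming the rest of the question's code stays as it is. It is written as a standalone function so it can be read on its own; inside the spider it would keep its `self` parameter. The `default=""` argument and the `"\xa0"` literal are assumptions about the intent (the question's `"\\xa0"` matches a literal backslash sequence rather than a non-breaking space).

```python
import time


def parse_0(response):
    # No .extract() here: css() returns Selector objects,
    # so each div can still be queried with css() / xpath().
    divs = response.css('div.resultList.mB15.hiddenOverflow.listing')

    for div in divs:
        # default="" avoids turning a missing price into the string "None"
        # (assumption: an empty price is acceptable for such rows).
        raw_price = div.css('div.fieldPrice ::text').extract_first(default="")
        yield {
            # Assumption: the goal is to strip non-breaking spaces and the euro sign.
            'prix': raw_price.replace("\xa0", "").replace("\u20ac", ""),
            'lien': div.xpath('.//a/@href').extract_first(),
            'date_scrap': time.strftime("%d/%m/%Y"),
        }
```

extract_first() is only called at the very end, on the innermost query, which is where a plain string is actually wanted.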

Answer 2

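divs = response.css('div.resultList.mB15.hiddenOverflow.listing').extract() Here, when you use extract(), it returns the list of selectors converted to strings. If you want to keep using extract(), convert each div back into a Selector before querying it. Otherwise, skip extract() and the code should run fine.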

Thanks.

Scrapy: looping over a list of divs
