Question

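I am scraping this website: https://www.mrlodge.de/wohnungen/. I wrote a spider that works well and extracts all the information from each offer. Each offer also has a detail page, and I can get its URL with the following XPath expression: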

x = listings.xpath('//form/input[@name="name_url"]/@value').getall()

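The problem is that this element sits inside the offer block, but not inside the div tag I am already looping over. I tried the following, but I only get the first detail_url for every item. There must be a way to include it in the for loop, but I just don't know how.

Please help.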

def parse(self, response):
    json_response = json.loads(response.body)
    listings = Selector(text=json_response.get('list'))
    x = listings.xpath('//form/input[@name="name_url"]/@value').get()

    for listing in listings.xpath("//div[@class='mrlobject-list__item__content']"):
        yield {
            'title': listing.xpath('.//div[@class="obj-smallinfo"]/text()').get(),
            'rent': listing.xpath(".//span[@class='obj-rent']/text()").get(),
            'room': listing.xpath(".//span[@class='obj-room']/text()").get(),
            'area': listing.xpath(".//span[@class='obj-area']/text()").get(),
            'info': listing.xpath(".//div[@class='object-title']/text()").get(),
            'detail_url': x
        }
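
One possible way to merge the two blocks, given as a minimal sketch rather than a confirmed fix: collect every detail URL with getall() (get() returns only the first match, which is why each item above ends up with the same value) and pair the URLs with the content divs by position using zip(). This assumes the page contains exactly one name_url input per offer, in the same document order as the content divs; that has not been verified against the site. The helper name iter_offers is hypothetical, and `listings` is expected to be the Selector built in parse().

```python
def iter_offers(listings):
    """Yield one dict per offer, pairing each content div with a detail URL.

    ASSUMPTION (not verified against the site): there is exactly one
    name_url input per offer, in the same order as the content divs.
    """
    blocks = listings.xpath("//div[@class='mrlobject-list__item__content']")
    # getall() returns every match as a list; get() would return only the first.
    detail_urls = listings.xpath('//form/input[@name="name_url"]/@value').getall()

    # Fail loudly instead of silently mispairing offers and URLs.
    if len(blocks) != len(detail_urls):
        raise ValueError(
            f"{len(blocks)} content blocks but {len(detail_urls)} detail URLs"
        )

    # zip() walks both sequences in parallel, one (div, url) pair per offer.
    for listing, detail_url in zip(blocks, detail_urls):
        yield {
            'title': listing.xpath('.//div[@class="obj-smallinfo"]/text()').get(),
            'rent': listing.xpath(".//span[@class='obj-rent']/text()").get(),
            'room': listing.xpath(".//span[@class='obj-room']/text()").get(),
            'area': listing.xpath(".//span[@class='obj-area']/text()").get(),
            'info': listing.xpath(".//div[@class='object-title']/text()").get(),
            'detail_url': detail_url,
        }
```

Inside parse(), the loop could then be replaced with `yield from iter_offers(listings)`. If the counts do not match, positional pairing is not safe, and a relative XPath from each offer's enclosing element would be needed instead.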

Scrapy XPath: looping over two different blocks and merging the results

0 answers: