In this program I am trying to get all of the rental prices for Ottawa, but each run returns only a single, seemingly random price. Why?
import scrapy


class RentalPricesSpider(scrapy.Spider):
    name = 'rental_prices'
    allowed_domains = ['www.kijiji.ca']
    start_urls = ['https://www.kijiji.ca/b-real-estate/ottawa/c34l1700185']

    def parse(self, response):
        rental_price = response.xpath('normalize-space(//div[@class="price"]/text())').getall()
        yield {
            'rent': rental_price,
        }
Answer 0 (score: 0)
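Your XPath is the problem, which is why you are not getting the expected output. normalize-space() is an XPath string function: when it is given a node-set, only the first node is converted to a string, so the whole expression evaluates to a single value and getall() returns a one-element list. Use the CSS selector div.price::text instead of the XPath, and strip the whitespace from each result in Python.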
import scrapy
from scrapy.crawler import CrawlerProcess


class RentalPricesSpider(scrapy.Spider):
    name = 'rental_prices'
    allowed_domains = ['www.kijiji.ca']
    start_urls = ['https://www.kijiji.ca/b-real-estate/ottawa/c34l1700185']

    def parse(self, response):
        # Select the text node of every div.price element on the page.
        rental_price = response.css('div.price::text').getall()
        # Trim surrounding whitespace and drop entries that are empty.
        rental_price = [x.strip() for x in rental_price if x.strip()]
        yield {
            'rent': rental_price,
        }


# Run the spider as a plain script and export the scraped items to items.json.
process = CrawlerProcess(settings={
    "USER_AGENT": "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36",
    "FEEDS": {
        "items.json": {"format": "json"},
    },
})
process.crawl(RentalPricesSpider)
process.start()