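I'm quite new to this and am building my first spider step by step. I want to pass several different URLs to start_urls, so I put the parameters for each URL in a list and loop over that list to fill start_urls. The problem is that when I run the spider, it only takes one URL from the list and then stops.
The data comes back correctly, but only for one of the URLs; it never works through the whole list. What am I doing wrong? Thanks.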

    class alquilerVehiculo1(CrawlSpider):
        plantilla = ("https://www.rentalcars.com/SearchResults.do?country=Argentina&doYear={año_devolucion}&doFiltering=true"
                     "&fromLocChoose=true&filterTo=49&dropLocationName={localidad}&ftsType=C&ftsLocationSearch={codigoLocalidad}"
                     "&dropFtsSearch=L&doDay={dia_devolucion}&searchType=allareasgeosearch&filterFrom=0&puMonth={mes_solicitud}&dropFtsInput={localidad}&dropCountry=Argentina"
                     "&puDay={dia_solicitud}&dropFtsLocationSearch={codigoLocalidad}&puHour=10&dropFtsEntry=22776&enabler=&distance=10"
                     "&ftsEntry=22776&city={localidad}&driverage=on&filterName=CarCategorisationSupplierFilter&dropCity={localidad}"
                     "&dropFtsType=C&ftsAutocomplete={localidad}+Argentina&driversAge=30&dropFtsAutocomplete={localidad}+Argentina"
                     "&dropFtsLocationName={localidad}&dropCountryCode=&doMinute=0&countryCode=&puYear={año_solicitud}&locationName=&puMinute=0&ftsInput={localidad}"
                     "&coordinates={cordenadas}&dropLocation={codigoLocalidad}&doHour=10&dropCoordinates={cordenadas}"
                     "&ftsLocationName={localidad}&ftsSearch=L&location={codigoLocalidad}&doMonth={mes_devolucion}&reducedCategory=medium&filterAdditionalInfo=&advSearch=&exSuppliers=&ordering=price")
        casos = [{"localidad": "Salta",
                  "codigoLocalidad": "161",
                  "cordenadas": "-24.7833%2C-65.4167"},
                 {"localidad": "Mendoza",
                  "codigoLocalidad": "106",
                  "cordenadas": "-32.889%2c-68.843"}]
        dias_Semana = date.today() + timedelta(7)
        dt_3 = dias_Semana + timedelta(3)
        for datos in casos:
            datos.update({"año_devolucion": dt_3.year, "dia_devolucion": dt_3.day, "mes_solicitud": dias_Semana.month, "dia_solicitud": dias_Semana.day, "año_solicitud": dias_Semana.year, "mes_devolucion": dt_3.month})
            #print(plantilla.format(**datos))
            name = 'alquilerVehiculoMediano'
            start_urls = [plantilla.format(**datos)]

        def parse(self, response):
            for folow_url in response.css("a.show-cars-link::attr(href)").extract():
                url = response.urljoin(folow_url)
                yield Request(url, callback=self.populate_item)
            # yield self.paginate(response)

        def populate_item(self, response):
            item_loader = ItemLoader(item=ReporteinmobiliarioItem(), response=response)
            item_loader.default_input_processor = MapCompose(remove_tags)
            item_loader.add_css('compania', 'div.carResultRow_OfferInfo_Supplier-wrap>h4::text')
            item_loader.add_css('valor', 'span[class="carResultRow_Price-now"]::text')  # 'span.carResultRow_Price-now::text')
            item_loader.add_css('dias', 'span.carResultRow_Price-duration::text')
            item_loader.add_value('tipoVehiculo', 'Coche Mediano')
            item_loader.add_css('modelo', 'td.carResultRow_CarSpec>h2::text')
            item_loader.add_css('recogida_devolucion', 'div.search-summary__location::text')
            yield item_loader.load_item()

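Answer 0 (score: 0)
Your code overwrites the variable on every pass through the loop, so when the loop finishes start_urls holds only the URL built from the last entry in casos: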

    for datos in casos:
        start_urls = [plantilla.format(**datos)]
                   ^^^^^

It should be:

    start_urls = []
    for datos in casos:
        start_urls.append(plantilla.format(**datos))
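
To check the fix without running a crawl, here is a minimal sketch of the same idea in plain Python. It is only an illustration: PLANTILLA is a shortened, hypothetical stand-in for the long template in the question, build_start_urls is a helper name I made up, and SpiderSketch is an ordinary class standing in for the spider, so nothing here depends on Scrapy.

```python
from datetime import date, timedelta

# Shortened, hypothetical stand-in for the long URL template in the question.
PLANTILLA = ("https://example.com/search?city={localidad}"
             "&location={codigoLocalidad}&puDay={dia_solicitud}")

CASOS = [
    {"localidad": "Salta", "codigoLocalidad": "161"},
    {"localidad": "Mendoza", "codigoLocalidad": "106"},
]


def build_start_urls(plantilla, casos, pickup):
    """Return one formatted URL per entry in casos (hypothetical helper)."""
    urls = []
    for datos in casos:
        # Merge the date field into a copy so the shared dicts stay untouched.
        params = dict(datos, dia_solicitud=pickup.day)
        # Append instead of reassigning, so earlier URLs are kept.
        urls.append(plantilla.format(**params))
    return urls


class SpiderSketch:
    # Plain class standing in for the spider; in the real project this
    # would be the Scrapy spider class from the question.
    name = "alquilerVehiculoMediano"
    start_urls = build_start_urls(PLANTILLA, CASOS, date.today() + timedelta(7))


# One URL per entry in CASOS, in the same order.
for url in SpiderSketch.start_urls:
    print(url)
```

I used a helper function rather than a list comprehension in the class body on purpose: inside a class body, a comprehension cannot see other class-level names such as plantilla, so `[plantilla.format(**d) for d in casos]` written there raises a NameError. The explicit loop with append from the answer, or a module-level helper like this one, avoids that.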