I started with Python and scraping a week ago, so please go easy on me. parse() issues a request whose callback is parseCatL1(), and parseCatL1() in turn issues a request whose callback is detailCollector(). The request yielded in parseCatL1() never reaches detailCollector().

    def parse(self, response):
        self.log('url entered is :' + response.url)
        item = BasicItem()
        for mainCat in response.css('ul.mm-listview > li'):
            item['title'] = mainCat.css('a > span.meditle1::text').extract_first()
            item['url'] = mainCat.css('a::attr(href)').extract_first()
            item['url6'] = item['url']
            request = scrapy.Request(item['url'], callback=self.parseCatL1)
            request.meta['item'] = item['url6']
            request.meta['item'] = item
            yield request

    def parseCatL1(self, response):
        self.log('l1 url entered is :' + response.url)
        item = response.meta['item']
        item['listing'] = response.css('ul.mm-listview > li')
        if item['listing'] == []:
            print 'l1 if entered'
            item['url6'] = item['url']
            request = scrapy.Request(item['url6'], callback=self.detailCollector)
            request.meta['item'] = item['url6']
            print request
            yield request

    def detailCollector(self, response):
        print 'detail collector entered'

The print request statement in parseCatL1() prints <GET https://www.some.com>. If the request really is being yielded, why is 'detail collector entered' never printed?
Below are the logs:

    DEBUG: Crawled (200) <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers/nct-10002128> (referer: https://www.justdial.com/Bangalore/311/11060105_3/AC-Compressor-Dealers_b2c)
    DEBUG: l1 url entered is :https://www.justdial.com/Bangalore/AC-Compressor-Dealers/nct-10002128
    l1 if entered
    https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136
    <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136>
    DEBUG: Filtered duplicate request: <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
    DEBUG: Crawled (200) <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Daikin/nct-10002130> (referer: https://www.justdial.com/Bangalore/311/11060105_3/AC-Compressor-Dealers_b2c)
    DEBUG: l1 url entered is :https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Daikin/nct-10002130
    l1 if entered
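
One thing I noticed while staring at the "Filtered duplicate request" line: the URL printed inside parseCatL1() is not the URL of the page being parsed. My guess is that this happens because item is created once, outside the for loop in parse(), so every request carries a reference to the same object. The sketch below reproduces that pattern without Scrapy. Plain dicts stand in for BasicItem and tuples stand in for scrapy.Request, and the function names and example.com URLs are made up for illustration. It assumes the callbacks run only after the loop has moved on.

```python
# Hypothetical stand-ins: a dict replaces BasicItem, a (url, meta) tuple replaces scrapy.Request.

def parse_shared(urls):
    item = {}                  # created once, outside the loop (as in parse() above)
    for url in urls:
        item['url'] = url      # mutates the same object on every iteration
        yield (url, item)      # every "request" holds a reference to that one dict


def parse_fresh(urls):
    for url in urls:
        item = {'url': url}    # a new dict per iteration
        yield (url, item)


urls = ['https://example.com/a', 'https://example.com/b', 'https://example.com/c']

# Consume the generators fully first, then inspect the metadata,
# mimicking callbacks that run after the loop has finished.
shared = [meta['url'] for _, meta in list(parse_shared(urls))]
fresh = [meta['url'] for _, meta in list(parse_fresh(urls))]

print(shared)  # the last URL, repeated three times
print(fresh)   # one distinct URL per request
```

If this is what is going on, creating the item inside the loop should give each callback its own URL. Even then, parseCatL1() would be requesting the page it was just called for, so the duplicate filter would presumably still drop that request unless it is created with dont_filter=True.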