Yielded request is not working in a sub-function (Scrapy)

Time: 2017-12-18 05:38:17

Tags: python web-scraping scrapy

I started with Python and scraping a week ago, so please go easy on me. parse() yields a request whose callback is parseCatL1(), and parseCatL1() yields a request whose callback is detailCollector(). The request yielded in parseCatL1() never reaches detailCollector().

def parse(self, response):
    self.log('url entered is :' + response.url)
    item = BasicItem()
    for mainCat in response.css('ul.mm-listview > li'):
        item['title'] = mainCat.css('a > span.meditle1::text').extract_first()
        item['url'] = mainCat.css('a::attr(href)').extract_first()
        item['url6'] = item['url']
        request = scrapy.Request(item['url'], callback=self.parseCatL1)
        request.meta['item'] = item['url6']
        request.meta['item'] = item
        yield request  

def parseCatL1(self, response):
    self.log('l1 url entered is :' + response.url)
    item = response.meta['item']
    item['listing'] = response.css('ul.mm-listview > li')
    if item['listing'] == []:
        print 'l1 if entered'
        item['url6'] = item['url']
        request = scrapy.Request(item['url6'], callback=self.detailCollector)
        request.meta['item'] = item['url6']
        print request
        yield request

def detailCollector(self, response):
    print 'detail collector entered'

The print request statement in parseCatL1() prints <GET https://www.some.com>. If the request is being yielded, why is 'detail collector entered' never printed?
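One thing that may be worth checking is that parse() creates a single item before the loop and attaches that same object to every request. The sketch below reproduces the effect in plain Python, with no Scrapy involved; the function names and example.com URLs are made up for illustration, and list() only stands in for the fact that callbacks usually run after the loop has moved on.

```python
# Plain-Python sketch (no Scrapy): what happens when every yielded
# "request" carries a reference to the same mutable item.

def make_requests_shared(urls):
    item = {}                       # created once, outside the loop
    for url in urls:
        item['url'] = url           # overwrites the same dict every time
        yield {'url': url, 'meta': {'item': item}}


def make_requests_fresh(urls):
    for url in urls:
        item = {'url': url}         # a new dict for each request
        yield {'url': url, 'meta': {'item': item}}


urls = [
    'https://example.com/a',
    'https://example.com/b',
    'https://example.com/c',
]

# Exhaust both generators first, then look at what each request carries.
shared = list(make_requests_shared(urls))
fresh = list(make_requests_fresh(urls))

shared_seen = [r['meta']['item']['url'] for r in shared]
fresh_seen = [r['meta']['item']['url'] for r in fresh]

print(shared_seen)  # every entry is the URL from the last iteration
print(fresh_seen)   # each entry matches its own request
```

If the same thing happens in the spider, item['url'] inside parseCatL1() would be the URL of the last category in the list rather than the URL of the page being parsed.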

Below are the logs:

DEBUG: Crawled (200) <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers/nct-10002128> (referer: https://www.justdial.com/Bangalore/311/11060105_3/AC-Compressor-Dealers_b2c)

DEBUG: l1 url entered is :https://www.justdial.com/Bangalore/AC-Compressor-Dealers/nct-10002128
l1 if entered
https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136
<GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136>

DEBUG: Filtered duplicate request: <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Tecumseh/nct-10002136> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)

DEBUG: Crawled (200) <GET https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Daikin/nct-10002130> (referer: https://www.justdial.com/Bangalore/311/11060105_3/AC-Compressor-Dealers_b2c)

DEBUG: l1 url entered is :https://www.justdial.com/Bangalore/AC-Compressor-Dealers-Daikin/nct-10002130
l1 if entered

0 answers:

No answers