Scrapy Request callback method is never called

Time: 2014-04-22 04:35:42

Tags: python-2.7 request scrapy

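I am building a CrawlSpider with Scrapy 0.22.2 on Python 2.7.3 and have run into a problem with Requests: the callback method I specify is never called. Here is a snippet of my parse method, which issues a Request in the elif block: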

elif current_status == "Superseded":
        #Need to do more work here. Have to check whether there is a replacement unit available. If there isn't, download whatever outline is there
        # We need to look for a <td> element which contains "Is superseded by " and follow that link
        updated_unit = hxs.xpath('/html/body/div[@id="page"]/div[@id="layoutWrapper"]/div[@id="twoColLayoutWrapper"]/div[@id="twoColLayoutLeft"]/div[@class="layoutContentWrapper"]/div[@class="outer"]/div[@class="fieldset"]/div[@class="display-row"]/div[@class="display-row"]/div[@class="display-field-info"]/div[@class="t-widget t-grid"]/table/tbody/tr[1]/td[contains(., "Is superseded by ")]/a')
        # need child element a
        updated_unit_link = updated_unit.xpath('@href').extract()[0]
        updated_url = "http://training.gov.au" + updated_unit_link
        print "\033[0;31mSuperceded by "+updated_url+"\033[0m" # prints in Red for superseded, need to follow this link to current
        yield Request(url=updated_url, callback='sortSuperseded', dont_filter=True)

def sortSuperseded(self, response):
    print "\033[0;35mtest callback called\033[0m"

This runs without errors and the URL is correct, but sortSuperseded is never called: I never see "test callback called" printed in the console.

The URL I extract is also within the domain I specified for the CrawlSpider.

allowed_domains = ["training.gov.au"]

Where am I going wrong?

1 Answer:

Answer 0 (score: 0)

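The callback method name should not be in quotes. Request expects a callable, such as a bound method, rather than a string containing the method's name. Change this line:

yield Request(url=updated_url, callback='sortSuperseded', dont_filter=True)

to:

yield Request(updated_url, callback=self.sortSuperseded, dont_filter=True)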
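
The difference between the two lines can be checked without Scrapy. The sketch below is only an illustration: FakeSpider and resolve_callback are hypothetical names that do not come from Scrapy or from the question, and the URL passed at the end is a made-up placeholder. It shows that a string is not callable while a bound method is, and how getattr can turn a method name into a callable if all you have is the name.

```python
class FakeSpider(object):
    """Hypothetical stand-in for the spider; not a Scrapy class."""

    def sortSuperseded(self, response):
        return "test callback called for %s" % response


def resolve_callback(spider, callback):
    """Return a callable for `callback`.

    Accepts either a callable or the name of a method on `spider`.
    Illustrative helper only; it is not part of Scrapy's API.
    """
    if callable(callback):
        return callback
    # Look the method up by name; raises AttributeError if it is missing.
    return getattr(spider, callback)


spider = FakeSpider()

# A plain string cannot be invoked as a callback.
print(callable('sortSuperseded'))        # False

# A bound method can.
print(callable(spider.sortSuperseded))   # True

# Resolving the name yields the same bound method, which can then be called.
cb = resolve_callback(spider, 'sortSuperseded')
print(cb("http://training.gov.au/placeholder"))  # placeholder URL, assumed
```

In the spider itself, passing callback=self.sortSuperseded directly, as in the corrected line, is the simpler option.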