Question

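I'm building a CrawlSpider with Scrapy 0.22.2 on Python 2.7.3 and have run into a problem with Requests: the callback method I specify is never called. Here is a snippet of my parse method, which issues a Request in the elif block: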

elif current_status == "Superseded":
        #Need to do more work here. Have to check whether there is a replacement unit available. If there isn't, download whatever outline is there
        # We need to look for a <td> element which contains "Is superseded by " and follow that link
        updated_unit = hxs.xpath('/html/body/div[@id="page"]/div[@id="layoutWrapper"]/div[@id="twoColLayoutWrapper"]/div[@id="twoColLayoutLeft"]/div[@class="layoutContentWrapper"]/div[@class="outer"]/div[@class="fieldset"]/div[@class="display-row"]/div[@class="display-row"]/div[@class="display-field-info"]/div[@class="t-widget t-grid"]/table/tbody/tr[1]/td[contains(., "Is superseded by ")]/a')
        # need child element a
        updated_unit_link = updated_unit.xpath('@href').extract()[0]
        updated_url = "http://training.gov.au" + updated_unit_link
        print "\033[0;31mSuperseded by "+updated_url+"\033[0m" # prints in red for superseded; need to follow this link to the current unit
        yield Request(url=updated_url, callback='sortSuperseded', dont_filter=True)

def sortSuperseded(self, response):
    print "\033[0;35mtest callback called\033[0m"

This runs without errors and the URL is fine, but sortSuperseded is never called: I never see 'test callback called' printed in the console.

The URL I'm extracting is also within the domain I specified for the CrawlSpider.

allowed_domains = ["training.gov.au"]

Where am I going wrong?

Answer 1

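The callback should not be wrapped in quotes: Request expects a callable (such as a bound method), not a string containing the method's name. Change this line: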

yield Request(url=updated_url, callback='sortSuperseded', dont_filter=True)

to

yield Request(updated_url, callback=self.sortSuperseded, dont_filter=True)
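To see why the quoted name does not work, here is a small sketch that does not use Scrapy at all. It only shows that a string is not callable while a bound method is, and how a name could be turned into a method by hand. FakeSpider and resolve_callback are hypothetical names made up for this illustration; they are not part of Scrapy's API.

```python
# Illustration only: FakeSpider and resolve_callback are hypothetical,
# not Scrapy classes or functions.

class FakeSpider(object):
    def sortSuperseded(self, response):
        # Stand-in for a real callback; just reports what it received.
        return "handled " + response


def resolve_callback(spider, callback):
    """Return a callable: pass callables through, look up strings on the spider."""
    if callable(callback):
        return callback
    return getattr(spider, callback)


spider = FakeSpider()

as_string = 'sortSuperseded'        # what the question passes
as_method = spider.sortSuperseded   # what the answer passes

string_is_callable = callable(as_string)   # a str cannot be invoked
method_is_callable = callable(as_method)   # a bound method can

# Looking the name up by hand yields the same bound method.
result = resolve_callback(spider, as_string)("http://training.gov.au/")
```

In the spider itself no such lookup is needed: passing self.sortSuperseded directly, as in the corrected line above, hands Request the bound method.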

Scrapy Request callback method never called
