Scrapy: simple rule does not follow links

Time: 2015-05-17 03:02:30

Tags: python scrapy

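I have a very simple Scrapy CrawlSpider, and I give it one simple rule: "crawl/follow any link that contains '/search/listings'". But the spider is not crawling or following any of these links.

I have confirmed that the start URL contains many links whose href includes '/search/listings', so the links are there.

Any ideas what is going wrong?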

The start URL "http://www.mywebsite.com/results" contains the links that I want the rule to apply to. Here is the spider:

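from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor


class MySpider(CrawlSpider):

    name = "MySpider"
    allowed_domains = ["mywebsite.com"]
    start_urls = ["http://www.mywebsite.com/results"]
    # Send every extracted link whose URL matches /search/listings to parse2
    rules = [Rule(LinkExtractor(allow=['/search/listings(.*)']), callback="parse2")]

    def parse2(self, response):
        # Reported problem: this callback is never called
        # (a log file can be configured with the LOG_FILE setting
        # rather than by calling log.start() inside the callback)
        self.logger.info("Page crawled: %s", response.url)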
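One way to narrow this down is to check the pattern on its own, outside the spider. The sketch below is a simplified approximation using only the standard library, not Scrapy's actual implementation: it assumes the allow pattern is applied as a regular expression searched against each link's absolute URL. The href values are hypothetical, since the question does not show the real links, and matching_links is an illustrative helper name.

```python
import re
from urllib.parse import urljoin

# Pattern and start URL copied from the spider in the question.
ALLOW_PATTERN = re.compile(r"/search/listings(.*)")
BASE_URL = "http://www.mywebsite.com/results"

# Hypothetical href values; the real page's links are not shown in the question.
SAMPLE_HREFS = [
    "/search/listings?page=2",
    "/search/listings/123",
    "/about",
    "http://www.mywebsite.com/search/listings",
]


def matching_links(base_url, hrefs, pattern):
    """Resolve each href against base_url and keep URLs the pattern matches anywhere."""
    absolute = [urljoin(base_url, href) for href in hrefs]
    return [url for url in absolute if pattern.search(url)]


if __name__ == "__main__":
    for url in matching_links(BASE_URL, SAMPLE_HREFS, ALLOW_PATTERN):
        print(url)
```

If the pattern matches the expected hrefs in a check like this but the callback still never runs, it may be worth confirming that the links are present in the HTML Scrapy actually downloads, for example by opening the start URL in scrapy shell and inspecting the response.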

0 answers:

No answers yet