How do I get the old URL when a redirect happens in scrapy?

Time: 2014-03-19 11:47:23

Tags: redirect scrapy

scrapy version: 0.20

Question:

start_urls = [URL1, URL2, URL3]

def parse(self, response):
    # Suppose URL2 is redirected to another URL.
    # I need to get the original start URL (before redirection).
    pass

I tried response.request.url, but it is the same as response.url.

Please help me.

1 answer:

Answer 0: (score: 10)

If you have RedirectMiddleware enabled (it should be enabled by default), you can try:

original_url = response.meta.get('redirect_urls', [response.url])[0]

For implementation details, see https://github.com/scrapy/scrapy/blob/master/scrapy/downloadermiddlewares/redirect.py#L35
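As a rough illustration of why this works: RedirectMiddleware issues a new request for each redirect target, which is why response.request.url ends up matching response.url, and it records the URLs it was redirected away from in meta['redirect_urls'], oldest first. The sketch below wraps the one-liner in a helper and exercises it on stand-in objects rather than real scrapy responses. The name get_original_url, the SimpleNamespace stand-ins, and the example.com URLs are assumptions made only for illustration.

```python
from types import SimpleNamespace


def get_original_url(response):
    """Return the URL that was requested before any redirects.

    When a redirect happened, meta['redirect_urls'] lists the URLs that
    were redirected away from, oldest first. When no redirect happened
    the key is absent, so fall back to response.url.
    """
    return response.meta.get('redirect_urls', [response.url])[0]


# Hypothetical stand-ins for scrapy responses, used only so this sketch
# runs without a crawl. They expose just the .url and .meta attributes.
redirected = SimpleNamespace(
    url='http://example.com/new',
    meta={'redirect_urls': ['http://example.com/old']},
)
direct = SimpleNamespace(url='http://example.com/page', meta={})

print(get_original_url(redirected))  # the pre-redirect URL
print(get_original_url(direct))      # unchanged, no redirect occurred
```

Inside a spider, the same helper could be called as get_original_url(response) at the top of parse, since a scrapy response exposes both .url and .meta.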