Scrapy version: 0.20
Question:
start_urls = [URL1, URL2, URL3]

def parse(self, response):
    # Suppose URL2 is redirected to another URL.
    # I need to get the original start URL (before redirection).
    pass
I tried response.request.url, but it is the same as response.url.
Please help.
Answer 0 (score: 10)
If you have RedirectMiddleware enabled
(it should be enabled by default), you can try:
original_url = response.meta.get('redirect_urls', [response.url])[0]
For implementation details, see https://github.com/scrapy/scrapy/blob/master/scrapy/downloadermiddlewares/redirect.py#L35
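To see why the fallback to [response.url] matters, here is a minimal sketch of the same lookup wrapped in a helper. The name get_original_url and the SimpleNamespace stand-ins are hypothetical, used only so the snippet runs without Scrapy; they mimic just the two attributes the one-liner reads (url and meta). It assumes RedirectMiddleware stores the chain of redirected-from URLs under meta['redirect_urls'], with the first entry being the URL originally requested.

```python
from types import SimpleNamespace


def get_original_url(response):
    """Return the URL that was requested first, before any redirects."""
    # When a redirect happened, meta['redirect_urls'] lists the URLs that
    # were redirected away from, oldest first. When no redirect happened the
    # key is absent, so fall back to a one-element list with response.url.
    return response.meta.get('redirect_urls', [response.url])[0]


# Hypothetical stand-ins for Scrapy responses (not real Scrapy objects).
redirected = SimpleNamespace(
    url='http://example.com/new',
    meta={'redirect_urls': ['http://example.com/old']},
)
direct = SimpleNamespace(url='http://example.com/page', meta={})

print(get_original_url(redirected))
print(get_original_url(direct))
```

Inside a spider, the same helper could be called at the top of parse(self, response) to recover the start URL for each response, whether or not it was redirected.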