Question

Scrapy version: 0.20

Problem:

start_urls = [URL1, URL2, URL3]

def parse(self, response):
    # Suppose URL2 is redirected to another URL.
    # I need to get the original start URL (before the redirect).
    pass

I tried response.request.url, but it is the same as response.url.

Please help.

Answer 1

If you have RedirectMiddleware enabled (it should be by default), you can try:

original_url = response.meta.get('redirect_urls', [response.url])[0]

For implementation details, see https://github.com/scrapy/scrapy/blob/master/scrapy/downloadermiddlewares/redirect.py#L35
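
To make the fallback behaviour of that one-liner easier to see, here is a minimal sketch that wraps it in a helper. The helper name get_original_url and the example.com URLs are made up for illustration, and the SimpleNamespace objects are only stand-ins for Scrapy responses so the snippet runs without Scrapy installed. The one thing taken from Scrapy is the 'redirect_urls' meta key, which RedirectMiddleware fills with the URLs the request passed through.

```python
from types import SimpleNamespace


def get_original_url(response):
    """Return the URL that was requested first, before any redirects.

    Works on any object exposing `.meta` (a dict) and `.url`, which is
    what a Scrapy response provides inside a spider callback.
    """
    # RedirectMiddleware records each URL the request went through under
    # 'redirect_urls'; the first entry is the URL originally requested.
    redirect_urls = response.meta.get("redirect_urls")
    # No redirect happened: the response URL is already the original one.
    return redirect_urls[0] if redirect_urls else response.url


# Stand-in responses (hypothetical data, not real Scrapy objects).
redirected = SimpleNamespace(
    url="http://example.com/new",
    meta={"redirect_urls": ["http://example.com/old"]},
)
direct = SimpleNamespace(url="http://example.com/page", meta={})

print(get_original_url(redirected))
print(get_original_url(direct))
```

Inside a spider, the same helper could be called as get_original_url(response) at the top of parse. If the request was redirected more than once, the list holds every intermediate URL in order, so index 0 is still the start URL.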
