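Looking at the urllib2 source, the simplest approach appears to be subclassing HTTPRedirectHandler and passing it to build_opener to override the default HTTPRedirectHandler, but that seems like a lot of (relatively complicated) work for something that ought to be simple.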
Answer 0 (score: 133)
Here is the Requests way:
import requests
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code, r.headers['Location'])
Answer 1 (score: 34)
Dive Into Python has a good chapter on handling redirects with urllib2. Another solution is httplib.
>>> import httplib
>>> conn = httplib.HTTPConnection("www.bogosoft.com")
>>> conn.request("GET", "")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
301 Moved Permanently
>>> print r1.getheader('Location')
http://www.bogosoft.com/new/location
Answer 2 (score: 11)
Here is a urllib2 handler that does not follow redirects:
import urllib
import urllib2

class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Wrap the 3xx response and return it instead of following Location.
        infourl = urllib.addinfourl(fp, headers, req.get_full_url())
        infourl.status = code
        infourl.code = code
        return infourl

    # Treat every other redirect status the same way.
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302

opener = urllib2.build_opener(NoRedirectHandler())
urllib2.install_opener(opener)
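For readers on Python 3, where urllib2 was merged into urllib.request, the same idea might look like the sketch below. It assumes that the fp argument handed to the error handler is already the response object, so it is returned unchanged. The names open_without_redirect and extra_handlers are hypothetical and exist only for this illustration.

```python
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hand 3xx responses back to the caller instead of following them."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Assumption: fp is the response object itself, so return it as-is
        # rather than issuing a second request to the Location target.
        return fp

    # Reuse the same behaviour for the other redirect status codes.
    http_error_300 = http_error_301 = http_error_302
    http_error_303 = http_error_307 = http_error_308 = http_error_302


def open_without_redirect(url, extra_handlers=()):
    """Open url with an opener that never follows redirects (hypothetical helper)."""
    opener = urllib.request.build_opener(NoRedirectHandler(), *extra_handlers)
    return opener.open(url)
```

Calling open_without_redirect on a URL that redirects should then return the 3xx response itself, with the target readable from resp.headers['Location'].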
Answer 3 (score: 8)
I think this will help:
from httplib2 import Http

def get_html(uri, num_redirections=0):  # use 0 so that redirects are not followed
    conn = Http()
    return conn.request(uri, redirections=num_redirections)
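Answer 4 (score: 7)
The redirections keyword in httplib2's request method is a red herring. If a redirect status code is received, it does not return the first response; it raises a RedirectLimit exception instead. To get the initial response back, set follow_redirects to False on the Http object: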
import httplib2
h = httplib2.Http()
h.follow_redirects = False
(response, body) = h.request("http://example.com")
Answer 5 (score: 5)
I second the pointer to Dive Into Python. Here is an implementation using urllib2 redirect handlers. More work than it should be? Maybe, shrug.
import sys
import urllib2

class RedirectHandler(urllib2.HTTPRedirectHandler):
    # Note: the parent handler follows the redirect before the exception is raised.
    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        result.status = code
        raise Exception("Permanent Redirect: %s" % 301)

    def http_error_302(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        result.status = code
        raise Exception("Temporary Redirect: %s" % 302)

def main(script_name, url):
    opener = urllib2.build_opener(RedirectHandler)
    urllib2.install_opener(opener)
    print urllib2.urlopen(url).read()

if __name__ == "__main__":
    main(*sys.argv)
Answer 6 (score: 5)
The shortest way is:
class NoRedirect(urllib2.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        # Returning None means no follow-up request is made.
        pass

noredir_opener = urllib2.build_opener(NoRedirect())
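With this approach the redirect is not followed, but the 3xx response is not returned either: it falls through to the default error handler and surfaces as an HTTPError. A minimal sketch of how a caller might read the status and Location header from that error, written for Python 3 (urllib.request in place of urllib2), is shown below. The helper name first_response and its opener parameter are hypothetical.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse every redirect by never building a follow-up request."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None stops the redirect; the 3xx response then reaches
        # the default error handler, which raises HTTPError.
        return None


def first_response(url, opener=None):
    """Return (status, Location header) without following redirects."""
    opener = opener or urllib.request.build_opener(NoRedirect())
    try:
        with opener.open(url) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Redirects (and 4xx/5xx) arrive here; the headers stay readable.
        status, location = err.code, err.headers.get("Location")
        err.close()
        return status, location
```

Catching HTTPError keeps the handler itself tiny, at the cost of treating a redirect as an exceptional path in the calling code.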