Is there a way to follow redirects automatically using the http.client module:
    import http.client

    def grab_url(host, path='/'):
        # Simple attribute container for the response body and headers.
        class Data: pass
        result = Data()
        try:
            con = http.client.HTTPConnection(host)
            con.request('GET', path)
            response = con.getresponse()
            # Only a plain 200 response is handled; redirects are ignored.
            if response.status == 200:
                result.content = response.read().decode('utf-8')
                result.headers = response.getheaders()
        except Exception as e:
            print(e)
        return result
The function above works as long as the request returns an HTTP 200 response, but I don't know how to handle redirects such as 301.
With pyCurl I only need to set FOLLOWLOCATION to True:
    from io import BytesIO
    import pycurl

    def grab_url(host, path='/'):
        buffer = BytesIO()
        c = pycurl.Curl()
        # Let libcurl follow Location headers on its own.
        c.setopt(c.FOLLOWLOCATION, True)
        c.setopt(c.URL, host + path)
        c.setopt(c.WRITEDATA, buffer)
        c.perform()
        status = c.getinfo(c.RESPONSE_CODE)
        c.close()
        if status == 200:
            return buffer.getvalue().decode('iso-8859-1')
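
As far as I can tell, http.client has no switch equivalent to FOLLOWLOCATION, so the redirect would have to be followed by hand. Below is a minimal sketch of how that might look, not a definitive implementation. The function name `grab_url_following_redirects`, the `max_redirects` limit, the `REDIRECT_CODES` set and the choice to take a full URL instead of host and path are all my own assumptions for illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption, adjust as needed).
REDIRECT_CODES = {301, 302, 303, 307, 308}

def grab_url_following_redirects(url, max_redirects=5):
    """GET url, following up to max_redirects redirects by hand.

    Returns a (status, headers, body_bytes) tuple for the final response.
    """
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        # Pick the connection class from the scheme of the current URL.
        conn_cls = (http.client.HTTPSConnection if parts.scheme == 'https'
                    else http.client.HTTPConnection)
        con = conn_cls(parts.netloc, timeout=10)
        try:
            path = parts.path or '/'
            if parts.query:
                path += '?' + parts.query
            con.request('GET', path)
            response = con.getresponse()
            location = response.getheader('Location')
            if response.status in REDIRECT_CODES and location:
                # Location may be relative, so resolve it against the current URL.
                url = urljoin(url, location)
                continue
            return response.status, response.getheaders(), response.read()
        finally:
            # Runs on return and on continue, so each connection is closed.
            con.close()
    raise RuntimeError('too many redirects')
```

It would be called as `status, headers, body = grab_url_following_redirects('http://example.com/')`, with the body returned as bytes so the caller can choose the encoding. If http.client is not a hard requirement, `urllib.request.urlopen` from the standard library follows redirects by default.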