Python 3 - Adding custom headers to a urllib.request request

Date: 2017-10-31 06:40:17

Tags: python-3.x web-crawler python-requests urllib

In Python 3, the following code fetches the HTML source of a web page.

import urllib.request
url = "https://docs.python.org/3.4/howto/urllib2.html"
response = urllib.request.urlopen(url)

response.read()

When using urllib.request, how can the following custom header be added to the request?

headers = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }

3 answers:

Answer 0 (score: 9)

Request headers can be customized by first creating a Request object and then passing it to urlopen.

import urllib.request
url = "https://docs.python.org/3.4/howto/urllib2.html"
hdr = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }

req = urllib.request.Request(url, headers=hdr)
response = urllib.request.urlopen(req)
response.read()

Source: Python 3.4 Documentation
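As a small supplement, headers can also be attached after the Request object is created, using add_header, and inspected before anything is sent. The sketch below makes no network call. Note that Request stores header names in capitalized form (User-agent), so that is the key used for the lookup.

```python
import urllib.request

url = "https://docs.python.org/3.4/howto/urllib2.html"

# Build the request first, then attach the header.
req = urllib.request.Request(url)
req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64)")

# Request normalizes the name with str.capitalize(), giving "User-agent".
print(req.has_header("User-agent"))
print(req.get_header("User-agent"))
```

Passing headers=hdr to the constructor and calling add_header afterwards store the header in the same place, so either style works with urlopen(req).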

Answer 1 (score: 3)

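import urllib.request

url = "https://docs.python.org/3.4/howto/urllib2.html"

opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
# install_opener makes this opener the default for all later urlopen calls
urllib.request.install_opener(opener)
response = urllib.request.urlopen(url)
response.read()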

For more details, see the Python documentation: https://docs.python.org/3/library/urllib.request.html
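If changing the process-wide default is not wanted, one possible alternative is to keep the opener local and call its open method directly instead of installing it. This is only a sketch: the helper names make_opener and fetch are hypothetical, chosen for illustration, and fetch is defined but not called, so nothing is sent over the network here.

```python
import urllib.request


def make_opener(user_agent):
    """Build an opener that sends the given User-Agent on every request."""
    opener = urllib.request.build_opener()
    # Assigning addheaders replaces the opener's default header list.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch(opener, url):
    """Fetch url through this opener only; plain urlopen stays unchanged."""
    with opener.open(url) as response:
        return response.read()


opener = make_opener("Mozilla/5.0")
print(opener.addheaders)
```

A call such as fetch(opener, "https://docs.python.org/3.4/howto/urllib2.html") would then send the custom header, while code elsewhere that calls urllib.request.urlopen keeps its default User-Agent.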

Answer 2 (score: 1)

# urlopen() does not accept a headers argument; passing headers= raises a TypeError.
# Wrap the URL and headers in a Request object and pass that to urlopen() instead.

from urllib.request import Request, urlopen
url = "https://docs.python.org/3.4/howto/urllib2.html"
header = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }
response = urlopen(Request(url, headers=header))
response.read()
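One way to check that a custom User-Agent actually reaches the server is to point the request at a throwaway local server that echoes the header back. The sketch below is an illustration under stated assumptions, not part of the original answers: the handler class EchoUserAgent is hypothetical, the server binds to a free port on 127.0.0.1, and an empty ProxyHandler is used so that proxy environment variables do not reroute the local request.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoUserAgent(BaseHTTPRequestHandler):
    """Reply to GET with the User-Agent header the client sent."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/" % server.server_address[1]
hdr = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"}
req = urllib.request.Request(url, headers=hdr)

# An opener with no proxies; open() accepts a Request just like urlopen().
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(req) as response:
    echoed = response.read().decode("utf-8")

server.shutdown()
server.server_close()
print(echoed)
```

If the header was applied, the echoed text matches the value in hdr rather than the default Python-urllib identifier.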