Question

In Python 3, the following code fetches the HTML source of a web page.

import urllib.request
url = "https://docs.python.org/3.4/howto/urllib2.html"
response = urllib.request.urlopen(url)

response.read()

When using urllib.request, how can the following custom header be added to the request?

headers = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }

Answer 1

Request headers can be customized by first creating a Request object and then passing it to urlopen.

import urllib.request
url = "https://docs.python.org/3.4/howto/urllib2.html"
hdr = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }

req = urllib.request.Request(url, headers=hdr)
response = urllib.request.urlopen(req)
response.read()

Source: Python 3.4 Documentation
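
As a variation on the approach above, headers can also be attached one at a time with Request.add_header() after the Request object exists. The sketch below makes no network call; build_request is a hypothetical helper name introduced only for illustration.

```python
import urllib.request


def build_request(url, extra_headers):
    """Return a Request for url with each header attached via add_header().

    build_request is a hypothetical helper, not part of urllib.
    """
    req = urllib.request.Request(url)
    for name, value in extra_headers.items():
        req.add_header(name, value)
    return req


req = build_request(
    "https://docs.python.org/3.4/howto/urllib2.html",
    {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"},
)

# Request stores header names in capitalized form ("User-agent"),
# so that is the spelling to use when reading the value back.
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen() then sends the request with that header, exactly as in the answer above.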

Answer 2

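import urllib.request

url = "https://docs.python.org/3.4/howto/urllib2.html"

# Build an opener whose header list replaces urllib's default User-agent.
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
# Install it globally so every later urlopen() call sends these headers.
urllib.request.install_opener(opener)
response = urllib.request.urlopen(url)
response.read()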

For details, see the Python documentation: https://docs.python.org/3/library/urllib.request.html
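
install_opener() changes the behavior of every later urlopen() call in the process. If that is broader than wanted, one possible alternative is to keep the opener local and call its open() method directly. This is a minimal sketch; make_opener is a hypothetical helper name, and no request is sent until open() is called.

```python
import urllib.request


def make_opener(user_agent):
    """Return an opener that sends the given User-agent, without installing it.

    make_opener is a hypothetical helper, not part of urllib.
    """
    opener = urllib.request.build_opener()
    # Assigning addheaders replaces the default Python-urllib User-agent entry.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


opener = make_opener("Mozilla/5.0")

# Usage (performs a real network request, so it is left commented out):
# response = opener.open("https://docs.python.org/3.4/howto/urllib2.html")
# html = response.read()
```

Only requests made through this opener carry the custom header; plain urllib.request.urlopen() calls elsewhere keep their default User-agent.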

Answer 3

# urlopen() opens the given URL, but it does not accept a headers argument.
# Wrap the URL and the headers in a Request object and pass that to urlopen().

from urllib.request import Request, urlopen
url = "https://docs.python.org/3.4/howto/urllib2.html"
header = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }
response = urlopen(Request(url, headers=header))
response.read()
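
To check that a custom header really reaches the server, one option is a throwaway local server that echoes back the User-Agent it receives. The sketch below uses only the standard library; EchoUserAgent and fetch_echoed_user_agent are hypothetical names introduced for illustration, and the empty ProxyHandler is there on the assumption that any configured proxy should be bypassed for the local address.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoUserAgent(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the User-Agent header it was sent."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def fetch_echoed_user_agent(user_agent):
    """Send a request with a custom User-Agent to a local echo server."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        # An empty ProxyHandler disables proxies for this opener.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_echoed_user_agent("Mozilla/5.0 (Windows NT 6.1; Win64; x64)"))
```

If the header is attached correctly, the string that comes back is the same one passed in.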

Python 3 - Adding custom headers to a urllib.request request
