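I am using urllib.request with Python 3.4.6 to open https://www.ethz.ch/
(the actual URL is longer, but the problem is the same). The page opens fine in Firefox, but Python raises a 404 error.
Here is the code: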
from urllib.request import urlopen
connection = urlopen('https://www.ethz.ch/')
and it produces the following error message:
Traceback (most recent call last):
  File "./generate_group_meetings_ical.py", line 9, in <module>
    connection = urlopen('https://www.ethz.ch/')
  File "/usr/lib64/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.4/urllib/request.py", line 470, in open
    response = meth(req, response)
  File "/usr/lib64/python3.4/urllib/request.py", line 580, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python3.4/urllib/request.py", line 508, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.4/urllib/request.py", line 442, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.4/urllib/request.py", line 588, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not found UA
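This code used to work fine. One more piece of information: I do not have root on the machine, and python3 was upgraded from 3.4.5 to 3.4.6. So the cause is either on the web server side or on the Python side. I am neither a Python expert nor a networking expert, so I cannot figure it out on my own.
I hope someone can help me.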
Answer 0 (score: 1)
Thanks to Francisco's comment and that post, I got it working with the following code:
from urllib.request import Request, urlopen
req = Request('https://www.ethz.ch/', headers={'User-Agent': 'Mozilla/5.0'})
connection = urlopen(req)
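I also checked the original version with Python 2.7.13 and urllib2, and it works fine. Apparently Python 3.5 works too (according to Laxmikant's answer), and the code originally worked under 3.4.5. So something changed in the upgrade from 3.4.5 to 3.4.6 that triggers the error.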
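My guess, going only by the "Not found UA" text in the error and by the fact that setting the header fixes it, is that the server rejects the User-Agent that urllib sends by default (a string of the form Python-urllib/x.y). I have not confirmed this with the server side. Below is a minimal sketch of how I would check the default value and wrap the fix; make_request and fetch are hypothetical helper names I made up, not part of urllib.

```python
from urllib.error import HTTPError
from urllib.request import Request, build_opener, urlopen

# The User-Agent urllib sends when none is given, e.g. "Python-urllib/3.4".
# build_opener() only constructs the opener; it does not touch the network.
default_ua = dict(build_opener().addheaders)["User-agent"]


def make_request(url, user_agent="Mozilla/5.0"):
    """Build a Request with an explicit User-Agent header (hypothetical helper)."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent="Mozilla/5.0"):
    """Return the response body as bytes, or None if the server sends an HTTP error."""
    try:
        with urlopen(make_request(url, user_agent)) as connection:
            return connection.read()
    except HTTPError as err:
        # err.code and err.reason hold the status, e.g. 404 and "Not found UA".
        print("HTTP error", err.code, err.reason)
        return None
```

Calling fetch('https://www.ethz.ch/') should then behave like the three-line version above, except that an HTTP error is printed instead of raising a traceback. Printing default_ua shows what the server would otherwise have seen.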
Answer 1 (score: 0)
@Pheidippides Check your full URL for typos. It works for me:
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
$>from urllib.request import urlopen
$>connection = urlopen('https://www.ethz.ch/')
$>connection.read()