Question

I am following a tutorial on pythonforbeginners.com and ran into a problem with this code, which I am running on OS X.

from bs4 import BeautifulSoup
import urllib2
url = "http://www.pythonforbeginners.com"
content = urllib2.urlopen(url).read()
soup = BeautifulSoup(content)
print soup.prettify()

This gives me the error:

Traceback (most recent call last):
  File "/Users/dhruvmullick/CS/Python/Extracting Data/test.py", line 8, in <module>
    content = urllib2.urlopen(url).read()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Answer 1

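A 403 error means that the server is refusing your request.

...a request from a client for a web page or resource, to indicate that the server can be reached and understood the request, but refuses to take any further action.

Try a different domain and you will find that the same code works as expected.

As a workaround, you can send a custom User-Agent header with the request.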
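
A minimal sketch of that workaround follows. The helper names `build_request` and `fetch` and the `USER_AGENT` value are hypothetical, chosen only for illustration; the assumption is that the site rejects the default `Python-urllib/x.y` identifier, and there is no guarantee that any particular string will be accepted. The import fallback is there so the same code runs on Python 2 (as in the question) and Python 3.

```python
try:
    # Python 2, as used in the question
    from urllib2 import Request, urlopen
except ImportError:
    # Python 3 moved these names into urllib.request
    from urllib.request import Request, urlopen

# Hypothetical browser-like identifier; whether a site accepts it is site-dependent.
USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Download the page body, sending the custom User-Agent."""
    response = urlopen(build_request(url))
    try:
        return response.read()
    finally:
        response.close()


# Usage (performs a real network request, so it is left commented out):
# content = fetch("http://www.pythonforbeginners.com")
# soup = BeautifulSoup(content)
```

The returned bytes can be passed to `BeautifulSoup` exactly as `content` is in the question. If the server still answers with 403, it may be filtering on something other than the User-Agent.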
