Python 3.5 cannot open URL: error (HTTP 403)

Date: 2016-07-30 22:36:50

Tags: python-2.7 python-3.x urllib

I am trying to open and parse a URL in Python 3.5 to collect some comments for my assignment. This is the error I get:

    Traceback (most recent call last):
      File "/Users/maryamzolnoori/Dropbox/Dissertation/Programming/Web-Crawl/Askapatient_collect_comments.py", line 12, in <module>
        home_page = urlopen(req).read()
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(*args)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden

I even tested it in Python 2.7, and it failed there as well. The error is:

urllib2.HTTPError: HTTP Error 416: Requested Range Not Satisfiable

1 answer:

Answer 0 (score: 1)

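You are getting 403 Forbidden most likely because the request's User-Agent identifies the client as Python (on Python 3.5, urllib sends Python-urllib/3.5 by default). Try setting the User-Agent header as if you were a browser.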

For example:

from urllib.request import Request, urlopen

url = "http://www.webmd.com/drugs/drugreview-35-Zoloft+oral.aspx?drugid=35&drugname=Zoloft+oral&conditionFilter=-500"

# Send a browser-like User-Agent instead of urllib's default one.
req = Request(
    url,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)

# Fetch the page and decode the response body as UTF-8.
home_page = urlopen(req)
print(home_page.read().decode('utf-8'))

It is also a good idea to use the appropriate encoding when decoding the response.
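One way to pick the encoding is to read the charset the server declares in its Content-Type header and fall back to UTF-8 when none is given. The sketch below is only an illustration under that assumption: `decode_body` and `fetch_text` are hypothetical helper names, and the short `"Mozilla/5.0"` default is a placeholder for a full browser User-Agent string such as the one above.

```python
from typing import Optional
from urllib.request import Request, urlopen


def decode_body(raw: bytes, charset: Optional[str], fallback: str = "utf-8") -> str:
    """Decode raw response bytes with the declared charset, or a fallback."""
    try:
        return raw.decode(charset or fallback)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name, or bytes that do not match it:
        # decode with the fallback and replace undecodable bytes.
        return raw.decode(fallback, errors="replace")


def fetch_text(url: str, user_agent: str = "Mozilla/5.0") -> str:
    """Fetch a URL with a browser-like User-Agent and return decoded text."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req) as response:
        # get_content_charset() reads the charset from the Content-Type
        # header and returns None when the server does not declare one.
        charset = response.headers.get_content_charset()
        return decode_body(response.read(), charset)
```

Calling `print(fetch_text(url))` with the URL from the example above would then print the page text. A server that declares the wrong charset can still produce garbled output, so this is a starting point rather than a guarantee.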