urllib错误:无法获取html数据。 Python,Beagle Bone Black

时间:2013-11-12 06:31:21

标签: python urllib beagleboneblack

我在mac上制作我的项目,我试图通过Beagle Bone Black(BBB)做同样的事情。 但是,我无法在BBB中使用urllib,所以我陷入了困境:我无法继续前进。(它在我的mac中运行良好)

我尝试了这个简单的代码作为例子:

import urllib
conn = urllib.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')

然后发生此错误:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 351, in open_http
    'got a bad status line', None)
IOError: ('http protocol error', 0, 'got a bad status line', None)

我需要为我的项目获取一个html数据。 我怎么解决这个问题?你有什么想法 ? 谢谢。

当我尝试urllib2时 我明白了:

>>> import urllib2
>>> conn = urllib2.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1180, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''

我也尝试过这个:

curl http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content
curl: (52) Empty reply from server

和此:

wget http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content
Connecting to stackoverflow.com (198.252.206.16:80)
wget: error getting response

但他们没有工作

在家里,我也尝试过但失败但却返回了一个不同的错误:

conn = urllib2.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno -2] Name or service not known>

环境

BBB: Linux beaglebone 3.8.13 #1 SMP Tue Jun 18 02:11:09 EDT 2013 armv7l GNU/Linux
python version: 2.7.3

1 个答案:

答案 0 :(得分:-1)

我真的很想推荐你requests lib:

>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'

http://www.python-requests.org/en/latest/

如何安装:

sudo pip install requests