urlgrabber fails with an error

Time: 2014-07-14 17:15:15

Tags: python

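The following code, which retrieves an image from a URL, fails. For some reason it raises a KeyboardInterrupt (???), and that exception kills my script even when I wrap the call in try/except.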

The question is: why does it fail when the URL exists?

>>> import urlgrabber
>>> urlgrabber.urlgrab('http://upload.wikimedia.org/wikipedia/en/thumb/e/e0/Passion_Flower.JPG/220px-Passion_Flower.JPG', filename='/home/eran/a.tmp', timeout = 2, retry = 2, reget = 'simple')

This produces the following traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1098, in _hdr_retrieve
    self.size = int(length)
ValueError: invalid literal for int() with base 10: 'Age, Content-Length, Date, X-Cache, X-Varnish\r\n'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1098, in _hdr_retrieve
    self.size = int(length)
ValueError: invalid literal for int() with base 10: 'Age, Content-Length, Date, X-Cache, X-Varnish\r\n'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 612, in urlgrab
    return default_grabber.urlgrab(url, filename, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 976, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 880, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 962, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1056, in __init__
    self._do_open()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1308, in _do_open
    self._do_grab()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1438, in _do_grab
    self._do_perform()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1244, in _do_perform
    raise KeyboardInterrupt
KeyboardInterrupt
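
Two things in this traceback are worth separating. The ValueError shows that `int()` was handed a comma-separated list of header names instead of a number, and the final `KeyboardInterrupt` is raised by urlgrabber's own code (grabber.py, line 1244), not by a key press. Since `KeyboardInterrupt` derives from `BaseException` rather than `Exception`, an `except Exception:` handler does not catch it, which may be why the try/except did not help. The sketch below illustrates both points in isolation. `parse_length`, `fetch`, `guarded_fetch`, and the header name used in the comments are hypothetical stand-ins, not urlgrabber's actual code.

```python
def parse_length(header_line):
    """Naive parser: treat any line that mentions content-length as the size."""
    if "content-length" in header_line.lower():
        # Fails when "Content-Length" appears only inside another header's value,
        # e.g. (assumed header name):
        # "Access-Control-Expose-Headers: Age, Content-Length, Date\r\n"
        return int(header_line.split(":", 1)[1])
    return None


def fetch():
    """Stand-in for a library call that raises KeyboardInterrupt by itself."""
    raise KeyboardInterrupt


def guarded_fetch():
    """Show which handler actually sees the exception."""
    try:
        try:
            fetch()
        except Exception:
            # Not reached: KeyboardInterrupt is not a subclass of Exception.
            return "caught as Exception"
    except KeyboardInterrupt:
        return "caught as KeyboardInterrupt"
```

Catching `KeyboardInterrupt` explicitly around the urlgrabber call would keep the script alive, but it would also swallow a real Ctrl-C during the download, so it is a workaround rather than a fix.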

1 answer:

Answer 0 (score: 1)

Why not use requests? I think it is simpler and does what you want. You can install it with:

pip install requests

and the code is:

>>> import requests
>>> r = requests.get('http://upload.wikimedia.org/wikipedia/en/thumb/e/e0/Passion_Flower.JPG/220px-Passion_Flower.JPG')
>>> if r.status_code == 200:
...     # r.content is bytes, so the file must be opened in binary mode
...     with open('/tmp/flower.jpg', 'wb') as f:
...         f.write(r.content)
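
The original urlgrabber call also passed `timeout = 2` and `retry = 2`. A rough equivalent with requests might look like the sketch below. The function name `download` and its `attempts` and `session` parameters are illustrative assumptions, not part of requests, and the retry loop is deliberately simplistic (it retries HTTP error statuses as well as connection failures).

```python
import requests


def download(url, dest, timeout=2.0, attempts=2, session=None):
    """Fetch url and write the body to dest; return the number of bytes written.

    `session` is any object with a requests-style get() method; it defaults
    to the requests module itself.
    """
    http = session if session is not None else requests
    last_error = None
    for _ in range(attempts):
        try:
            resp = http.get(url, stream=True, timeout=timeout)
            resp.raise_for_status()  # turn 4xx/5xx responses into exceptions
            written = 0
            with open(dest, "wb") as fh:  # binary mode: image data is bytes
                for chunk in resp.iter_content(chunk_size=8192):
                    fh.write(chunk)
                    written += len(chunk)
            return written
        except requests.RequestException as exc:
            last_error = exc  # remember the failure and try again
    if last_error is None:
        raise ValueError("attempts must be at least 1")
    raise last_error
```

It could be called as `download(url, '/tmp/flower.jpg')`, where `url` is the image address from the question. Streaming the body in chunks avoids holding the whole file in memory, which matters little for a thumbnail but helps for larger downloads.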