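The following code for retrieving an image from a URL fails. For some reason it raises a KeyboardInterrupt (???), which kills my script even when I wrap the call in try/except.
My question is: why does it fail when the URL exists?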
>>> import urlgrabber
>>> urlgrabber.urlgrab('http://upload.wikimedia.org/wikipedia/en/thumb/e/e0/Passion_Flower.JPG/220px-Passion_Flower.JPG', filename='/home/eran/a.tmp', timeout = 2, retry = 2, reget = 'simple')
This produces the following traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1098, in _hdr_retrieve
    self.size = int(length)
ValueError: invalid literal for int() with base 10: 'Age, Content-Length, Date, X-Cache, X-Varnish\r\n'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1098, in _hdr_retrieve
    self.size = int(length)
ValueError: invalid literal for int() with base 10: 'Age, Content-Length, Date, X-Cache, X-Varnish\r\n'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 612, in urlgrab
    return default_grabber.urlgrab(url, filename, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 976, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 880, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 962, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1056, in __init__
    self._do_open()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1308, in _do_open
    self._do_grab()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1438, in _do_grab
    self._do_perform()
  File "/usr/local/lib/python2.7/dist-packages/urlgrabber/grabber.py", line 1244, in _do_perform
    raise KeyboardInterrupt
KeyboardInterrupt
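One plausible reading of the trace, not confirmed anywhere in this thread: the header callback appears to treat any response line that mentions "content-length" as the Content-Length header, so a header whose value merely lists that name fails in int(). The library then seems to report the aborted transfer as KeyboardInterrupt, which derives from BaseException and therefore slips past `except Exception`. The sketch below only illustrates the parsing mistake. Both function names are hypothetical, and the header name `Access-Control-Expose-Headers` is an assumption; only its value is taken from the ValueError above.

```python
def naive_content_length(header_line):
    """Hypothetical parser: matches 'content-length' anywhere in the line."""
    if 'content-length' in header_line.lower():
        # Takes whatever follows the first colon, even if this is another header.
        return int(header_line.split(':', 1)[1])
    return None


def strict_content_length(header_line):
    """Hypothetical fix: compare the header *name* only."""
    name, sep, value = header_line.partition(':')
    if sep and name.strip().lower() == 'content-length':
        return int(value.strip())
    return None


# Assumed header name; the value is the string shown in the ValueError.
expose = ('Access-Control-Expose-Headers: '
          'Age, Content-Length, Date, X-Cache, X-Varnish\r\n')
real = 'Content-Length: 12345\r\n'

print(strict_content_length(real))    # parses the genuine header
print(strict_content_length(expose))  # ignores the look-alike header

try:
    naive_content_length(expose)
except ValueError as exc:
    print('naive parser failed:', exc)
```

If this reading is right, the URL is fine and the failure is in header handling, so upgrading urlgrabber or switching libraries would be the things to try.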
Answer 0 (score: 1)
Why not use requests? I think it is simpler and does what you want. You can install it with:
pip install requests
and the code is:
>>> import requests
>>> r = requests.get('http://upload.wikimedia.org/wikipedia/en/thumb/e/e0/Passion_Flower.JPG/220px-Passion_Flower.JPG')
>>> if r.status_code == 200:
...     with open('/tmp/flower.jpg', 'wb') as f:  # binary mode, since this is image data
...         f.write(r.content)
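For use in a script rather than at the prompt, the same idea can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the name `save_image` and its `session` parameter are hypothetical, and the 2-second timeout simply mirrors the value from the question. It uses `raise_for_status()` instead of checking for status 200 by hand, so HTTP errors surface as ordinary exceptions.

```python
def save_image(url, path, timeout=2, session=None):
    """Download `url` to `path` and return the number of bytes written.

    `session` is a hypothetical hook: anything with a requests-style
    get(url, timeout=...) method. By default a requests.Session is used.
    """
    if session is None:
        # Third-party dependency (pip install requests); imported here so
        # the helper can also be driven by a stand-in session object.
        import requests
        session = requests.Session()

    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raises on 4xx/5xx responses

    # Binary mode keeps the image bytes intact on every platform.
    with open(path, 'wb') as fh:
        fh.write(response.content)
    return len(response.content)
```

A call would look like `save_image('http://upload.wikimedia.org/wikipedia/en/thumb/e/e0/Passion_Flower.JPG/220px-Passion_Flower.JPG', '/tmp/flower.jpg')`. Failures in requests derive from `requests.exceptions.RequestException`, so an ordinary try/except around the call can handle them.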