Downloading an image using PIL and requests

Time: 2016-06-10 15:24:09

Tags: python python-2.7 python-3.x python-requests

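I'm trying to download an original image (PNG format) from a URL, convert it on the fly (without saving it to disk), and save it as a JPG.

The code is as follows: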

import os
import io
import requests
from PIL import Image
...
r = requests.get(img_url, stream=True)
if r.status_code == 200:
    i = Image.open(io.BytesIO(r.content))
    i.save(os.path.join(out_dir, 'image.jpg'), quality=85)

It works, but when I try to monitor the download progress with r.iter_content() (for a future progress bar), like this:

r = requests.get(img_url, stream=True)
if r.status_code == 200:
    for chunk in r.iter_content():
        print(len(chunk))
    i = Image.open(io.BytesIO(r.content))
    i.save(os.path.join(out_dir, 'image.jpg'), quality=85)

I get this error:

Traceback (most recent call last):
  File "E:/GitHub/geoportal/quicklookScrape/temp.py", line 37, in <module>
    i = Image.open(io.BytesIO(r.content))
  File "C:\Python35\lib\site-packages\requests\models.py", line 736, in content
    'The content for this response was already consumed')
RuntimeError: The content for this response was already consumed

So is it possible to monitor the download progress and still get the data afterwards?

1 answer:

Answer 0: (score: 3)

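When you use r.iter_content(), you need to buffer the result somewhere, because iterating consumes the stream and r.content is no longer available afterwards. Unfortunately, I couldn't find any example where the content is appended to an in-memory object; usually iter_content is used when a file cannot or should not be loaded fully into memory at once. However, you can buffer it with tempfile.SpooledTemporaryFile, as described in this answer: https://stackoverflow.com/a/18550652/4527093. This avoids saving the image to disk (unless the image is larger than the specified max_size). You can then create the Image from the tempfile: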

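import os
import io
import requests
from PIL import Image
import tempfile

# Held in memory until it grows past max_size bytes, then spilled to disk.
buffer = tempfile.SpooledTemporaryFile(max_size=10**9)
r = requests.get(img_url, stream=True)
if r.status_code == 200:
    downloaded = 0
    # Assumes the server sends a Content-Length header.
    filesize = int(r.headers['content-length'])
    # iter_content() defaults to 1-byte chunks, so request larger ones.
    for chunk in r.iter_content(chunk_size=8192):
        downloaded += len(chunk)
        buffer.write(chunk)
        # float() keeps this a true division on Python 2 as well.
        print(downloaded / float(filesize))
    buffer.seek(0)
    i = Image.open(io.BytesIO(buffer.read()))
    i.save(os.path.join(out_dir, 'image.jpg'), quality=85)
buffer.close()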
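
For images that comfortably fit in memory, the same buffering idea can probably be sketched with a plain io.BytesIO instead of a temporary file. This is only a minimal sketch, not a tested replacement for the code above: the helper name buffer_with_progress and its report and total_size parameters are hypothetical, and it takes any iterable of byte chunks so it can be tried without a network connection.

```python
import io


def buffer_with_progress(chunks, total_size=None, report=print):
    """Collect byte chunks into an in-memory buffer, reporting progress.

    chunks     -- any iterable of bytes objects (hypothetical input; for a
                  real download this would be r.iter_content(chunk_size=8192))
    total_size -- expected size in bytes, or None if it is unknown
    report     -- callback receiving the fraction done, or the byte count
                  when total_size is unknown
    """
    buffer = io.BytesIO()
    downloaded = 0
    for chunk in chunks:
        buffer.write(chunk)
        downloaded += len(chunk)
        if total_size:
            report(downloaded / float(total_size))
        else:
            report(downloaded)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


# Possible usage with the names from the question (img_url, out_dir):
#   r = requests.get(img_url, stream=True)
#   size = r.headers.get('content-length')  # may be missing
#   buf = buffer_with_progress(r.iter_content(chunk_size=8192),
#                              int(size) if size else None)
#   Image.open(buf).save(os.path.join(out_dir, 'image.jpg'), quality=85)
```

Reading the Content-Length header with .get() lets the sketch fall back to reporting raw byte counts when the server does not send it. Image.open accepts any seekable file-like object, so the buffer can be passed to it directly.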