JPG file uploaded to S3 using Lambda is corrupted

Time: 2019-08-30 14:40:55

Tags: python amazon-s3 aws-lambda

I have this simple Python Lambda that downloads a JPG image and uploads it to an S3 bucket.

import contextlib
from io import BytesIO

import boto3
import requests

url = 'https://somesite.com/11/frame.jpg?abs_begin=2019-08-29T05:18:26Z'
s3 = boto3.client('s3')

# bucket_name is assumed to be defined elsewhere in the Lambda
with contextlib.closing(requests.get(url, stream=True, verify=False)) as response:
    fp = BytesIO(response.content)
    s3.upload_fileobj(fp, bucket_name, 'my-dir/' + 'test_img.jpg')

However, when I look in my bucket, the file size is 162 bytes. When I download it to my local disk from the browser GUI, macOS reports: The file "test_img.jpg" could not be opened. It may be damaged or use a file format that Preview doesn't recognise.

Do you know what could be causing this?

1 answer:

Answer 0: (score: 1)

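Are you sure the site is actually giving you a JPEG file? I'd suggest checking response.status_code somehow. I normally just put a raise_for_status() in there and let the calling code handle the exception.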
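A 162-byte object is far smaller than a typical JPEG frame, so it may well be an error message rather than image data. As a rough sketch of the check suggested above, the hypothetical helper jpeg_problem below (not part of requests or boto3) looks at the status code and at the first bytes of the body, relying on the fact that JPEG data starts with the bytes FF D8 FF. The sample bodies are made up for illustration.

```python
def jpeg_problem(status_code, content_type, body):
    """Return None if the download looks like a JPEG, else a short reason."""
    # Anything outside 2xx means the server did not deliver the file
    if not 200 <= status_code < 300:
        return f"HTTP status {status_code}"
    # JPEG data starts with the bytes FF D8 FF
    if not body.startswith(b"\xff\xd8\xff"):
        preview = body[:60]
        return f"not JPEG data ({content_type or 'no Content-Type'}): {preview!r}"
    return None


# Made-up error body served with a 200 status, then a made-up JPEG header
print(jpeg_problem(200, "text/html", b"<html>Access denied</html>"))
print(jpeg_problem(200, "image/jpeg", b"\xff\xd8\xff\xe0" + b"\x00" * 16))
```

With requests, this could be called as jpeg_problem(response.status_code, response.headers.get('Content-Type', ''), response.content) before anything is sent to S3. Logging the returned reason should show what the site really answered.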

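Also, you only need to pass stream=True if you are actually streaming the content. You are reading it all in one go through response.content, so requesting a stream is wasteful. Streaming is the recommended approach, though, because otherwise you need to hold the whole file in memory, which can be wasteful.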
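To make the difference concrete, here is a minimal sketch of what consuming a stream means: the body arrives as a series of chunks and only the current chunk is held in memory. The helper write_chunks is hypothetical, and the tuple fake_chunks stands in for response.iter_content(8192) on a stream=True request so the example runs without a network connection.

```python
import io


def write_chunks(chunks, fileobj):
    """Write an iterable of byte chunks to fileobj and return the byte count."""
    total = 0
    for chunk in chunks:
        # Only the current chunk is held in memory at any time
        fileobj.write(chunk)
        total += len(chunk)
    return total


# Stand-in for response.iter_content(8192) on a stream=True request
fake_chunks = (b"\xff\xd8\xff" + b"\x00" * 5, b"\x01" * 8)
buffer = io.BytesIO()
written = write_chunks(fake_chunks, buffer)
print(written)  # 16
```

The returned byte count is also a cheap sanity check: a value as small as 162 would be a hint that the body is not the expected image.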

If you want to check that you are actually getting an image, you can open it with Pillow before uploading to S3, e.g.:

import tempfile

import boto3
import requests
from PIL import Image  # pip install -U Pillow

s3 = boto3.client('s3')
bucket_name = 'my-bucket'  # placeholder, substitute your own bucket

# dummy image
url = 'https://picsum.photos/id/1053/1500/1000'

# get a temp file in case we get a large image
with tempfile.TemporaryFile() as fd:
    with requests.get(url, stream=True) as response:
        # make sure HTTP request succeeded
        response.raise_for_status()

        for data in response.iter_content(8192):
            fd.write(data)

    # seek back to beginning of file and load to make sure it's OK
    fd.seek(0)
    with Image.open(fd) as img:
        # will raise an exception on failure
        img.verify()
        print(f'got a {img.format} image of size {img.size}')

    # rewind again: Pillow moved the file position while reading,
    # and upload_fileobj reads from the current position
    fd.seek(0)

    # let S3 do its thing
    s3.upload_fileobj(fd, bucket_name, 'my-dir/test_img.jpg')