Question

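I am working on an AWS Lambda function written in Python 2.7.x that downloads an image file, saves it to /tmp, and then uploads it back to a bucket.

In the original bucket, my image metadata includes HTTP headers such as Content-Type = image/jpeg.

After saving my image with PIL, all the headers are gone and I am left with Content-Type = binary/octet-stream.

As far as I can tell, image.save is losing the headers because of the way PIL works. How can I preserve the metadata, or at least apply it to the newly saved image?

I have seen posts suggesting that this metadata is EXIF, but I tried reading the EXIF information from the original file and applying it to the saved file without any luck. In any case, I am not sure it is EXIF data at all.

Part of the code, to give an idea of what I am doing: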

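import os
import urllib
import uuid

import boto3
from PIL import Image

s3_client = boto3.client('s3')

def resize_image(image_path, upload_path):
    # Re-save the image with PIL; this writes image data only, not S3 headers.
    with Image.open(image_path) as image:
        image.save(upload_path, optimize=True)

def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Use the current record rather than always reading Records[0].
        key = urllib.unquote_plus(record['s3']['object']['key'].encode("utf8"))
        # file_name was not defined in the posted snippet; assumed here to be
        # the base name of the key.
        file_name = os.path.basename(key)

        download_path = '/tmp/{}{}'.format(uuid.uuid4(), file_name)
        upload_path = '/tmp/resized-{}'.format(file_name)

        s3_client.download_file(bucket, key, download_path)

        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, '{}resized'.format(bucket), key)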

Thanks to Sergey, I switched to get_object, but the response is missing the metadata:

response = s3_client.get_object(Bucket=bucket,Key=key)

response = {u'Body': , u'AcceptRanges': 'bytes', u'ContentType': 'image/jpeg', 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'RequestId': '1A94D7F01914A787', 'HTTPHeaders': {'content-length': '84053', 'x-amz-id-2': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:21:56 GMT', 'x-amz-request-id': '1A94D7F01914A787', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000, public', 'date': 'Fri, 23 Dec 2016 15:21:58 GMT', 'content-type': 'image/jpeg'}}, u'LastModified': datetime.datetime(2016, 12, 23, 15, 21, 56, tzinfo=tzutc()), u'ContentLength': 84053, u'Expires': datetime.datetime(2034, 1, 1, 0, 0, tzinfo=tzutc()), u'ETag': '"9ba59e5457da0dc40357f2b53715619d"', u'CacheControl': 'max-age=2592000, public', u'Metadata': {}}

If I use: metadata = response['ResponseMetadata']['HTTPHeaders']

metadata = {'content-length': '84053', 'x-amz-id-2': 'f5UAhWzx7lulo3cMVV8hdVRbHnhdnjHWRDl+LDFkYm9pubjL0A01L5yWjgDjWRE4TjRnjqDeA0U=', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:47:09 GMT', 'x-amz-request-id': '4C69DF8A58EF3380', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000, public', 'date': 'Fri, 23 Dec 2016 15:47:10 GMT', 'content-type': 'image/jpeg'}

and then save with put_object:

s3_client.put_object(Bucket=bucket+'resized',Key=key, Metadata=metadata, Body=downloadfile)

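this creates a lot of extra metadata in S3. In particular, the content type is not saved as image/jpeg but as binary/octet-stream, and a user-defined metadata entry x-amz-meta-content-type = image/jpeg is created instead.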

Answer 1

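You are confusing the S3 metadata, which AWS S3 stores alongside the object, with the EXIF metadata stored inside the file itself.

download_file() does not fetch the object's attributes from S3. You should use get_object() instead: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.get_object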

You can then upload the new file with the same attributes using put_object(): https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.put_object
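As a rough illustration of carrying the attributes over, the sketch below picks the system-defined header fields out of a get_object() response and turns them into keyword arguments for put_object(). The helper name build_put_kwargs, the CARRIED_FIELDS tuple, and the sample dictionary are hypothetical and only for illustration. The field names are ones that appear in the get_object() response and are also accepted by put_object().

```python
# Sketch: copy system-defined headers from a get_object() response into
# keyword arguments for put_object(). The helper name is hypothetical.

# Response fields that put_object() accepts under the same name.
CARRIED_FIELDS = (
    "ContentType",
    "CacheControl",
    "ContentDisposition",
    "ContentEncoding",
    "ContentLanguage",
    "Expires",
)


def build_put_kwargs(response):
    """Return put_object() kwargs mirroring the headers of a get_object() response."""
    kwargs = {}
    for field in CARRIED_FIELDS:
        if field in response:
            kwargs[field] = response[field]
    # User-defined metadata (stored as x-amz-meta-*) is separate from the
    # system-defined headers above; pass it along only when there is some.
    if response.get("Metadata"):
        kwargs["Metadata"] = response["Metadata"]
    return kwargs


# Hand-written dict shaped like the response shown in the question.
sample_response = {
    "ContentType": "image/jpeg",
    "CacheControl": "max-age=2592000, public",
    "ContentLength": 84053,
    "Metadata": {},
}
put_kwargs = build_put_kwargs(sample_response)
print(put_kwargs)

# With boto3 (not executed here) this might be used roughly as:
#   response = s3_client.get_object(Bucket=bucket, Key=key)
#   s3_client.put_object(Bucket=target_bucket, Key=key, Body=data,
#                        **build_put_kwargs(response))
```

Passing the raw HTTPHeaders dictionary as Metadata, as tried in the question, stores every header as user-defined x-amz-meta-* metadata, which is why the content type ended up under x-amz-meta-content-type instead of Content-Type.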

Answer 2

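The content type information is not in the file you upload; it has to be guessed or extracted somehow. This is something you must do manually or with a tool. With a fairly small dictionary, you can guess most file types.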
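One possible way to do the guessing, swapped in here for a hand-written dictionary, is Python's standard mimetypes module, which maps file extensions to content types. This is only a sketch: the function name guess_content_type and the fallback value are assumptions made for illustration.

```python
import mimetypes


def guess_content_type(key, default="binary/octet-stream"):
    """Guess a Content-Type from the extension of an object key or file name."""
    content_type, _encoding = mimetypes.guess_type(key)
    # guess_type returns None for the type when the extension is unknown.
    return content_type or default


print(guess_content_type("photos/cat.jpg"))
print(guess_content_type("photos/file-without-extension"))
```

The returned string could then be supplied as the ContentType when uploading the object.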

When uploading a file or object, you can specify its content type. Otherwise, S3 defaults to binary/octet-stream.

For example, using the boto3 Python package:

s3client.upload_file(
    Filename=local_path,
    Bucket=bucket,
    Key=remote_path,
    ExtraArgs={
        "ContentType": "image/jpeg"
    }
)

AWS S3 image save loses metadata
