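How do I update the metadata of an existing object in AWS S3 using Python boto3?

Time: 2016-09-20 14:36:52

Tags: python amazon-web-services amazon-s3 boto3

The boto3 documentation does not clearly explain how to update the user metadata of an S3 object that already exists.

4 answers:

Answer 0 (score: 5)

This can be done with the copy_from() method: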

import boto3

s3 = boto3.resource('s3')
s3_object = s3.Object('bucket-name', 'key')
s3_object.metadata.update({'id':'value'})
s3_object.copy_from(CopySource={'Bucket':'bucket-name', 'Key':'key'}, Metadata=s3_object.metadata, MetadataDirective='REPLACE')

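Answer 1 (score: 2)

Similar to this answer, but it preserves the existing metadata and modifies only what is needed. Of the system-defined metadata, this example keeps only ContentType and ContentDisposition. Other system-defined metadata can be preserved in the same way.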

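import boto3

# Placeholder names; replace them with your own bucket and key.
bucket_name = 'bucket-name'
object_name = 'key'

s3 = boto3.client('s3')
response = s3.head_object(Bucket=bucket_name, Key=object_name)
metadata = response['Metadata']
metadata['new_meta_key'] = "new_value"
metadata['existing_meta_key'] = "new_value"

# head_object leaves out headers that were never set on the object,
# so pass along only the system-defined fields that are present.
system_fields = {name: response[name]
                 for name in ('ContentType', 'ContentDisposition')
                 if name in response}

result = s3.copy_object(Bucket=bucket_name, Key=object_name,
                        CopySource={'Bucket': bucket_name,
                                    'Key': object_name},
                        Metadata=metadata,
                        MetadataDirective='REPLACE', TaggingDirective='COPY',
                        **system_fields)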
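
The remark that other system-defined metadata can be preserved in the same way could be sketched as a small helper. This is only an illustration: the names build_copy_args and SYSTEM_FIELDS are hypothetical, and the field list is my own selection of fields that head_object returns and copy_object accepts under the same name, not an exhaustive one.

```python
# Assumed selection of system-defined fields to carry over; extend or trim as needed.
SYSTEM_FIELDS = (
    "CacheControl",
    "ContentDisposition",
    "ContentEncoding",
    "ContentLanguage",
    "ContentType",
    "Expires",
    "WebsiteRedirectLocation",
)


def build_copy_args(head_response, metadata_changes, system_fields=SYSTEM_FIELDS):
    """Build the metadata-related keyword arguments for a self-copy.

    head_response is the dict returned by head_object; metadata_changes
    holds the user metadata keys to add or overwrite.
    """
    # Start from the existing user metadata and apply the changes on top.
    metadata = dict(head_response.get("Metadata", {}))
    metadata.update(metadata_changes)

    args = {"Metadata": metadata, "MetadataDirective": "REPLACE"}
    # Carry over only the system-defined fields that are set on the object.
    for name in system_fields:
        if name in head_response:
            args[name] = head_response[name]
    return args
```

With a real client, the result could be passed as s3.copy_object(Bucket=bucket_name, Key=object_name, CopySource={'Bucket': bucket_name, 'Key': object_name}, **build_copy_args(response, {'new_meta_key': 'new_value'})).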

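Answer 2 (score: 1)

You can do this with copy_from() on the resource (as in this answer), but you can also use the client's copy_object() and specify the same source and destination. The two methods are equivalent and call the same code underneath.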

import boto3
s3 = boto3.client("s3")
src_key = "my-key"
src_bucket = "my-bucket"
s3.copy_object(Key=src_key, Bucket=src_bucket,
               CopySource={"Bucket": src_bucket, "Key": src_key},
               Metadata={"my_new_key": "my_new_val"},
               MetadataDirective="REPLACE")

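"REPLACE" means that the new metadata in the request completely overwrites the source metadata. If you only want to add new key-value pairs, you must first read the original ones and update them.

You can retrieve the original metadata with head_object(Key=src_key, Bucket=src_bucket). If you take this route, you should also pass CopySourceIfMatch=original_etag in the copy_object(...) request, to keep the same atomicity property as the single-call version. If the object is changed between the read and the write, copy_object fails with an HTTP 412 error.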
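
The read-then-write approach described above might look roughly like the sketch below. The function name update_metadata_if_unchanged is hypothetical, and the client is passed in as a parameter so the logic can be tried against a stub. Like any REPLACE copy, it resets system-defined fields such as ContentType unless they are passed explicitly.

```python
def update_metadata_if_unchanged(s3_client, bucket, key, changes):
    """Merge changes into an object's user metadata with a conditional self-copy.

    s3_client is expected to behave like boto3.client("s3").
    """
    # Read the current user metadata and the ETag that identifies this version.
    head = s3_client.head_object(Bucket=bucket, Key=key)
    merged = dict(head["Metadata"])
    merged.update(changes)

    # CopySourceIfMatch makes S3 reject the copy (HTTP 412) if the object
    # was modified after the head_object call above.
    return s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        CopySourceIfMatch=head["ETag"],
        Metadata=merged,
        MetadataDirective="REPLACE",
    )
```

It could be called as update_metadata_if_unchanged(boto3.client("s3"), "my-bucket", "my-key", {"my_new_key": "my_new_val"}); on a 412 the caller can simply re-read and retry.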

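Reference: boto3 issue 389

Answer 3 (score: 0)

You can update metadata either by adding new entries or by overwriting the current metadata values with new ones. Here is the snippet I am using: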

from boto3 import client

# Placeholders: replace with your own credentials, endpoint URL and bucket name
param_1= 'YOUR_ACCESS_KEY'
param_2= 'YOUR_SECRET_KEY'
param_3= 'YOUR_END_POINT'
param_4= 'YOUR_BUCKET'

#Create the S3 client
s3ressource = client(
    service_name='s3', 
    endpoint_url= param_3,
    aws_access_key_id= param_1,
    aws_secret_access_key=param_2,
    use_ssl=True,
    )
# Build the list of image objects in a bucket
def BuildObjectListPerBucket (variablebucket):
    global listofObjectstobeanalyzed
    listofObjectstobeanalyzed = []
    extensions = ['.jpg','.png']
    # "Contents" is missing when the bucket is empty, so fall back to an empty list
    for key in s3ressource.list_objects(Bucket=variablebucket).get("Contents", []):
        onemoreObject=key['Key']
        if onemoreObject.endswith(tuple(extensions)):
            listofObjectstobeanalyzed.append(onemoreObject)
        else :
            # WARNING: every object that is not a .jpg or .png is deleted from the bucket
            s3ressource.delete_object(Bucket=variablebucket,Key=onemoreObject)
    return listofObjectstobeanalyzed

# upload the local file named objectname and attach user metadata to the new object
def createmetdata(bucketname,objectname):
    s3ressource.upload_file(objectname, bucketname, objectname,
                            ExtraArgs={"Metadata": {"metadata1":"ImageName",
                                                    "metadata2":"ImagePROPERTIES",
                                                    "metadata3":"ImageCREATIONDATE"}})

# for a given existing object, add new metadata
def ADDmetadata(bucketname,objectname):
    # head_object returns the metadata without downloading the object body
    k = s3ressource.head_object(Bucket = bucketname, Key = objectname)
    m = k["Metadata"]
    m["new_metadata"] = "ImageNEWMETADATA"
    # copy the object onto itself; REPLACE stores m as the complete new user metadata
    s3ressource.copy_object(Bucket = bucketname, Key = objectname, CopySource = {'Bucket': bucketname, 'Key': objectname}, Metadata = m, MetadataDirective='REPLACE')

# for a given existing object, update a metadata entry with a new value
def CHANGEmetadata(bucketname,objectname):
    # head_object returns the metadata without downloading the object body
    k = s3ressource.head_object(Bucket = bucketname, Key = objectname)
    m = k["Metadata"]
    m.update({'watson_visual_rec_dic':'ImageCREATIONDATEEEEEEEEEEEEEEEEEEEEEEEEEE'})
    # copy the object onto itself; REPLACE stores m as the complete new user metadata
    s3ressource.copy_object(Bucket = bucketname, Key = objectname, CopySource = {'Bucket': bucketname, 'Key': objectname}, Metadata = m, MetadataDirective='REPLACE')

# for a given existing object, print its user metadata
def readmetadata (bucketname,objectname):
    ALLDATAOFOBJECT = s3ressource.head_object(Bucket=bucketname, Key=objectname)
    ALLDATAOFOBJECTMETADATA=ALLDATAOFOBJECT['Metadata']
    print(ALLDATAOFOBJECTMETADATA)



# create the list of object on a per bucket basis
BuildObjectListPerBucket (param_4)

# Call functions to see the results 
for objectitem in listofObjectstobeanalyzed:
    # CALL The function you want 
    readmetadata(param_4,objectitem)
    ADDmetadata(param_4,objectitem)
    readmetadata(param_4,objectitem)
    CHANGEmetadata(param_4,objectitem)
    readmetadata(param_4,objectitem)
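
A single list_objects call returns at most 1,000 keys, so on a larger bucket the listing step above would see only part of the objects. A paginated, non-destructive variant might look like the sketch below. The function name list_keys_with_extensions is hypothetical, the client is passed in as a parameter, and unlike the snippet above it deletes nothing.

```python
def list_keys_with_extensions(s3_client, bucket, extensions=(".jpg", ".png")):
    """Return every key in bucket that ends with one of extensions.

    s3_client is expected to behave like boto3.client("s3").
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    matching = []
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent from a page that holds no objects.
        for entry in page.get("Contents", []):
            if entry["Key"].endswith(tuple(extensions)):
                matching.append(entry["Key"])
    return matching
```

The returned list can then be fed to the same loop, for example for objectitem in list_keys_with_extensions(s3ressource, param_4).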