How do I move files between two Amazon S3 buckets using boto?

Asked: 2015-05-11 07:17:31

Tags: python amazon-s3 boto

I have to move files between two buckets using the Python Boto API. (I need it to "cut" the file from the first bucket and "paste" it into the second one.) What is the best way to do that?

**Note:** Does it matter if I have two different ACCESS KEYS and SECRET KEYS?

6 Answers:

Answer 0 (score: 21)

I think the boto S3 documentation answers your question:

https://github.com/boto/boto/blob/develop/docs/source/s3_tut.rst

Moving a file from one bucket to another via boto is really a copy of the key from source to destination, followed by deleting the key from the source.

You can get hold of the buckets:

import boto

c = boto.connect_s3()
src = c.get_bucket('my_source_bucket')
dst = c.get_bucket('my_destination_bucket')

and iterate over the keys:

for k in src.list():
    # copy the key across; the arguments must be plain strings
    dst.copy_key(k.key, src.name, k.key)
    # then delete the source key
    k.delete()

See also: Is it possible to copy all files from one S3 bucket to another with s3cmd?

Answer 1 (score: 13)

If you are using boto3 (the newer boto version), this is quite simple:

import boto3

s3 = boto3.resource('s3')
copy_source = {
    'Bucket': 'mybucket',
    'Key': 'mykey'
}
# managed copy: handles multipart transparently for large objects
s3.meta.client.copy(copy_source, 'otherbucket', 'otherkey')

Docs
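
Since the question asks for a move rather than a copy, here is a minimal sketch that extends the same snippet with a delete of the source object once the copy has succeeded (bucket and key names are placeholders):

import boto3

s3 = boto3.resource('s3')
copy_source = {'Bucket': 'mybucket', 'Key': 'mykey'}

# copy first, then delete the source to emulate a "move"
s3.meta.client.copy(copy_source, 'otherbucket', 'otherkey')
s3.meta.client.delete_object(Bucket='mybucket', Key='mykey')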

Answer 2 (score: 5)

For me, awscli was about 30 times faster than boto at copying and deleting each key, probably thanks to the multithreading in awscli. If you still want to run it from a Python script without invoking shell commands from it, you can try something like this:

Install the awscli python package:

sudo pip install awscli

And then it is as simple as:

import os

# awscli inherits the shell locale; some environments need a full UTF-8 locale
if os.environ.get('LC_CTYPE', '') == 'UTF-8':
    os.environ['LC_CTYPE'] = 'en_US.UTF-8'

from awscli.clidriver import create_clidriver

driver = create_clidriver()
driver.main('s3 mv s3://source_bucket s3://target_bucket --recursive'.split())
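
One caveat worth knowing: create_clidriver().main() returns the CLI exit code instead of raising on failure (this matches how the aws entry point behaves, but verify against your awscli version), so a script can check it:

rc = driver.main('s3 mv s3://source_bucket s3://target_bucket --recursive'.split())
if rc != 0:
    raise RuntimeError('aws s3 mv exited with code {}'.format(rc))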

Answer 3 (score: 2)

The names passed to copy_key must be plain strings, not bucket or key objects. The following variation worked for me:

for k in src.list():
    dst.copy_key(k.key, src.name, k.key)

Answer 4 (score: 0)

If you have two different buckets with different access credentials, store the credentials accordingly in the credentials and config files under the ~/.aws folder.

You can then copy an object from the bucket with one set of credentials and save it into the bucket with the other set of credentials, as in the sketch below:

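A minimal sketch of that approach, assuming two named profiles src and dst in ~/.aws/credentials (the profile, bucket, and key names are placeholders): read the object with the source account's client and upload it with the destination account's client, so the bytes pass through the machine running the script rather than through a server-side copy.

import boto3

# each Session picks up its own credentials from ~/.aws/credentials
src_s3 = boto3.Session(profile_name='src').client('s3')
dst_s3 = boto3.Session(profile_name='dst').client('s3')

# download with the source credentials...
obj = src_s3.get_object(Bucket='source-bucket', Key='path/to/file')

# ...and upload with the destination credentials
# (reads the whole object into memory; fine for small files)
dst_s3.put_object(Bucket='dest-bucket', Key='path/to/file', Body=obj['Body'].read())

# delete from the source to complete the "move"
src_s3.delete_object(Bucket='source-bucket', Key='path/to/file')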

This works even though neither bucket grants the other account access through ACLs or bucket policies.

Answer 5 (score: 0)

Here is the code I used to move files between subfolders of an S3 bucket:

# =============================================================================
# CODE TO MOVE FILES within subfolders in S3 BUCKET
# =============================================================================

from boto3.session import Session

ACCESS_KEY = 'a_key'
SECRET_KEY = 's_key'
session = Session(aws_access_key_id=ACCESS_KEY,
                  aws_secret_access_key=SECRET_KEY)
s3 = session.resource('s3')      # S3 as a resource, used below for copy/delete
s3client = session.client('s3')  # S3 as a client, used below for listing

resp_dw = s3client.list_objects(Bucket='main_bucket', Prefix='sub_folder/', Delimiter="/")

# all keys under the prefix; [1:] skips the folder placeholder entry itself
# (list_objects returns at most 1000 keys per call, hence the re-listing loop)
forms2_dw = [x['Key'] for x in resp_dw.get('Contents', [])[1:]]
reload_no = 0
while len(forms2_dw) != 0:
    total_files = len(forms2_dw)
    for i in range(total_files):
        # put your own logic here for the destination folder name
        foldername = resp_dw['Contents'][1:][i]['LastModified'].strftime('%Y%m%d')
        my_bcket = 'main_bucket'

        my_file_old = resp_dw['Contents'][1:][i]['Key']  # key to be copied
        zip_filename = my_file_old.split('/')[-1]
        my_file_new = 'new_sub_folder/' + foldername + "/" + zip_filename  # destination key

        print(str(reload_no) + ':::  copying from====:' + my_file_old + ' to :=====' + my_file_new)

        # note: only .zip files are moved; if other files sit under the
        # prefix, they are left behind and this loop will never terminate
        if zip_filename[-4:] == '.zip':
            s3.Object(my_bcket, my_file_new).copy_from(CopySource=my_bcket + '/' + my_file_old)
            s3.Object(my_bcket, my_file_old).delete()

            print(str(i) + ' files moved of ' + str(total_files))

    # re-list: more keys may remain beyond the 1000-key page limit
    resp_dw = s3client.list_objects(Bucket='main_bucket', Prefix='sub_folder/', Delimiter="/")
    forms2_dw = [x['Key'] for x in resp_dw.get('Contents', [])[1:]]
    reload_no += 1
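
The re-listing loop above exists because list_objects returns at most 1000 keys per call. A paginator (a sketch, not the original author's code; bucket and prefix names are the same placeholders) walks all pages in one pass instead:

paginator = s3client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='main_bucket', Prefix='sub_folder/'):
    for obj in page.get('Contents', []):
        print(obj['Key'])  # apply the same copy-and-delete logic per key here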