Downloading the most recently uploaded files, or files uploaded today, from an S3 bucket with Python

Posted: 2017-08-11 06:30:05

Tags: python amazon-web-services amazon-s3 boto devops

I am trying to download the most recently uploaded files, or the files that were uploaded today, from an S3 bucket using Python. The code below downloads every file in the bucket; how can I restrict it to just those files?

#!/usr/bin/env python

import os
import errno

import boto
from boto.exception import S3ResponseError

DOWNLOAD_LOCATION_PATH = os.path.expanduser("~") + "/s3-backup/"
if not os.path.exists(DOWNLOAD_LOCATION_PATH):
    print ("Making download directory")
    os.mkdir(DOWNLOAD_LOCATION_PATH)


def backup_s3_folder():
    BUCKET_NAME = "xxxx"
    AWS_ACCESS_KEY_ID = os.getenv("xxxxx")       # read your AWS access key ID from an environment variable
    AWS_ACCESS_SECRET_KEY = os.getenv("xxxxxx")  # read your AWS secret access key from an environment variable
    conn  = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_ACCESS_SECRET_KEY)
    bucket = conn.get_bucket(BUCKET_NAME)

    # go through the list of keys in the bucket
    bucket_list = bucket.list()

    for l in bucket_list:
        key_string = str(l.key)
        s3_path = DOWNLOAD_LOCATION_PATH + key_string
        try:
            print ("Current File is ", s3_path)
            l.get_contents_to_filename(s3_path)
        except (OSError,S3ResponseError) as e:
            pass
            # check if the file has been downloaded locally
            if not os.path.exists(s3_path):
                try:
                    os.makedirs(s3_path)
                except OSError as exc:
                    # let guard againts race conditions
                    import errno
                    if exc.errno != errno.EEXIST:
                        raise
if __name__ == '__main__':
    backup_s3_folder()

1 Answer:

Answer 0 (score: 0):

From what I understand, you only want to retrieve the files that were uploaded "today", where "today" means the same calendar day on which this code is executed, according to the operating system's date and time. You can achieve this by adding a condition inside the for loop that evaluates each element's last_modified attribute, so it would go along these lines:

#!/usr/bin/env python

import datetime
import errno
import os

import boto
import boto.utils
from boto.exception import S3ResponseError

# get today's date at midnight (local time)
date_today = datetime.datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)

DOWNLOAD_LOCATION_PATH = os.path.expanduser("~") + "/s3-backup/"
if not os.path.exists(DOWNLOAD_LOCATION_PATH):
    print ("Making download directory")
    os.mkdir(DOWNLOAD_LOCATION_PATH)
def backup_s3_folder():
    BUCKET_NAME = "xxxx"
    AWS_ACCESS_KEY_ID = os.getenv("xxxxx")       # read your AWS access key ID from an environment variable
    AWS_ACCESS_SECRET_KEY = os.getenv("xxxxxx")  # read your AWS secret access key from an environment variable
    conn  = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_ACCESS_SECRET_KEY)
    bucket = conn.get_bucket(BUCKET_NAME)

    # go through the list of keys in the bucket
    bucket_list = bucket.list()

    for l in bucket_list:
        key_string = str(l.key)
        s3_path = DOWNLOAD_LOCATION_PATH + key_string
        try:
            # only download objects whose last_modified timestamp is later than local midnight
            if boto.utils.parse_ts(l.last_modified) > date_today:
                print("Current file is", s3_path)
                l.get_contents_to_filename(s3_path)
        except (OSError, S3ResponseError):
            # the download failed; create the local directory for this key if it does not exist
            if not os.path.exists(s3_path):
                try:
                    os.makedirs(s3_path)
                except OSError as exc:
                    # guard against race conditions
                    if exc.errno != errno.EEXIST:
                        raise
if __name__ == '__main__':
    backup_s3_folder()

All this does is take today's date at midnight and check whether the S3 object's last_modified attribute is greater than that (meaning the file was uploaded or modified "today").
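As a quick illustration of that comparison (the timestamp string below is a hypothetical last_modified value, not one taken from your bucket):

import datetime
import boto.utils

# hypothetical last_modified string, in the ISO 8601 format S3 returns
ts = "2017-08-11T06:30:05.000Z"

modified = boto.utils.parse_ts(ts)  # -> datetime.datetime(2017, 8, 11, 6, 30, 5)
midnight = datetime.datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)

# True only if the object was modified after local midnight, i.e. "today"
print(modified > midnight)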

Keep in mind that some timezone handling may be required when you retrieve the last_modified attribute: make sure you convert it to the timezone of the system that runs this script!
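A minimal sketch of one way to handle that with just the standard library, assuming (as is the case for parse_ts on S3 timestamps) that the parsed last_modified value is a naive datetime expressed in UTC; the helper name utc_to_local is mine, not part of boto:

import calendar
import datetime

def utc_to_local(utc_dt):
    # interpret the naive datetime as UTC and re-express it in the machine's local time
    timestamp = calendar.timegm(utc_dt.timetuple())
    local_dt = datetime.datetime.fromtimestamp(timestamp)
    return local_dt.replace(microsecond=utc_dt.microsecond)

# the comparison in the loop would then become, for example:
# if utc_to_local(boto.utils.parse_ts(l.last_modified)) > date_today:
#     l.get_contents_to_filename(s3_path)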