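How to save an S3 object to a file using boto3

Time: 2015-03-31 21:17:59

Tags: python amazon-web-services boto boto3

I'm trying to do a "hello world" with the new boto3 client for AWS.

My use case is fairly simple: get an object from S3 and save it to a file.

In boto 2.X I would do it like this: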

import boto
key = boto.connect_s3().get_bucket('foo').get_key('foo')
key.get_contents_to_filename('/tmp/foo')

In boto 3, I can't find a clean way to do the same thing, so I'm manually iterating over the "Streaming" object:

import boto3
key = boto3.resource('s3').Object('fooo', 'docker/my-image.tar.gz').get()
# Open in binary mode: the streaming body yields bytes, not text
with open('/tmp/my-image.tar.gz', 'wb') as f:
    chunk = key['Body'].read(1024*8)
    while chunk:
        f.write(chunk)
        chunk = key['Body'].read(1024*8)

or

import boto3
key = boto3.resource('s3').Object('fooo', 'docker/my-image.tar.gz').get()
# Open in binary mode: the streaming body yields bytes, not text
with open('/tmp/my-image.tar.gz', 'wb') as f:
    for chunk in iter(lambda: key['Body'].read(4096), b''):
        f.write(chunk)

It works fine. I was wondering whether there is any "native" boto3 function that does the same task?
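
For reference, the manual loop above can be folded into a small helper. This is only a sketch: it assumes that the response's 'Body' behaves like a file object with a read(size) method (which botocore's StreamingBody provides), and the helper name save_stream_to_file and the default chunk size are made up for illustration rather than part of boto3.

```python
import shutil


def save_stream_to_file(body, path, chunk_size=8 * 1024):
    """Copy a file-like stream to `path` in fixed-size chunks.

    `body` is assumed to expose read(size), as the 'Body' of an S3
    get() response does. Hypothetical helper, not a boto3 API.
    """
    # Binary mode matters: the stream yields bytes, not text.
    with open(path, "wb") as f:
        shutil.copyfileobj(body, f, chunk_size)
```

With the question's variables this would be called as save_stream_to_file(key['Body'], '/tmp/my-image.tar.gz').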

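7 answers:

Answer 0: (score: 170)

A customization that recently went into Boto3 helps with this (among other things). It is currently exposed on the low-level S3 client and can be used like this: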

import boto3

s3_client = boto3.client('s3')

# Create a local file to upload (open it for writing, not reading)
with open('hello.txt', 'w') as f:
    f.write('Hello, world!')

# Upload the file to S3
s3_client.upload_file('hello.txt', 'MyBucket', 'hello-remote.txt')

# Download the file from S3
s3_client.download_file('MyBucket', 'hello-remote.txt', 'hello2.txt')
with open('hello2.txt') as f:
    print(f.read())

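These functions automatically handle reading and writing the files, as well as performing multipart uploads in parallel for large files.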
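
Because a large transfer may be carried out by several worker threads, anything passed as the Callback argument of download_file or upload_file should be safe to call concurrently. The following is a minimal sketch of a progress tracker under that assumption. ProgressTracker and download_with_progress are hypothetical names, and total_bytes is assumed to be known in advance, for instance from the ContentLength field of a head_object response.

```python
import threading


class ProgressTracker:
    """Thread-safe byte counter usable as a transfer Callback."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.seen_bytes = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # Invoked with the number of bytes moved since the previous call;
        # the lock guards against concurrent calls from worker threads.
        with self._lock:
            self.seen_bytes += bytes_amount

    @property
    def fraction_done(self):
        if not self.total_bytes:
            return 1.0
        return self.seen_bytes / self.total_bytes


def download_with_progress(s3_client, bucket, key, filename, total_bytes):
    """Download bucket/key to filename and return the tracker used."""
    tracker = ProgressTracker(total_bytes)
    s3_client.download_file(bucket, key, filename, Callback=tracker)
    return tracker
```

Here s3_client would be the boto3.client('s3') object from the answer above.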

Answer 1: (score: 53)

boto3 now has a nicer interface than the client:

import boto3

# key is the S3 object key; local_filename is the destination path
resource = boto3.resource('s3')
my_bucket = resource.Bucket('MyBucket')
my_bucket.download_file(key, local_filename)

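This by itself isn't tremendously better than the client in the accepted answer (although the docs say that it does a better job of retrying uploads and downloads on failure). But considering that resources are generally more ergonomic (for example, the S3 bucket and object resources are nicer than the client methods), this does allow you to stay at the resource layer without having to drop down.

Resources can generally be created in the same way as clients, and they take all or most of the same arguments and forward them to their internal clients.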

Answer 2: (score: 35)

For those who would like to simulate boto2 methods such as set_contents_from_string, you can try:

import boto3
from cStringIO import StringIO

s3c = boto3.client('s3')
contents = 'My string to save to S3 object'
target_bucket = 'hello-world.by.vor'
target_file = 'data/hello.txt'
fake_handle = StringIO(contents)

# notice if you do fake_handle.read() it reads like a file handle
s3c.put_object(Bucket=target_bucket, Key=target_file, Body=fake_handle.read())

For Python 3:

In Python 3, both StringIO and cStringIO are gone. Import StringIO from io instead, for example:

from io import StringIO

To support both versions:

try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
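
On Python 3 the StringIO detour may not be needed at all, since put_object accepts bytes for Body. The following is a minimal sketch under that assumption. put_string is a hypothetical helper name, and s3_client is assumed to be a boto3.client('s3') object.

```python
def put_string(s3_client, bucket, key, text, encoding="utf-8"):
    """Store `text` as the content of the S3 object bucket/key.

    Hypothetical helper: encodes the string to bytes and hands it to
    the client's put_object call.
    """
    s3_client.put_object(Bucket=bucket, Key=key, Body=text.encode(encoding))
```

With the variables from the answer, the call would be put_string(s3c, target_bucket, target_file, contents).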

Answer 3: (score: 11)

# Preface: the file is JSON with contents: {"name": "Android", "status": "ERROR"}

import io
import json

import boto3

s3 = boto3.resource(
    's3',
    aws_access_key_id='my_access_id',
    aws_secret_access_key='my_secret_key'
)

obj = s3.Object('my-bucket', 'key-to-file.json')
data = io.BytesIO()
obj.download_fileobj(data)

# data now holds the object's bytes; decode and parse them into a dict:
new_dict = json.loads(data.getvalue().decode("utf-8"))

print(new_dict['status'])
# Should print "ERROR"

Answer 4: (score: 1)

Note: I'm assuming you have configured authentication separately. The code below downloads a single object from an S3 bucket.
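
One way such a single-object download could look, sketched on the assumption that credentials are already configured (for example through environment variables or ~/.aws/credentials). The function name download_single_object and its optional s3 parameter are hypothetical and exist only to make the example easy to exercise; the bucket, key and destination are placeholders.

```python
def download_single_object(bucket_name, key, destination, s3=None):
    """Download one object from an S3 bucket to a local file.

    `s3` may be an existing S3 resource; if omitted, a default one is
    created, which relies on authentication being configured already.
    """
    if s3 is None:
        import boto3  # imported here so the helper can be reused with a stub
        s3 = boto3.resource("s3")
    s3.Bucket(bucket_name).download_file(key, destination)
```

A call might look like download_single_object('mybucket', 'hello.txt', '/tmp/hello.txt').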

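Answer 5: (score: 1)

If you want to read a file with a configuration other than the default one, feel free to use mpu.aws.s3_download(s3path, destination) directly, or the copy-pasted code: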

import os

import boto3


def s3_download(source, destination,
                exists_strategy='raise',
                profile_name=None):
    """
    Copy a file from an S3 source to a local destination.

    Parameters
    ----------
    source : str
        Path starting with s3://, e.g. 's3://bucket-name/key/foo.bar'
    destination : str
    exists_strategy : {'raise', 'replace', 'abort'}
        What is done when the destination already exists?
    profile_name : str, optional
        AWS profile

    Raises
    ------
    botocore.exceptions.NoCredentialsError
        Botocore is not able to find your credentials. Either specify
        profile_name or add the environment variables AWS_ACCESS_KEY_ID,
        AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN.
        See https://boto3.readthedocs.io/en/latest/guide/configuration.html
    """
    exists_strategies = ['raise', 'replace', 'abort']
    if exists_strategy not in exists_strategies:
        raise ValueError('exists_strategy \'{}\' is not in {}'
                         .format(exists_strategy, exists_strategies))
    session = boto3.Session(profile_name=profile_name)
    s3 = session.resource('s3')
    bucket_name, key = _s3_path_split(source)
    if os.path.isfile(destination):
        # Compare strings with ==; `is` tests identity and is unreliable here
        if exists_strategy == 'raise':
            raise RuntimeError('File \'{}\' already exists.'
                               .format(destination))
        elif exists_strategy == 'abort':
            return
    s3.Bucket(bucket_name).download_file(key, destination)

Answer 6: (score: 1)

If you want to download a specific version of a file, you need to use get_object:

import boto3

bucket = 'bucketName'
prefix = 'path/to/file/'
filename = 'fileName.ext'

s3c = boto3.client('s3')
s3r = boto3.resource('s3')

if __name__ == '__main__':
    for version in s3r.Bucket(bucket).object_versions.filter(Prefix=prefix + filename):
        file = version.get()
        version_id = file.get('VersionId')
        obj = s3c.get_object(
            Bucket=bucket,
            Key=prefix + filename,
            VersionId=version_id,
        )
        with open(f"{filename}.{version_id}", 'wb') as f:
            for chunk in obj['Body'].iter_chunks(chunk_size=4096):
                f.write(chunk)

Reference: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/response.html