Writing files and objects to Amazon S3

Date: 2012-09-25 14:44:20

Tags: python amazon-s3

I am using Amazon S3 to distribute dynamically generated files.

On my local server, I can use

destination = open(VIDEO_DIR + newvideo.name, 'wb+')

to store a generated video at the location VIDEO_DIR + newvideo.name.

Is there a feasible way to change VIDEO_DIR to an S3 endpoint, so that dynamically generated videos can be written directly to the S3 server?

Another question: is there a feasible way to write an object directly to S3? For example, given chunklet = Chunklet(), how can this chunklet object be written directly to the S3 server?

I could create a local file first and then use the S3 API. For example,

import mimetypes
from boto.s3.key import Key

mime = mimetypes.guess_type(filename)[0]
k = Key(b)  # b is an existing boto bucket
k.key = filename
k.set_metadata("Content-Type", mime)
k.set_contents_from_filename(filename)
k.set_acl('public-read')

But I would like to make this more efficient. Using Python.
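As an aside, if the object's state can be serialized in memory, boto's Key.set_contents_from_string accepts the resulting string directly, with no local file involved. A minimal sketch of the serialization step (the dict below is a hypothetical stand-in for a Chunklet's state, and JSON is just one possible format):

```python
import json

# Hypothetical stand-in for the state of a Chunklet instance
chunklet_state = {"index": 0, "frames": [1, 2, 3]}

# Serialize to a string in memory; with boto this payload could then be
# uploaded via k.set_contents_from_string(payload) without touching disk
payload = json.dumps(chunklet_state, sort_keys=True)

# Round-trip check: the serialized form restores the original state
restored = json.loads(payload)
```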

2 Answers:

Answer 0 (score: 3)

Use the boto library to access your S3 storage. You still have to write your data to a (temporary) file first before it can be sent, because a stream-writing method has not been implemented yet.

I would use a context manager to work around this limitation:

import tempfile
from contextlib import contextmanager

@contextmanager
def s3upload(key):
    with tempfile.SpooledTemporaryFile(max_size=1024*10) as buffer:  # Size in bytes
        yield buffer  # After this, the file is typically written to
        buffer.seek(0)  # So that reading the file starts from its beginning
        key.set_contents_from_file(buffer)

Use it as a context-managed file object:

k = Key(b)
k.key = filename
k.set_metadata("Content-Type", mime)

with s3upload(k) as out:
    out.write(chunklet)
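The flow can be checked without any S3 access by pointing s3upload at a stand-in key. FakeKey below is hypothetical and only records what a real boto Key would upload via set_contents_from_file:

```python
import tempfile
from contextlib import contextmanager

class FakeKey(object):
    """Hypothetical stand-in for boto.s3.key.Key: records the upload."""
    def __init__(self):
        self.uploaded = None

    def set_contents_from_file(self, fp):
        # A real boto Key would send fp's contents to S3 here
        self.uploaded = fp.read()

@contextmanager
def s3upload(key):
    with tempfile.SpooledTemporaryFile(max_size=1024*10) as buffer:
        yield buffer        # Caller writes into the buffer here
        buffer.seek(0)      # Rewind so the upload reads from the start
        key.set_contents_from_file(buffer)

k = FakeKey()
with s3upload(k) as out:
    out.write(b"video bytes here")
```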

Answer 1 (score: 1)

Martijn's solution is great, but it forces you to use the file inside a context manager (you cannot do out = s3upload(…) and then print >> out, "Hello"). The following solution works in a similar way (in-memory storage up to a certain size), but works both as a context manager and as a regular file (you can do both with S3WriteFile(…) and out = S3WriteFile(…); print >> out, "Hello"; out.close()):

import tempfile
import os

class S3WriteFile(object):
    """
    File-like context manager that can be written to (and read from),
    and which is automatically copied to Amazon S3 upon closing and deletion.
    """

    def __init__(self, item, max_size=10*1024**2):
        """
        item -- boto.s3.key.Key for writing the file (upon closing). The
        name of the object is set to the item's name (key).

        max_size -- maximum size in bytes of the data that will be
        kept in memory when writing to the file. If more data is
        written, it is automatically rolled over to a file.
        """

        self.item = item

        temp_file = tempfile.SpooledTemporaryFile(max_size)

        # It would be useless to set the .name attribute of this
        # object: when it is used as a context manager, the temporary
        # file itself is returned, which has a None name:
        temp_file.name = os.path.join(
            "s3://{}".format(item.bucket.name),
            item.name if item.name is not None else "<???>")

        self.temp_file = temp_file

    def close(self):
        self.temp_file.seek(0)
        self.item.set_contents_from_file(self.temp_file)
        self.temp_file.close()

    def __del__(self):
        """
        Write the file contents to S3.
        """
        # The file may have been closed before being deleted:
        if not self.temp_file.closed:
            self.close()

    def __enter__(self):
        return self.temp_file

    def __exit__(self, *args, **kwargs):
        self.close()
        return False

    def __getattr__(self, name):
        """
        Everything not specific to this class is delegated to the
        temporary file, so that objects of this class behave like a
        file.
        """
        return getattr(self.temp_file, name)

(Implementation note: instead of delegating so many things to self.temp_file so that the resulting class behaves like a file, inheriting from SpooledTemporaryFile would in principle work. However, it is an old-style class, so __new__() is not called, and, as far as I can tell, there is no way to set a non-default in-memory size for the temporary data.)
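The max_size rollover that both answers rely on can be observed directly. Note that _rolled is a CPython implementation detail of SpooledTemporaryFile, inspected here purely for illustration:

```python
import tempfile

buf = tempfile.SpooledTemporaryFile(max_size=100)
buf.write(b"a" * 50)          # 50 bytes: still held in memory
rolled_early = buf._rolled    # False while total size is under max_size
buf.write(b"b" * 100)         # 150 bytes total: exceeds max_size
rolled_late = buf._rolled     # True once spooled out to a real temp file
buf.seek(0)
data = buf.read()             # Contents are unchanged by the rollover
buf.close()
```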