Multipart upload to S3 while writing to the reader

Date: 2018-02-06 21:01:21

Tags: go amazon-s3

I have found some questions similar to mine, but none of them answer my specific question.

I want to upload CSV data to S3. My basic code is along these lines (I have simplified getting the data for brevity; normally it is read from a database):

reader, writer := io.Pipe()

go func() {
    cWriter := csv.NewWriter(writer)

    for _, line := range lines {
        cWriter.Write(line)
    }

    cWriter.Flush()
    writer.Close()
}()

sess := session.New( /* ... */ )
uploader := s3manager.NewUploader(sess)
result, err := uploader.Upload(&s3manager.UploadInput{
    Body: reader,
    //...
})

The way I understand it, this code will wait for the writing to finish and then upload the contents to S3, so I end up with the entire file in memory. Is it possible to chunk the upload (perhaps using S3 multipart upload?) so that, for larger uploads, only part of the data is held in memory at any one time?

1 answer:

Answer 0: (score: 0)

If I am reading the uploader's source code correctly, it does support multipart uploads: https://github.com/aws/aws-sdk-go/blob/master/service/s3/s3manager/upload.go
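The uploader's part size and concurrency are configurable through the functional options accepted by `NewUploader`. A minimal sketch, where the bucket, key, and the 10 MB part size are illustrative values, not taken from the question:

```go
uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
	u.PartSize = 10 * 1024 * 1024 // parts must be at least 5 MB
	u.Concurrency = 2             // number of parts uploaded in parallel
})

result, err := uploader.Upload(&s3manager.UploadInput{
	Bucket: aws.String("my-bucket"),
	Key:    aws.String("data.csv"),
	Body:   reader, // the io.Pipe reader from the question
})
```

Because `Body` is a plain io.Reader here rather than an io.ReadSeeker, the uploader buffers one PartSize chunk at a time (see the `nextReader` excerpt below), which is exactly the memory behavior the question asks for.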

The minimum size of an uploaded part is 5 MB.

// MaxUploadParts is the maximum allowed number of parts in a multi-part upload
// on Amazon S3.
const MaxUploadParts = 10000

// MinUploadPartSize is the minimum allowed part size when uploading a part to
// Amazon S3.
const MinUploadPartSize int64 = 1024 * 1024 * 5

// DefaultUploadPartSize is the default part size to buffer chunks of a
// payload into.
const DefaultUploadPartSize = MinUploadPartSize

u := &Uploader{
    PartSize:          DefaultUploadPartSize,
    MaxUploadParts:    MaxUploadParts,
    .......
}

func (u Uploader) UploadWithContext(ctx aws.Context, input *UploadInput, opts ...func(*Uploader)) (*UploadOutput, error) {
   i := uploader{in: input, cfg: u, ctx: ctx}
   .......

func (u *uploader) nextReader() (io.ReadSeeker, int, error) {
    .............
    switch r := u.in.Body.(type) {
    .........
    default:
        part := make([]byte, u.cfg.PartSize)
        n, err := readFillBuf(r, part)
        u.readerPos += int64(n)

        return bytes.NewReader(part[0:n]), n, err
    }
}