Uploading an object to AWS S3 with aws-sdk-go without creating a file

Date: 2017-12-03 18:34:05

Tags: go amazon-s3 aws-sdk-go

I am trying to upload an object to AWS S3 using the Go SDK without creating a file on my system (trying to upload only a string), but I am having a hard time getting it to work. Can anyone give me an example of how to upload to AWS S3 without needing to create a file?

The AWS examples for uploading a file look roughly like this (they read the payload from a file on disk):

package main

import (
    "fmt"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    bucket := os.Args[1]
    filename := os.Args[2]

    // the example reads the payload from a file on disk
    file, err := os.Open(filename)
    if err != nil {
        fmt.Printf("Unable to open file %q, %v", filename, err)
        return
    }
    defer file.Close()

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    if err != nil {
        fmt.Printf("Unable to create session, %v", err)
        return
    }
    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        Body:   file,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
    }
}

I have tried creating the file programmatically, but that creates the file on my system and then uploads it to S3.

4 answers:

Answer 0 (score: 3)

The Body field of the UploadInput struct is just an io.Reader. So pass any io.Reader you want; it does not need to be a file.
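
For example, a minimal sketch (assuming the usual aws, session, s3manager, bytes, and log imports; the bucket, key, and region are placeholders):

sess := session.Must(session.NewSession(&aws.Config{
    Region: aws.String("us-east-1"), // placeholder region
}))
uploader := s3manager.NewUploader(sess)

// any io.Reader works as the Body; here it is a reader over an in-memory byte slice
_, err := uploader.Upload(&s3manager.UploadInput{
    Bucket: aws.String("my-bucket"), // placeholder
    Key:    aws.String("my-key"),    // placeholder
    Body:   bytes.NewReader([]byte("any in-memory payload")),
})
if err != nil {
    log.Fatal(err)
}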

Answer 1 (score: 0)

In this answer I will post everything that worked for me related to this question. Many thanks to @ThunderCat and @Flimzy, who made me realize that the body parameter of the upload request is already an io.Reader. I will post some sample code commenting on what I learned from this question and how it helped me solve the problem. Perhaps it will help others like me and @AlokKumarSingh.

Case 1: You already have the data in memory (for example, you received it from a streaming/messaging service such as Kafka, Kinesis, or SQS)

package main

import (
    "fmt"
    "os"
    "strings"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
        return
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // this is your data that you have in memory
    // in this example it is hard coded, but it may come from very different
    // sources, like streaming services for example.
    data := "Hello, world!"

    // create a reader from the data in memory
    reader := strings.NewReader(data)

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    if err != nil {
        fmt.Printf("Unable to create session, %v", err)
        return
    }
    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        // here you pass your reader
        // the aws sdk will manage all the memory and file reading for you
        Body: reader,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
        return
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}

Case 2: You already have a file on disk and want to upload it, but you don't want to keep the whole file in memory:

package main

import (
    "fmt"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
        return
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // open your file
    // the trick here is that os.Open just returns a reader for the desired
    // file, so you will not keep the whole file in memory.
    // I know this might sound obvious, but for a starter (as I was at the time
    // of the question) it is not.
    fileReader, err := os.Open(filename)
    if err != nil {
        fmt.Printf("Unable to open file %q, %v", filename, err)
        return
    }
    defer fileReader.Close()

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    if err != nil {
        fmt.Printf("Unable to create session, %v", err)
        return
    }
    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        // here you pass your reader
        // the aws sdk will manage all the memory and file reading for you
        Body: fileReader,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
        return
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}

Case 3: This is how I implemented it in the final version of my system, but to understand why, I need to give you some background.

My use case evolved. The upload code would become a function in Lambda, and the files turned out to be huge. What does that mean: if I upload the file through an entry point in API Gateway connected to a Lambda function, I have to wait for the whole file to finish uploading inside Lambda. Since Lambda is priced by invocation duration and memory usage, this could be a really big problem.

So, to solve this problem, I used a pre-signed POST URL for the upload. How does this affect the architecture/workflow?

Instead of uploading to S3 from the backend code, I create and authenticate a URL for POSTing the object to S3 in the backend and send it to the frontend. With that, I implemented a multipart upload to that URL. I know this is much more specific than the question, but it was not easy to find this solution, so I think it is a good idea to document it here for others.

Here is an example of how to create that pre-signed URL in nodejs.

const AWS = require('aws-sdk');

module.exports.upload = async (event, context, callback) => {

  const s3 = new AWS.S3({ signatureVersion: 'v4' });
  const body = JSON.parse(event.body);

  const params = {
    Bucket: process.env.FILES_BUCKET_NAME,
    Fields: {
      key: body.filename,
    },
    Expires: 60 * 60
  }

  let promise = new Promise((resolve, reject) => {
    s3.createPresignedPost(params, (err, data) => {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  })

  return await promise
    .then((data) => {
      return {
        statusCode: 200,
        body: JSON.stringify({
          message: 'Successfully created a pre-signed post url.',
          data: data,
        })
      }
    })
    .catch((err) => {
      return {
        statusCode: 400,
        body: JSON.stringify({
          message: 'An error occurred while trying to create a pre-signed post url',
          error: err,
        })
      }
    });
};

If you want to use Go, the idea is the same; you just need to change the SDK.
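
For reference, a rough Go sketch: as far as I know, the classic aws-sdk-go has no direct counterpart to createPresignedPost, but a pre-signed PUT URL created with Presign serves the same purpose. Bucket, key, region, and expiry below are placeholders:

package main

import (
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func presignedPutURL(bucket, key string) (string, error) {
    sess := session.Must(session.NewSession(&aws.Config{
        Region: aws.String("us-east-1"), // placeholder region
    }))
    svc := s3.New(sess)

    // build the request without sending it, then presign it
    req, _ := svc.PutObjectRequest(&s3.PutObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
    })
    return req.Presign(60 * time.Minute)
}

The frontend can then PUT the file body directly to the returned URL, so the payload never passes through your backend.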

Answer 2 (score: 0)

Here is a small implementation I wrote that takes advantage of pipes and incorporates a timeout.

package example

import (
    "context"
    "fmt"
    "io"
    "sync"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func FileWriter(ctx context.Context, uploader *s3manager.Uploader, wg *sync.WaitGroup, bucket string, key string, timeout time.Duration) (writer *io.PipeWriter) {
    // create a per-file flush timeout
    fileCtx, cancel := context.WithTimeout(ctx, timeout)

    // pipes are open until one end is closed
    pr, pw := io.Pipe()

    wg.Add(1)
    go func() {
        params := &s3manager.UploadInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
            Body:   pr,
        }

        // blocking
        _, err := uploader.Upload(params)
        if err != nil {
            fmt.Printf("Unable to upload, %v. Bucket: %s", err, bucket)
        }

        // always call context cancel functions!
        cancel()
        wg.Done()
    }()

    // when context is cancelled, close the pipe
    go func() {
        <-fileCtx.Done()
        // should check fileCtx.Err() here
        if err := pw.Close(); err != nil {
            fmt.Printf("Unable to close")
        }
    }()

    return pw
}
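
A hedged usage sketch (bucket, key, timeout, and payload are placeholders; the region comes from the environment here, and the session and strings packages are assumed to be imported as well):

func ExampleFileWriter() {
    var wg sync.WaitGroup
    uploader := s3manager.NewUploader(session.Must(session.NewSession()))

    w := FileWriter(context.Background(), uploader, &wg, "my-bucket", "my-key", 30*time.Second)

    // write into the pipe; the goroutine inside FileWriter streams it to S3
    if _, err := io.Copy(w, strings.NewReader("streamed payload")); err != nil {
        fmt.Printf("write failed, %v", err)
    }

    // closing the writer lets the uploader see EOF and finish the upload
    w.Close()
    wg.Wait()
}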

Answer 3 (score: -1)

This is what I ended up writing:

func (s *S3Sink) upload() {
    now := time.Now()
    key := s.getNewKey(now)

    _, err := s.uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(key),
        Body:   s.bodyBuf,
    })

    if err != nil {
        glog.Errorf("Error uploading %s to s3, %v", key, err)
        return
    }
    glog.Infof("Uploaded at %s", key)
    s.lastUploadTimestamp = now.UnixNano()

    s.bodyBuf.Truncate(0)
}

More details here: https://github.com/heptiolabs/eventrouter/blob/20edca33bc6e20465810d49bdb213119464eb440/sinks/s3sink.go#L185-L201
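
For context, the type behind this method presumably looks roughly like the following (a sketch with field types inferred from the snippet above; see the link for the real definition):

type S3Sink struct {
    uploader            *s3manager.Uploader
    bucket              string
    bodyBuf             *bytes.Buffer // events accumulate here between flushes
    lastUploadTimestamp int64
}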