AWS CPP TransferManager and GetObjectRequest streaming to a file fstream: OOM

Time: 2017-05-19 16:49:08

Tags: c++ amazon-web-services aws-sdk

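I am using the AWS CPP SDK (https://github.com/aws/aws-iot-device-sdk-cpp) to download a file from S3 on a small Linux system (only 32 MB of RAM). I am using the GetObjectRequest class as shown below. It works well and downloads the file into an FStream on my system, so it does not use much RAM.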

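Now I would like to convert the download code to the TransferManager approach in order to get progress callbacks. I have rewritten part of the code, which is also shown below. It starts fine and prints the percentage downloaded, but when it reaches ~14 MB of RAM (about the amount free in Linux at the time of the download), the kernel kills it for using too much RAM.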

I create a file stream just as I did for GetObjectRequest. What am I doing wrong? How can I fix this? Thanks.

The old way, which does not use up all the RAM:

    // Old way
    GetObjectRequest getObjectRequest;
    getObjectRequest.SetBucket(bucket.c_str());
    getObjectRequest.SetKey(keyName.c_str());
    getObjectRequest.SetResponseStreamFactory([&destination](){
     return Aws::New<Aws::FStream>(
     "s3file", destination, std::ios_base::out); });

    GetObjectOutcome getObjectOutcome = SessionClient->GetObject(getObjectRequest);
    if(getObjectOutcome.IsSuccess())
    {
        std::cout << "<AWS DOWNLOAD> Get FW success!" << std::endl;
    }
    else
    {
        std::cout << "<AWS DOWNLOAD> Get FW failed: " << getObjectOutcome.GetError().GetMessage() << std::endl;
        exit(1);
    }

The new way, which ends up using too much RAM and gets killed by the kernel:

    // New way
    Aws::Transfer::TransferManagerConfiguration transferConfig;
    transferConfig.s3Client = SessionClient;

    std::shared_ptr<Aws::Transfer::TransferHandle> requestPtr(nullptr);

    transferConfig.downloadProgressCallback =
            [](const Aws::Transfer::TransferManager*, const Aws::Transfer::TransferHandle& handle)
    {
        std::cout << "\r" << "<AWS DOWNLOAD> Download Progress: " << static_cast<int>(handle.GetBytesTransferred() * 100.0 / handle.GetBytesTotalSize()) << " Percent " << handle.GetBytesTransferred() << " bytes\n";
    };

    Aws::Transfer::TransferManager transferManager(transferConfig);

    requestPtr = transferManager.DownloadFile(bucket.c_str(), keyName.c_str(), [&destination](){

        Aws::FStream *stream = Aws::New<Aws::FStream>("s3file", destination, std::ios_base::out);
        stream->rdbuf()->pubsetbuf(NULL, 0);

        return stream; });

    requestPtr->WaitUntilFinished();

    size_t retries = 0;
    // Just make sure we don't fail because a download part failed (e.g. network problems or interruptions).
    while (requestPtr->GetStatus() == Aws::Transfer::TransferStatus::FAILED && retries++ < 5)
    {
        std::cout << "<AWS DOWNLOAD> FW Download trying download again!" << std::endl;
        transferManager.RetryDownload(requestPtr);
        requestPtr->WaitUntilFinished();
    }

    // Check status
    if ( requestPtr->GetStatus() == Aws::Transfer::TransferStatus::COMPLETED ) {
        if ( requestPtr->GetBytesTotalSize() == requestPtr->GetBytesTransferred() ) {
            std::cout << "<AWS DOWNLOAD> Get FW success!" << std::endl;
            exit(0);
        }
        else {
            // Print both counters with a separator so the two numbers do not run together.
            std::cout << "<AWS DOWNLOAD> Get FW failed - Bytes downloaded did not equal requested number of bytes: " << requestPtr->GetBytesTotalSize() << " vs " << requestPtr->GetBytesTransferred() << std::endl;
            exit(1);
        }
    }
    else {
        std::cout << "<AWS DOWNLOAD> Get FW failed - download was never completed even after retries" << std::endl;
        exit(1);
    }

1 Answer:

Answer 0 (score: 1)

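TransferManager only makes things easier when the object is around 10 MB or larger and you want to take advantage of parallelization. It preallocates buffers up to a maximum heap size and will not grow the heap beyond that value. Given your RAM constraints, I would not use TransferManager. You can still receive progress notifications: check the callback mechanisms in the AmazonWebServiceRequest class.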
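To make the last point a bit more concrete, here is a minimal sketch of the bookkeeping such a callback would need. It is plain C++ with no SDK dependency; `ProgressTracker` and `OnBytesReceived` are hypothetical names made up for this illustration. The hookup shown in the trailing comment assumes the request exposes `SetDataReceivedEventHandler` with a handler taking `(const Aws::Http::HttpRequest*, Aws::Http::HttpResponse*, long long)`, and that the total object size is obtained separately (for example from a prior HeadObject call); please verify both against the headers of your SDK version.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (not part of the AWS SDK): accumulates per-chunk byte
// counts and reports every change in whole-percent progress exactly once.
class ProgressTracker {
public:
    using Reporter = std::function<void(int percent, std::uint64_t received)>;

    ProgressTracker(std::uint64_t totalBytes, Reporter reporter)
        : total_(totalBytes), reporter_(std::move(reporter)) {}

    // Call once per received chunk with the size of that chunk.
    void OnBytesReceived(long long chunkBytes) {
        if (chunkBytes <= 0) return;              // ignore empty or bogus chunks
        received_ += static_cast<std::uint64_t>(chunkBytes);
        if (total_ == 0) return;                  // unknown size: no percentage
        int percent = static_cast<int>(received_ * 100 / total_);
        if (percent > 100) percent = 100;         // clamp if more arrives than expected
        if (percent != lastPercent_) {
            lastPercent_ = percent;
            if (reporter_) reporter_(percent, received_);
        }
    }

    std::uint64_t Received() const { return received_; }
    int Percent() const { return lastPercent_; }  // -1 until the first report

private:
    std::uint64_t total_;
    std::uint64_t received_ = 0;
    int lastPercent_ = -1;
    Reporter reporter_;
};

// Assumed hookup with the SDK (not compiled here; check your SDK headers):
//
//   ProgressTracker tracker(totalSize, [](int pct, std::uint64_t n) {
//       std::cout << "<AWS DOWNLOAD> " << pct << " Percent " << n << " bytes\n";
//   });
//   getObjectRequest.SetDataReceivedEventHandler(
//       [&tracker](const Aws::Http::HttpRequest*, Aws::Http::HttpResponse*,
//                  long long bytes) { tracker.OnBytesReceived(bytes); });
```

Because the request still streams into the FStream from the response stream factory, memory use should stay as low as in the original GetObjectRequest code; the tracker only keeps a few counters.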