S3 multipart upload always fails on the second part with a timeout

Asked: 2019-02-07 12:14:18

Tags: java amazon-s3 kotlin

I'm trying to get a simple proof-of-concept multipart upload working in Kotlin with the Amazon S3 client, based on the documentation. The first part uploads fine and I get a response with an etag. The second part never uploads anything and times out. It always fails after the first part. Do I need to do some kind of manual connection cleanup?

Credentials and permissions are fine. The magic numbers below are just there to get above the 5MB minimum part size.
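
A quick back-of-the-envelope check of those numbers (assuming the line format used in the code below): each line is roughly 50 bytes, so 100,001 lines per part lands just above the 5 MB minimum S3 enforces for every part except the last.

// Rough size estimate for one generated part, using the same line template as
// the code below: ~49 fixed characters per line plus the digits of the counter.
fun main() {
    val partBytes = (0..100_000).sumOf { l ->
        "part 0 - Hello world for the $l'th time this part.\n".length.toLong()
    }
    println(partBytes) // roughly 5.4 million bytes, i.e. just over 5 MB
}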

What am I doing wrong here?

fun main() {
    val amazonS3 =
        AmazonS3ClientBuilder.standard().withRegion(Regions.EU_WEST_1).withCredentials(ProfileCredentialsProvider())
            .build()


    val bucket = "io.inbot.sandbox"
    val key = "test.txt"
    val multipartUpload =
        amazonS3.initiateMultipartUpload(InitiateMultipartUploadRequest(bucket, key))

    var pn=1
    var off=0L
    val etags = mutableListOf<PartETag>()

    for( i in 0.rangeTo(5)) {

        val buf = ByteArrayOutputStream()
        val writer = buf.writer().buffered()
        for(l in 0.rangeTo(100000)) {
            writer.write("part $i - Hello world for the $l'th time this part.\n")
        }
        writer.flush()
        writer.close()

        val bytes = buf.toByteArray()


        val md = MessageDigest.getInstance("MD5")
        md.update(bytes)
        val md5 = Base64.encodeBytes(md.digest())
        println("going to write ${bytes.size}")
        var partRequest = UploadPartRequest().withBucketName(bucket).withKey(key)
            .withUploadId(multipartUpload.uploadId)
            .withFileOffset(off)
            .withPartSize(bytes.size.toLong())
            .withPartNumber(pn++)
            .withMD5Digest(md5)
            .withInputStream(bytes.inputStream())
            .withGeneralProgressListener<UploadPartRequest> { it ->
                println(it.bytesTransferred)
            }
        if(i == 5) {
            partRequest = partRequest.withLastPart(true)
        }

        off+=bytes.size

        val partResponse = amazonS3.uploadPart(partRequest)

        etags.add(partResponse.partETag)
        println("part ${partResponse.partNumber} ${partResponse.eTag} ${bytes.size}")

    }
    val completeMultipartUpload =
        amazonS3.completeMultipartUpload(CompleteMultipartUploadRequest(bucket, key, multipartUpload.uploadId, etags))


}

This always fails on the second part with:

Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: F419872A24BB5526; S3 Extended Request ID: 48XWljQNuOH6LJG9Z85NJOGVy4iv/ru44Ai8hxEP+P+nqHECXZwWNwBoMyjiQfxKpr6icGFjxYc=), S3 Extended Request ID: 48XWljQNuOH6LJG9Z85NJOGVy4iv/ru44Ai8hxEP+P+nqHECXZwWNwBoMyjiQfxKpr6icGFjxYc=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1630)

Just to preempt some answers I'm not looking for: my goal is not to upload files, but to eventually be able to stream data of arbitrary length to S3 by simply uploading parts until done and then combining them. So I can't really use TransferManager, because that requires me to know the size in advance, which I won't. Buffering to a file isn't something I want to do either, since this will run in a dockerized server application. So I really just want to upload an arbitrary number of parts. I'd be happy to do it sequentially, though I wouldn't mind parallelism.

I've also used "com.github.alexmojaki:s3-stream-upload:1.0.1", but that seems to keep a lot of state in memory (which has bitten me a few times), so I'd like to replace it with something simpler.

Update: thanks to Ilya in the comments below. Removing withFileOffset fixes the problem.

1 Answer:

Answer 0 (score: 1)

Removing withFileOffset fixes the problem. Thanks to @Ilya for pointing this out.
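
For clarity, this is roughly what the part request from the question looks like once the offending call is dropped (a trimmed sketch of the loop body above; bucket, key, multipartUpload, bytes, md5 and pn are as defined there):

// Same UploadPartRequest as in the question, minus withFileOffset. The file
// offset is only meant for uploads that read from a file; combined with an
// input stream it seems to make the SDK expect more data than the stream
// delivers, so the connection idles until S3 closes it with a RequestTimeout.
val partRequest = UploadPartRequest().withBucketName(bucket).withKey(key)
    .withUploadId(multipartUpload.uploadId)
    .withPartSize(bytes.size.toLong())
    .withPartNumber(pn++)
    .withMD5Digest(md5)
    .withInputStream(bytes.inputStream())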

Here's a simple OutputStream implementation I came up with that actually works.

package io.inbot.aws

import com.amazonaws.auth.profile.ProfileCredentialsProvider
import com.amazonaws.regions.Regions
import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult
import com.amazonaws.services.s3.model.PartETag
import com.amazonaws.services.s3.model.UploadPartRequest
import mu.KotlinLogging
import java.io.ByteArrayOutputStream
import java.io.OutputStream
import java.security.MessageDigest
import java.util.Base64

private val logger = KotlinLogging.logger {  }
class S3Writer(
    private val amazonS3: AmazonS3,
    private val bucket: String,
    private val key: String,
    private val threshold: Int = 5 * 1024 * 1024 // matches S3's minimum size for every part except the last
) : OutputStream(), AutoCloseable {

    private val etags: MutableList<PartETag> = mutableListOf()

    private val multipartUpload: InitiateMultipartUploadResult = this.amazonS3.initiateMultipartUpload(InitiateMultipartUploadRequest(bucket, key))

    private val currentPart = ByteArrayOutputStream(threshold)

    private var partNumber = 1

    // Buffer one byte at a time; once more than threshold bytes are buffered,
    // upload them as the next part.
    override fun write(b: Int) {
        currentPart.write(b)
        if(currentPart.size() > threshold) {
            sendPart()
        }
    }

    // Upload the currently buffered bytes as one part, record its ETag and reset the buffer.
    private fun sendPart(last: Boolean = false) {
        logger.info { "sending part $partNumber" }
        currentPart.flush()

        val bytes = currentPart.toByteArray()

        val md = MessageDigest.getInstance("MD5")
        md.update(bytes)
        val md5 = Base64.getEncoder().encodeToString(md.digest())
        var partRequest = UploadPartRequest().withBucketName(bucket).withKey(key)
            .withUploadId(multipartUpload.uploadId)
            .withPartSize(currentPart.size().toLong())
            .withPartNumber(partNumber++)
            .withMD5Digest(md5)
            .withInputStream(bytes.inputStream())

        if(last) {
            logger.info { "final part" }
            partRequest = partRequest.withLastPart(true)
        }

        val partResponse = amazonS3.uploadPart(partRequest)

        etags.add(partResponse.partETag)

        currentPart.reset()

    }


    // Send any remaining buffered bytes as the final part, then complete the multipart upload.
    override fun close() {
        if(currentPart.size() > 0) {
            sendPart(true)
        }
        logger.info { "completing" }
        amazonS3.completeMultipartUpload(CompleteMultipartUploadRequest(bucket, key, multipartUpload.uploadId, etags))
    }

}


fun main() {
    val amazonS3 =
        AmazonS3ClientBuilder.standard().withRegion(Regions.EU_WEST_1).withCredentials(ProfileCredentialsProvider())
            .build()

    val bucket = "io.inbot.sandbox"
    val key = "test.txt"

    try {
        S3Writer(amazonS3, bucket, key).use {
            val w = it.bufferedWriter()
            for (i in 0.rangeTo(1000000)) {
                w.write("Line $i: hello again ...\n")
            }
            // Flush the BufferedWriter so any lines still sitting in its buffer
            // reach the S3Writer before use{} closes it and completes the upload.
            w.flush()
        }
    } catch (e: Throwable) {
        logger.error(e.message,e)
    }
}