How do I upload a large file to GCP Cloud Storage?

Time: 2018-12-05 09:09:34

Tags: java google-cloud-platform google-cloud-storage

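I have a 3 GB data file that I need to upload to GCP Cloud Storage. I tried the example from the GCP "Uploading objects" tutorial, but the upload fails with the following error.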

java.lang.OutOfMemoryError: Required array size too large

This is what I tried:

BlobId blobId = BlobId.of(gcpBucketName, "ft/"+file.getName());
BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("text/plain").build();
Blob blob = storage.get().create(blobInfo, Files.readAllBytes(Paths.get(file.getAbsolutePath())));
return blob.exists();

How can I fix this? Is it possible to upload large files with the GCP Cloud Storage Java client?

2 answers:

Answer 0: (score: 1)

This happens because the array that Files.readAllBytes would have to return is bigger than the maximum array size Java allows.
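To make the size limit concrete, here is a minimal sketch. The class and method names (ArrayLimitCheck, fitsInByteArray) are made up for illustration. It relies only on the fact that Java arrays are indexed by int; in practice the usable limit is slightly below Integer.MAX_VALUE and the heap must also be large enough, so treat this as an upper bound rather than a guarantee.

```java
class ArrayLimitCheck {
    // Java arrays are indexed by int, so a byte[] can never hold more
    // than Integer.MAX_VALUE (2,147,483,647) elements.
    static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    /** Returns true if a file of this size could, in principle, fit in one byte[]. */
    static boolean fitsInByteArray(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= MAX_ARRAY_LENGTH;
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // 3,221,225,472 bytes
        System.out.println(fitsInByteArray(threeGiB));
        System.out.println(fitsInByteArray(1_000_000L));
    }
}
```

A 3 GB file is above that bound, so no heap setting will make a single readAllBytes call succeed; the file has to be streamed or split.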

A workaround is to split the file into several byte arrays, upload them to the bucket as separate objects, and then merge them with the gsutil compose command.
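The splitting step could be planned roughly as follows. This is only a sketch of the offset arithmetic: PartPlanner, Part, and plan are hypothetical names, and the actual upload and compose calls are deliberately left out. Note that a single compose request accepts at most 32 source objects, so the part size should be chosen so the part count stays within that limit (or the compose has to be done in several rounds).

```java
import java.util.ArrayList;
import java.util.List;

class PartPlanner {
    /** One contiguous slice of the source file: [offset, offset + length). */
    static final class Part {
        final long offset;
        final int length;

        Part(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits a file of fileSize bytes into parts of at most partSize bytes each. */
    static List<Part> plan(long fileSize, int partSize) {
        if (fileSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and partSize > 0");
        }
        List<Part> parts = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += partSize) {
            // The last part may be shorter than partSize.
            int length = (int) Math.min(partSize, fileSize - offset);
            parts.add(new Part(offset, length));
        }
        return parts;
    }
}
```

Each Part is small enough to be read into its own byte array and uploaded as a separate object before the objects are composed.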

Answer 1: (score: 1)

Storage version:

  <artifactId>google-cloud-storage</artifactId>
  <version>1.63.0</version>

Preparation:

BlobId blobId = BlobId.of(BUCKET_NAME, date.format(BASIC_ISO_DATE) + "/" + prefix + "/" + file.getName());
BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("application/gzip").build();
uploadToStorage(storage, file, blobInfo);

Main method:

private void uploadToStorage(Storage storage, File uploadFrom, BlobInfo blobInfo) throws IOException {
    // For small files:
    if (uploadFrom.length() < 1_000_000) {
        byte[] bytes = Files.readAllBytes(uploadFrom.toPath());
        storage.create(blobInfo, bytes);
        return;
    }

    // For big files:
    // When content is not available or large (1MB or more) it is recommended to write it in chunks via the blob's channel writer.
    try (WriteChannel writer = storage.writer(blobInfo)) {

        byte[] buffer = new byte[10_240];
        try (InputStream input = Files.newInputStream(uploadFrom.toPath())) {
            int limit;
            while ((limit = input.read(buffer)) >= 0) {
                writer.write(ByteBuffer.wrap(buffer, 0, limit));
            }
        }

    }
}
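The same chunked-write idea can be tried out without any cloud dependency. The sketch below uses only the standard library and a hypothetical helper, ChunkedCopy.copy, that streams a file into any WritableByteChannel. Since the WriteChannel returned by storage.writer(blobInfo) is a WritableByteChannel, it should be usable as the target here, but that wiring is an assumption and is not shown or tested.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class ChunkedCopy {
    /**
     * Streams the file at source into target using a fixed-size buffer,
     * so memory use stays constant regardless of file size.
     * Returns the number of bytes written.
     */
    static long copy(Path source, WritableByteChannel target, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be > 0");
        }
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        long total = 0;
        try (ReadableByteChannel in = Files.newByteChannel(source)) {
            while (in.read(buffer) >= 0) {
                buffer.flip(); // switch the buffer from filling to draining
                // A channel may accept fewer bytes than offered, so drain in a loop.
                while (buffer.hasRemaining()) {
                    total += target.write(buffer);
                }
                buffer.clear(); // ready for the next read
            }
        }
        return total;
    }
}
```

The buffer size is a tuning knob: a larger buffer than the 10 KB used above means fewer write calls, at the cost of a little more memory.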