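I'm trying to write a log file to Google Cloud Storage in chunks from a Java environment. I have a process that parses a raw log file and produces JSON lines; I store the JSON lines in a buffer, and each time the buffer reaches about 5 MB I want to write to the same file in GCS, until the original source has been fully parsed. I have a similar setup that writes to AWS S3. I write in chunks because of memory constraints.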
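For readers following along, the buffering described above could be sketched roughly as below. This is only an illustration using the standard library: the class name `ChunkingLineBuffer`, the `sink` callback (a stand-in for whatever actually uploads a chunk), and the tiny threshold used in `main` are all assumptions, not part of the original setup.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects JSON lines and hands off a chunk once a size threshold is reached.
class ChunkingLineBuffer {
    private final int thresholdBytes;
    private final Consumer<byte[]> sink; // stand-in for the real upload call
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    ChunkingLineBuffer(int thresholdBytes, Consumer<byte[]> sink) {
        this.thresholdBytes = thresholdBytes;
        this.sink = sink;
    }

    // Appends one newline-terminated JSON line and flushes if the buffer is full enough.
    void append(String jsonLine) {
        byte[] bytes = (jsonLine + "\n").getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
        if (buffer.size() >= thresholdBytes) {
            flush();
        }
    }

    // Sends whatever is buffered; call once more after the source is fully parsed.
    void flush() {
        if (buffer.size() > 0) {
            sink.accept(buffer.toByteArray());
            buffer.reset();
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = new ArrayList<>();
        // A 16-byte threshold keeps the demo small; the question uses roughly 5 MB.
        ChunkingLineBuffer lines = new ChunkingLineBuffer(16, chunks::add);
        lines.append("{\"a\":1}");
        lines.append("{\"b\":2}");
        lines.append("{\"c\":3}");
        lines.flush();
        System.out.println(chunks.size() + " chunks handed to the sink");
    }
}
```

Only one chunk is held in memory at a time, which is the point of writing in chunks in the first place.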
I managed to write a file to GCS as follows (gcsService is a Storage object configured with authentication and so on):
private void uploadStream(String path, String name, String contentType, InputStream stream, String bucketName) throws IOException, GeneralSecurityException {
    // Wrap the stream so the client library can send it as the request body.
    InputStreamContent contentStream = new InputStreamContent(contentType, stream);
    // Object name plus a public-read ACL.
    StorageObject objectMetadata = new StorageObject()
        .setName(path + "/" + name)
        .setAcl(Arrays.asList(new ObjectAccessControl().setEntity("allUsers").setRole("READER")));
    // Single-request insert: the whole stream is uploaded as one object.
    Storage.Objects.Insert insertRequest = gcsService.objects()
        .insert(bucketName, objectMetadata, contentStream);
    insertRequest.execute();
}
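Unfortunately, I haven't been able to figure out how to write to GCS in chunks. Google's documentation seems to suggest two approaches. One involves a "resumable" insert request: https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload
The other approach involves a "compose" request: https://cloud.google.com/storage/docs/json_api/v1/objects/compose
I've been trying to get a "resumable" upload set up, but I can't get it to upload.
Any ideas? My specific questions are: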
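Answer 0 (score: 1)
Got it working, though it was a hassle. For the record, the answer to my question is:
I ended up with two methods: one to initiate the upload, and another to send a chunk of data.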
private String initiateResumableUpload() throws IOException {
    // Target object URL; bucket and path are fields of the enclosing class.
    String URI = "https://storage.googleapis.com/" + bucket + "/" + path;
    GenericUrl url = new GenericUrl(URI);
    // An empty POST with "x-goog-resumable: start" opens a resumable upload session.
    HttpRequest req = requestFactory.buildPostRequest(url, new ByteArrayContent("text/plain", new byte[0]));
    HttpHeaders headers = new HttpHeaders();
    headers.set("x-goog-resumable", "start");
    headers.setContentLength((long) 0);
    headers.setContentType("text/plain");
    req.setHeaders(headers);
    req.setReadTimeout((int) DEFAULT_TIMEOUT);
    // execute() already throws IOException on failure, so no catch-and-rethrow is needed.
    HttpResponse resp = req.execute();
    if (resp.getStatusCode() == 201) {
        // The Location header holds the session URI that every chunk is PUT to.
        return resp.getHeaders().getLocation();
    } else {
        throw new IOException("Unexpected status " + resp.getStatusCode() + " when initiating resumable upload");
    }
}
requestFactory should be built so that it knows about your properly generated credentials.
private void writeChunk(final boolean isFinalChunk) throws HttpResponseException, IOException {
    System.out.println("Writing chunk number " + chunkCount + ".");
    try (InputStream inputStream = new ByteBufInputStream(buffer)) {
        int length = Math.min(buffer.readableBytes(), DEFAULT_UPLOAD_CHUNK_SIZE);
        HttpContent contentsend = new InputStreamContent("text/plain", inputStream);
        // location is the session URI returned by initiateResumableUpload().
        String URI = location;
        GenericUrl url = new GenericUrl(URI);
        HttpRequest req = requestFactory.buildPutRequest(url, contentsend);
        // Compute the offset as a long so uploads larger than 2 GB do not overflow int.
        long offset = (long) chunkCount * DEFAULT_UPLOAD_CHUNK_SIZE;
        long limit = offset + length;
        HttpHeaders headers = new HttpHeaders();
        headers.setContentLength((long) length);
        // Intermediate chunks leave the total as "*"; the final chunk states the total size.
        headers.setContentRange("bytes " + (length == 0 ? "*" : offset + "-" + (limit - 1)) + (isFinalChunk ? "/" + limit : "/*"));
        req.setHeaders(headers);
        req.setReadTimeout((int) DEFAULT_TIMEOUT);
        try {
            req.execute();
        } catch (HttpResponseException e) {
            // The client reports 308 (Resume Incomplete) as an exception, but here it
            // means the chunk was accepted and more data is expected.
            if (e.getStatusCode() == 308) {
                ++chunkCount;
            } else {
                throw e;
            }
        }
    }
}
My buffer is an io.netty.buffer.ByteBuf.
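The Content-Range arithmetic in writeChunk is the easiest part to get wrong, so here is a minimal standalone sketch of it. `ContentRange` and `header` are hypothetical names, not part of any Google library. The sketch also enforces the documented rule that every chunk except the last should be a multiple of 256 KiB, which the code above only satisfies if DEFAULT_UPLOAD_CHUNK_SIZE is chosen accordingly.

```java
// Hypothetical helper (not a Google API): builds the Content-Range value for one chunk.
class ContentRange {
    // Chunks other than the final one are expected to be a multiple of 256 KiB.
    static final int CHUNK_GRANULARITY = 256 * 1024;

    static String header(long offset, long length, boolean isFinalChunk) {
        if (offset < 0 || length < 0) {
            throw new IllegalArgumentException("offset and length must be non-negative");
        }
        if (!isFinalChunk && (length == 0 || length % CHUNK_GRANULARITY != 0)) {
            throw new IllegalArgumentException("non-final chunks must be a positive multiple of 256 KiB");
        }
        long end = offset + length; // exclusive end; equals the total size on the final chunk
        String range = (length == 0) ? "*" : offset + "-" + (end - 1);
        String total = isFinalChunk ? Long.toString(end) : "*";
        return "bytes " + range + "/" + total;
    }

    public static void main(String[] args) {
        System.out.println(header(0, 262144, false));  // first chunk, total still unknown
        System.out.println(header(262144, 100, true)); // short final chunk, total now known
    }
}
```

Keeping the offsets as `long` avoids integer overflow once the upload grows past 2 GB.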
My GCS-related imports are:
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.http.ByteArrayContent;
import com.google.api.client.http.GenericUrl;
import com.google.api.client.http.HttpContent;
import com.google.api.client.http.HttpHeaders;
import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestFactory;
import com.google.api.client.http.HttpResponse;
import com.google.api.client.http.HttpResponseException;
import com.google.api.client.http.HttpTransport;
There may be some mistakes in the code above, but it did successfully write a file to GCS.
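One possible gap: the code treats every 308 response as "the whole chunk was stored". As far as I understand the resumable upload protocol, a 308 response carries a Range header such as `bytes=0-42` that says how many bytes the server has actually persisted, and no Range header means nothing has been persisted yet. A rough sketch of reading that value follows; `PersistedRange` and `nextOffset` are made-up names for illustration.

```java
// Hypothetical helper (not a Google API): works out where the next chunk should start.
class PersistedRange {
    // rangeHeader is the raw Range value from a 308 response, or null if the header was absent.
    static long nextOffset(String rangeHeader) {
        if (rangeHeader == null || rangeHeader.isEmpty()) {
            return 0L; // no Range header: nothing persisted yet
        }
        int dash = rangeHeader.lastIndexOf('-');
        if (!rangeHeader.startsWith("bytes=") || dash < 0) {
            throw new IllegalArgumentException("Unexpected Range header: " + rangeHeader);
        }
        // The header gives the last persisted byte (inclusive), so the next byte is one past it.
        return Long.parseLong(rangeHeader.substring(dash + 1).trim()) + 1;
    }

    public static void main(String[] args) {
        System.out.println(nextOffset("bytes=0-42"));
        System.out.println(nextOffset(null));
    }
}
```

Resuming from this offset, instead of assuming chunkCount times the chunk size, would make the upload more tolerant of partially stored chunks.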
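I also managed to get the job done with a different library and "compose" requests. But the "resumable" approach seems more appropriate.
Cheers, and good luck.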