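I need to create a large zip file in my Azure Blob Storage from files that live in another container of the same Azure Blob Storage account. There can be hundreds of thousands of files, adding up to several gigabytes of data.
Using WindowsAzure.Storage and SharpZipLib, I wrote the following code: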
<package id="WindowsAzure.Storage" version="7.2.1" targetFramework="net461" />
<package id="SharpZipLib" version="1.0.0" targetFramework="net461" />
-
CloudBlobContainer backupContainer = _myservice.GetBlobBackupContainer();
tempZipBlockBlob = container.GetBlockBlobReference(Path.Combine(MyBlobStoragePaths.BackupsTempFolder, backupId));
tempZipBlockBlob.DeleteIfExists();
using (CloudBlobStream blobStream = tempZipBlockBlob.OpenWrite(AccessCondition.GenerateIfNotExistsCondition()))
{
    using (var zipOutputStream = new ZipOutputStream(blobStream))
    {
        zipOutputStream.SetLevel(0);
        foreach (CloudBlockBlob cloudBlob in snapshots)
        {
            string blobName = Path.Combine(prefix, cloudBlob.Name.Replace('/', '\\'));
            try
            {
                var buffer = new byte[4096];
                using (Stream readStream = cloudBlob.OpenRead())
                {
                    var zipEntry = new ZipEntry(ZipEntry.CleanName(blobName)) {Size = readStream.Length};
                    zipOutputStream.PutNextEntry(zipEntry);
                    StreamUtils.Copy(readStream, zipOutputStream, buffer);
                }
                zipOutputStream.Flush();
            }
            catch (Exception e)
            {
                Log.Error(e, "CopySnapshotToZipOutputStream failed for {blobName}.", blobName);
                throw;
            }
        }
    }
}
The code works fine in my development environment and in tests against many containers, but it fails for one of them.
Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (413) The request body is too large and exceeds the maximum permissible limit. ---> System.Net.WebException: The remote server returned an error: (413) The request body is too large and exceeds the maximum permissible limit.
[10/17/2018 14:55:11 > 9d4928: INFO]    at T Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException(HttpStatusCode expectStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50
[10/17/2018 14:55:11 > 9d4928: INFO]    at RESTCommand Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.PutBlockImpl(Stream source, string blockId, string contentMD5, AccessCondition accessCondition, BlobRequestOptions options)+(RESTCommand cmd, HttpWebResponse response, Exception ex, OperationContext ctx) => {} in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlockBlob.cs:line 2380
[10/17/2018 14:55:11 > 9d4928: INFO]    at void Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse(IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
[10/17/2018 14:55:11 > 9d4928: INFO]    --- End of inner exception stack trace ---
[10/17/2018 14:55:11 > 9d4928: INFO]    at void Microsoft.WindowsAzure.Storage.Blob.BlobWriteStream.Flush() in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\BlobWriteStream.cs:line 234
[10/17/2018 14:55:11 > 9d4928: INFO]    at void ICSharpCode.SharpZipLib.Zip.Compression.Streams.DeflaterOutputStream.Finish()
[10/17/2018 14:55:11 > 9d4928: INFO]    at void ICSharpCode.SharpZipLib.Zip.ZipOutputStream.CloseEntry()
[10/17/2018 14:55:11 > 9d4928: INFO]    at void ICSharpCode.SharpZipLib.Zip.ZipOutputStream.PutNextEntry(ZipEntry entry)
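I have checked with the client and now have more logs. The files I am zipping total only 3.7 GB, spread across roughly 68,000 files.
It fails on a small jpg file of only 4.16 KB :\ after more than 49,000 files have already been written to the zip stream correctly.
Also, using the same WindowsAzure.Storage SDK, I can easily download the blobs to my machine, zip them into a local zip file, and then upload that file with the UploadBlob method.
Any idea what might be causing the problem?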
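I can see that there is a limit of 50,000 blocks per blob. My files are small, so I would not expect each of them to take up a block of its own, but is it possible that a new block is started every time I call zipOutputStream.PutNextEntry(zipEntry);?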
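To make the block-limit suspicion concrete, here is a rough back-of-the-envelope sketch. It assumes that every zip entry ends with a flush of the blob stream (the stack trace shows BlobWriteStream.Flush() being reached from CloseEntry()) and that each flush of a non-empty buffer uploads one block. The class name, the method name, and the 4 MB block size are assumptions for illustration only; zip header overhead is ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of WindowsAzure.Storage.
public static class BlockEstimate
{
    // Limit on the number of blocks in a single block blob.
    public const int MaxBlocksPerBlob = 50000;

    // Assumption: each entry is flushed on its own, so it costs at least
    // one block, and larger entries cost one block per full buffer.
    public static long EstimateBlocks(IEnumerable<long> entrySizes, long blockSizeBytes)
    {
        if (blockSizeBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(blockSizeBytes));

        long blocks = 0;
        foreach (long size in entrySizes)
        {
            // Ceiling division, with a floor of one block per entry.
            long needed = (size + blockSizeBytes - 1) / blockSizeBytes;
            blocks += Math.Max(1, needed);
        }
        return blocks;
    }

    public static bool ExceedsLimit(IEnumerable<long> entrySizes, long blockSizeBytes)
    {
        return EstimateBlocks(entrySizes, blockSizeBytes) > MaxBlocksPerBlob;
    }
}
```

Under these assumptions, 68,000 small files would need 68,000 blocks, which is over the limit even though the total size is only a few gigabytes. A failure shortly after 49,000 files would fit if a handful of larger files used more than one block each.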
If that is the case, how can I work around it?
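If per-entry flushes do turn out to be the cause, one possible workaround is to place a thin wrapper between ZipOutputStream and the blob stream that ignores Flush(), so that data only goes out when the blob stream's own buffer fills up or the stream is closed. The sketch below is untested against Azure; FlushSuppressingStream is a hypothetical class, not part of either library, and it assumes the blob stream commits its remaining data on Dispose.

```csharp
using System;
using System.IO;

// Hypothetical write-only wrapper: forwards writes, swallows Flush().
public sealed class FlushSuppressingStream : Stream
{
    private readonly Stream _inner;

    public FlushSuppressingStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        _inner = inner;
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return _inner.CanWrite; } }
    public override long Length { get { throw new NotSupportedException(); } }

    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    // Intentionally a no-op: the inner stream decides when to send data.
    public override void Flush() { }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _inner.Write(buffer, offset, count);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    protected override void Dispose(bool disposing)
    {
        // Disposing the wrapper disposes the inner stream, which is
        // where any remaining buffered data gets written out.
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
```

With this in place, the zip stream would be created as new ZipOutputStream(new FlushSuppressingStream(blobStream)), and the explicit zipOutputStream.Flush() after each entry would no longer reach the blob stream.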