Looping through a directory tree structure in Azure

Time: 2016-05-13 14:08:57

Tags: c# azure azure-storage azure-storage-blobs

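I have a large number of XML files in Azure Blob storage. The files are stored in a tree structure. There is a root directory named QA, and inside QA there are subdirectories for each year [e.g. 2015, 2016]. Inside each year folder there are subdirectories for each month [e.g. 01, 02, 03 ... 12], and inside each month there are subdirectories for each day. The XML files live in these day folders.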

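I wrote code that processes the XML files and saves them back to the same location, but it only works when I point it at the exact directory that contains the files; it does not descend into the inner directories. How can I make it traverse every subdirectory and file?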

QA\2015\01\01\file1<datetimestamp>.xml
    -------01\file2<datetimestamp>.xml
    -------01\file3<datetimestamp>.xml
  ------\01\02\file1<datetimestamp>.xml
  ------\01\02\file2<datetimestamp>.xml
  ...
  ...
  ...
  ------\02\01\file1<datetimestamp>.xml
  ...
  ...
  ...
QA\2016\01\01\file1<datetimestamp>.xml
    -------01\file2<datetimestamp>.xml
    -------01\file3<datetimestamp>.xml
  ------\01\02\file1<datetimestamp>.xml
  ------\01\02\file2<datetimestamp>.xml
  ...
  ...
  ...
  ------\02\01\file1<datetimestamp>.xml

I am looking for a way to implement something like this:
for (year )
    {
        for (month)
            {
                for day
                    <My code goes here - pick the files scrub unnecessary data and save as a new file in same location>
                     --- Now I need to figure out how to save in the same location as well
            }
    }

Also, this does not work for me: var blobs = container.ListBlobs(prefix: "container-directory", useFlatBlobListing: true); because I am not just listing. I want to go into each directory and process the files.

My code (it only produces errors):

     CloudBlobClient bc = sa.CreateCloudBlobClient();

    // Get a reference to the container
    CloudBlobContainer container = bc.GetContainerReference(ContainerNameStr);

    var blobs = bc.ListBlobs(prefix: InitialLocDir, useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Metadata);
    foreach (CloudBlockBlob blob in blobs)
    {
        blob.AcquireLease(TimeSpan.FromSeconds(15), null);
        var blocks = blob.DownloadBlockList(BlockListingFilter.Committed).ToList();
        foreach (var block in blocks)
        {
            MemoryStream sourceStream = new MemoryStream();
            blob.DownloadRangeToStream(sourceStream, 0, block.Length);
            // Modify the stream here
            Gpg gpg1 = new Gpg();
            MemoryStream destStream = new MemoryStream();
            gpg1.Passphrase = Phrase;
            gpg1.BinaryPath = @"C:/Program Files (x86)/GNU/GnuPG/gpg2.exe";
            sourceStream.Position = 0;
            gpg1.Decrypt(sourceStream, destStream);

            destStream.Position = 0;
            StreamReader reader = new StreamReader(destStream);
            string xmlfile = "";
            xmlfile = reader.ReadToEnd();
            blob.PutBlock(block.Name, destStream, null, null, null, null);
            Console.WriteLine(xmlfile);
        }

        Console.Read();

        blob.PutBlockList(blocks, null, null, null);
    }

1 Answer:

Answer 0 (score: 1)

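I have exactly the same directory structure, with one additional folder level for each hour of the day. I process every blob in every folder and "scrub" them.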

Some pseudocode:

  • blobs = cloudBlobClient.ListBlobs(prefix: rootFolder, useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Metadata);
  • foreach (blob in blobs)
    • leaseId = blob.AcquireLease
    • var blocks = blob.DownloadBlockList(BlockListingFilter.Committed).ToList();
    • foreach (block in blocks)
      • blob.DownloadRangeToStream(dataStream, position, block.Length)
      • modify the stream as needed
      • if modified, blob.PutBlock(block.Name, adjustedStream ...)
    • blob.PutBlockList(blocksToPush, new AccessCondition() { LeaseId = leaseId })
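
The steps above could look roughly like the following in C#. This is only a sketch, assuming the legacy Microsoft.WindowsAzure.Storage SDK (a version where PutBlock still takes a string contentMD5). BlobScrubber, ScrubAll and Scrub are hypothetical names, and the connection string, container name and root folder are placeholders you would supply. It also assumes each block can be transformed on its own, which may not hold for encrypted or compressed payloads.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

static class BlobScrubber
{
    // Hypothetical placeholder: replace with the real per-block transformation.
    static byte[] Scrub(byte[] input) => input;

    public static void ScrubAll(string connectionString, string containerName, string rootFolder)
    {
        CloudBlobContainer container = CloudStorageAccount.Parse(connectionString)
            .CreateCloudBlobClient()
            .GetContainerReference(containerName);

        // A flat listing returns every blob under the prefix, at any depth.
        var blobs = container
            .ListBlobs(prefix: rootFolder, useFlatBlobListing: true,
                       blobListingDetails: BlobListingDetails.Metadata)
            .OfType<CloudBlockBlob>();

        foreach (CloudBlockBlob blob in blobs)
        {
            // Hold a lease so no other writer touches the blob meanwhile.
            // Note: a 60 second lease can expire during slow processing.
            string leaseId = blob.AcquireLease(TimeSpan.FromSeconds(60), null);
            AccessCondition lease = AccessCondition.GenerateLeaseCondition(leaseId);
            try
            {
                // Blobs uploaded in a single request have no committed blocks;
                // those are skipped here.
                var blocks = blob.DownloadBlockList(BlockListingFilter.Committed).ToList();
                if (blocks.Count == 0) continue;

                long offset = 0; // running position of the current block
                foreach (ListBlockItem block in blocks)
                {
                    byte[] original;
                    using (var source = new MemoryStream())
                    {
                        blob.DownloadRangeToStream(source, offset, block.Length);
                        original = source.ToArray();
                    }
                    offset += block.Length;

                    // Stage the modified block under the same block ID.
                    using (var modified = new MemoryStream(Scrub(original)))
                    {
                        blob.PutBlock(block.Name, modified, null, lease);
                    }
                }

                // Commit the staged blocks; the blob keeps its name and path,
                // so the result is saved in the same location.
                blob.PutBlockList(blocks.Select(b => b.Name), lease);
            }
            finally
            {
                blob.ReleaseLease(lease);
            }
        }
    }
}
```

Because the blocks are written back to the same blob, nothing extra is needed to "save in the same location". If the processed file should get a new name instead, one option is to upload to a different blob reference obtained from the same container.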

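With useFlatBlobListing set to true, client.ListBlobs essentially does the work of enumerating all blobs in all folders. After that, you just iterate over each blob, fetch its blocks, and process them.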
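
If you still want to work through the files day by day, as in the nested year/month/day loop from the question, the names from a flat listing can be grouped after the fact. The sketch below is plain C# with no Azure calls. BlobPathGrouping and GroupByDay are hypothetical names, and it assumes blob names use "/" as the delimiter and follow the root/year/month/day/file layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BlobPathGrouping
{
    // Groups names such as "QA/2015/01/02/file1.xml" by their
    // "QA/2015/01/02" day folder and returns the groups sorted by that
    // key, which gives year, then month, then day order.
    public static IEnumerable<IGrouping<string, string>> GroupByDay(
        IEnumerable<string> blobNames)
    {
        return blobNames
            .Select(name => new { name, parts = name.Split('/') })
            // Keep only names that have root/year/month/day/file segments.
            .Where(x => x.parts.Length >= 5)
            .GroupBy(x => string.Join("/", x.parts.Take(4)),
                     x => x.name,
                     StringComparer.Ordinal)
            .OrderBy(g => g.Key, StringComparer.Ordinal);
    }
}
```

Each resulting group has the day folder as its Key and the blob names in that folder as its elements, so a single foreach over the groups visits the days in order without walking the hierarchy yourself.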