Azure Data Lake: trouble moving data from Blob storage to ADLS

Asked: 2018-03-10 18:46:04

Tags: c# azure azure-functions azure-data-lake azure-blob-storage

I am writing an Azure Function in C# that does the following:

  • Extracts a zip file from a blob
  • Unzips it and copies the contents to Azure Data Lake Store.

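I am able to unzip the file and upload it to another blob using UploadFromStreamAsync(stream).

However, I am having trouble doing the same for ADLS.

I followed the linked question "Upload to ADLS from file stream" and tried first creating the file with adlsFileSystemClient.FileSystem.Create and then appending the stream in the data lake with adlsFileSystemClient.FileSystem.Append, but it does not work. The Create call produces a zero-byte file, the append does nothing, and the Azure Function still completes successfully without any error. I also tried adlsFileSystemClient.FileSystem.AppendAsync, but the problem persists.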

Code:

    // Save the blob (zip file) contents to a MemoryStream.
    using (var zipBlobFileStream = new MemoryStream())
    {
        await blockBlob.DownloadToStreamAsync(zipBlobFileStream);
        await zipBlobFileStream.FlushAsync();
        zipBlobFileStream.Position = 0;

        // Use ZipArchive from System.IO.Compression to extract all the files from the zip file.
        using (var zip = new ZipArchive(zipBlobFileStream))
        {
            // Each entry here represents an individual file or a folder.
            foreach (var entry in zip.Entries)
            {
                string destfilename = $"{destcontanierPath2}/" + entry.FullName;
                log.Info($"DestFilename: {destfilename}");
                // Reference to an (initially empty) block blob with the same name as the entry.
                var blob = extractcontainer.GetBlockBlobReference($"{destfilename}");
                using (var stream = entry.Open())
                {
                    // Check for file or folder and fill the blob reference above with the stream's content.
                    if (entry.Length > 0)
                    {
                        await blob.UploadFromStreamAsync(stream);

                        // Create the file, then append to it.
                        adlsFileSystemClient.FileSystem.Create(_adlsAccountName, "/raw/Hello.txt", overwrite: true);
                        // Append the stream to Azure Data Lake.
                        using (var ms = new MemoryStream())
                        {
                            stream.CopyTo(ms);
                            ms.Position = 0; // rewind
                            log.Info($"**********MemoryStream: {ms}");
                            // do something with ms
                            await adlsFileSystemClient.FileSystem.AppendAsync(_adlsAccountName, "/raw/Hello.txt", ms, 0);
                        }
                    }
                }
            }
        }
    }

New temporary workaround:

    using (var zipBlobFileStream = new MemoryStream())
    {
        await blockBlob.DownloadToStreamAsync(zipBlobFileStream);
        using (var zip = new ZipArchive(zipBlobFileStream))
        {
            // Each entry here represents an individual file or a folder.
            foreach (var entry in zip.Entries)
            {
                entry.ExtractToFile(directoryPath + entry.FullName, true);
                // Upload the file to ADLS.
                var parameters = new UploadParameters(directoryPath + entry.FullName, "/raw/" + md5, _adlsAccountName, isOverwrite: true, maxSegmentLength: 268435456 * 2);
                var frontend = new Microsoft.Azure.Management.DataLake.StoreUploader.DataLakeStoreFrontEndAdapter(_adlsAccountName, adlsFileSystemClient);
                var uploader = new DataLakeStoreUploader(parameters, frontend);
                uploader.Execute();
                File.Delete(directoryPath + entry.FullName);
            }
        }
    }

1 answer:

Answer 0 (score: 0)

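In your case, you can change your code as follows and it should work. Move the file-creation code out of the foreach loop, so the file is created once instead of being overwritten for every entry. The code below also copies each entry into a MemoryStream first, so the same bytes can be uploaded to the blob and then appended to ADLS after rewinding.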

    // Create the target file once, before the loop, then append to it.
    adlsFileSystemClient.FileSystem.Create(_adlsAccountName, "/raw/Hello.txt", overwrite: true);

    using (var zipBlobFileStream = new MemoryStream())
    {
        await blockBlob.DownloadToStreamAsync(zipBlobFileStream);
        await zipBlobFileStream.FlushAsync();
        zipBlobFileStream.Position = 0;

        // Use ZipArchive from System.IO.Compression to extract all the files from the zip file.
        using (var zip = new ZipArchive(zipBlobFileStream))
        {
            // Each entry here represents an individual file or a folder.
            foreach (var entry in zip.Entries)
            {
                string destfilename = $"{destcontanierPath2}/" + entry.FullName;
                log.Info($"DestFilename: {destfilename}");
                // Reference to an (initially empty) block blob with the same name as the entry.
                var blob = extractcontainer.GetBlockBlobReference($"{destfilename}");
                using (var stream = entry.Open())
                {
                    // Only process entries that have content (folders have length 0).
                    if (entry.Length > 0)
                    {
                        // Buffer the entry so the same bytes can go to both destinations.
                        using (MemoryStream ms = new MemoryStream())
                        {
                            stream.CopyTo(ms);
                            ms.Position = 0;
                            blob.UploadFromStream(ms);
                            ms.Position = 0; // rewind before appending to ADLS
                            adlsFileSystemClient.FileSystem.Append(_adlsAccountName, "/raw/Hello.txt", ms);
                        }
                    }
                }
            }
        }
    }
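
A likely reason the original Append wrote nothing: in the question's code the entry stream is first passed to UploadFromStreamAsync(stream), which reads it to the end, and only then copied into the MemoryStream, so that copy is probably empty. The stream returned by entry.Open() on an archive opened for reading is forward-only and cannot be rewound. The sketch below tries to show this with System.IO.Compression alone. It makes no Azure calls; the class and method names (ZipEntryStreamDemo, BuildZip, ReadTwiceUnbuffered, ReadTwiceBuffered) are made up for illustration, and the in-memory zip stands in for the downloaded blob.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipEntryStreamDemo
{
    // Builds a one-entry zip archive in memory (stand-in for the downloaded blob).
    public static MemoryStream BuildZip(string entryName, string content)
    {
        var zipStream = new MemoryStream();
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create, leaveOpen: true))
        {
            var entry = archive.CreateEntry(entryName);
            using (var entryStream = entry.Open())
            {
                byte[] bytes = Encoding.ASCII.GetBytes(content);
                entryStream.Write(bytes, 0, bytes.Length);
            }
        }
        zipStream.Position = 0;
        return zipStream;
    }

    // Copies whatever is left in the source stream and returns the byte count.
    private static long CopyAndCount(Stream source)
    {
        using (var sink = new MemoryStream())
        {
            source.CopyTo(sink);
            return sink.Length;
        }
    }

    // Mirrors the question: the entry stream is consumed twice without buffering.
    public static long[] ReadTwiceUnbuffered(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        using (var stream = archive.Entries[0].Open())
        {
            long first = CopyAndCount(stream);  // stands in for the blob upload
            long second = CopyAndCount(stream); // stands in for the copy sent to Append
            return new[] { first, second };
        }
    }

    // Mirrors the answer: buffer once, rewind before each consumer.
    public static long[] ReadTwiceBuffered(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        using (var stream = archive.Entries[0].Open())
        using (var ms = new MemoryStream())
        {
            stream.CopyTo(ms);
            ms.Position = 0;
            long first = CopyAndCount(ms);
            ms.Position = 0; // rewind so the second consumer sees the data again
            long second = CopyAndCount(ms);
            return new[] { first, second };
        }
    }

    public static void Main()
    {
        using (var zip = BuildZip("a.txt", "hello"))
        {
            long[] unbuffered = ReadTwiceUnbuffered(zip);
            zip.Position = 0;
            long[] buffered = ReadTwiceBuffered(zip);
            Console.WriteLine($"unbuffered: first={unbuffered[0]}, second={unbuffered[1]}");
            Console.WriteLine($"buffered:   first={buffered[0]}, second={buffered[1]}");
        }
    }
}
```

Run as a console program, the unbuffered path should report zero bytes on the second read, while the buffered path should report the same count both times. That would match the symptom in the question: a file that is created but stays at zero bytes, with no error raised.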