Writing data to a file in ADLS

Time: 2019-03-18 14:52:15

Tags: c# .net serialization azure-data-lake

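I have a List collection of base-class objects obtained from JSON deserialization. Before writing the data to a table, I need to keep a copy of it in Azure Data Lake. With the sample code below I can create a folder and a sample file. Please advise how to write the data from the List collection directly to a file in ADLS.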

Code:

            Console.WriteLine("Folder Creation Started...");
            Console.WriteLine("================================================");
            var adlsAccountName = "sampledatalake";

            var people = new List<Person> { new Person { FirstName="John", LasttName="Matthew"}, new Person { FirstName = "John", LasttName = "Smith" } };

            var applicationId = "<Applicationid>";
            var secretKey = "<secret key>";
            var tenantId = "<tenantid>";

            var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, applicationId, secretKey).Result;
            var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds, clientTimeoutInMinutes: 60);
            var filePath = "/Sample/" + DateTime.Now.Year.ToString() + "/" + DateTime.Now.Month.ToString("00") + "/" + DateTime.Now.Day.ToString("00");

            if (!adlsFileSystemClient.FileSystem.PathExists(adlsAccountName, filePath))
            { 
                adlsFileSystemClient.FileSystem.Mkdirs(adlsAccountName, filePath);
            }

            adlsFileSystemClient.FileSystem.Create(adlsAccountName, filePath+"/Sample.txt", null, null, null, null, null);

================= EDIT ===========

This is what I ended up with for writing data to a file in ADLS. Please let me know whether this approach has any limitations.

            var adlsAccountName = "sampledatalake";

            var people = new List<Person> { new Person { FirstName="John", LasttName="Matthew"}, new Person { FirstName = "John", LasttName = "Smith" } };

            var applicationId = "<Applicationid>";
            var secretKey = "<secret key>";
            var tenantId = "<tenantid>";

            var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, applicationId, secretKey).Result;
            var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds, clientTimeoutInMinutes: 60);
            var filePath = "/Sample/" + DateTime.Now.Year.ToString() + "/" + DateTime.Now.Month.ToString("00") + "/" + DateTime.Now.Day.ToString("00");

            if (!adlsFileSystemClient.FileSystem.PathExists(adlsAccountName, filePath))
            { 
                adlsFileSystemClient.FileSystem.Mkdirs(adlsAccountName, filePath);
            }

            adlsFileSystemClient.FileSystem.Create(adlsAccountName, filePath+"/Sample.txt", null, null, null, null, null);

            using (MemoryStream memStreamLikes = new MemoryStream())
            {
                using (TextWriter textWriter = new StreamWriter(memStreamLikes))
                {
                    string text;
                    textWriter.WriteLine("First Name, Last Name");
                    foreach (var item in people)
                    {
                        text = item.FirstName + "," + item.LasttName;
                        textWriter.WriteLine(text);
                    }
                    textWriter.Flush();
                    memStreamLikes.Flush();

                    byte[] textByteArray = memStreamLikes.ToArray();
                    adlsFileSystemClient.FileSystem.Append(adlsAccountName, filePath + "/Sample.txt", new MemoryStream(textByteArray,0,textByteArray.Length), null, null, null, null);
                }
            }

1 Answer:

Answer 0 (score: 0)

Here is what you need to do:

Create a Data Lake client object:

var adlsClient = AdlsClient.CreateClient(adlsName, adlCreds);
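The line above does not show where adlsName and adlCreds come from. A minimal sketch, assuming the same service principal placeholders as in the question and the Microsoft.Azure.DataLake.Store and Microsoft.Rest.ClientRuntime.Azure.Authentication packages; the AdlsClientFactory class and its ToFqdn helper are hypothetical names used only for illustration:

```csharp
using Microsoft.Azure.DataLake.Store;
using Microsoft.Rest;
using Microsoft.Rest.Azure.Authentication;

public static class AdlsClientFactory
{
    // Placeholders taken from the question; replace with real values.
    private const string TenantId = "<tenantid>";
    private const string ApplicationId = "<Applicationid>";
    private const string SecretKey = "<secret key>";

    // Assumption: AdlsClient wants the fully qualified account name,
    // not the short name used with DataLakeStoreFileSystemManagementClient.
    public static string ToFqdn(string accountName)
    {
        return accountName + ".azuredatalakestore.net";
    }

    public static AdlsClient Create(string accountName)
    {
        // Same silent service-principal login as in the question.
        ServiceClientCredentials adlCreds = ApplicationTokenProvider
            .LoginSilentAsync(TenantId, ApplicationId, SecretKey)
            .GetAwaiter()
            .GetResult();

        return AdlsClient.CreateClient(ToFqdn(accountName), adlCreds);
    }
}
```

With this in place, AdlsClientFactory.Create("sampledatalake") would return a client for the account named in the question.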

Here is a sample method that saves data to a TSV file in the data lake:

public static void ProcessUserLikes(this SocialEntity socialProfile, AdlsClient adlsClient, string fileNameExtension = "")
        {
            using (MemoryStream memStreamLikes = new MemoryStream())
            {
                using (TextWriter textWriter = new StreamWriter(memStreamLikes))
                {
                    string header = FacebookHelper.GetHeader(delim, Entities.FBEnitities.Like);
                    string likes;
                    string fileName = adlsInputPath + fileNameExtension + "/likes.tsv";
                    adlsClient.DataLakeFileHandler(textWriter, header, fileName);

                    for (int i = 0; i < socialProfile.Likes.Count; i++)
                    {
                        for (int j = 0; j < socialProfile.Likes[i].Category_List.Count; j++)
                        {
                            likes = socialProfile.UserID
                                            + delim + socialProfile.FacebookID
                                            + delim + socialProfile.Likes[i].ID
                                            + delim + socialProfile.Likes[i].Name
                                            + delim + socialProfile.Likes[i].Category_List[j].ID
                                            + delim + socialProfile.Likes[i].Category_List[j].Name
                                            + delim + socialProfile.Likes[i].Created_time;
                            textWriter.WriteLine(likes);
                        }
                    }
                    textWriter.Flush();
                    memStreamLikes.Flush();
                    adlsClient.DataLakeUpdateHandler(fileName, memStreamLikes);
                }
            }
        }
        private static void DataLakeFileHandler(this AdlsClient adlsClient, TextWriter textWriter, string header, string fileName = "")
        {
            if (!adlsClient.CheckExists(fileName))
            {
                textWriter.WriteLine(header);
            }
        }

        public static void DataLakeUpdateHandler(this AdlsClient adlsClient, string fileName, MemoryStream memStream)
        {
            if (!adlsClient.CheckExists(fileName))
            {
                using (var stream = adlsClient.CreateFile(fileName, IfExists.Overwrite))
                {
                    byte[] textByteArray = memStream.ToArray();
                    stream.Write(textByteArray, 0, textByteArray.Length);
                }
            }
            else
            {
                memStream.Seek(0, SeekOrigin.Begin);
                using (var stream = adlsClient.GetAppendStream(fileName))
                {
                    byte[] textByteArray = memStream.ReadFully();
                    if (textByteArray.Length > 0)
                    {
                        stream.Write(textByteArray, 0, textByteArray.Length);
                    }
                }
            }
        }
        public static byte[] ReadFully(this MemoryStream input)
        {
            using (MemoryStream ms = new MemoryStream())
            {
                input.CopyTo(ms);
                return ms.ToArray();
            }
        }

You can adapt this to your needs; it includes sample methods for both creating and updating a file.
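Applied to the List<Person> from the question, the same idea could look like the sketch below. The question's EDIT joins fields with a plain comma, which breaks as soon as a name contains a comma or a quote, so this version quotes such fields. The PersonCsv class and its Escape and Write methods are hypothetical names; Person mirrors the class implied by the question, including the LasttName spelling.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class Person
{
    public string FirstName { get; set; }
    public string LasttName { get; set; } // spelling kept from the question
}

public static class PersonCsv
{
    // Quote a field when it contains a comma, a quote, or a line break,
    // doubling any embedded quotes.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Write a header and one line per person to any writable stream.
    // The stream is left open so the caller decides when to dispose it.
    public static void Write(IEnumerable<Person> people, Stream target)
    {
        using (var writer = new StreamWriter(target, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            writer.NewLine = "\n";
            writer.WriteLine("First Name,Last Name");
            foreach (var person in people)
            {
                writer.WriteLine(Escape(person.FirstName) + "," + Escape(person.LasttName));
            }
        }
    }
}
```

Because Write accepts any Stream, the target could be the stream returned by adlsClient.CreateFile(fileName, IfExists.Overwrite) from the code above, which would avoid holding the whole file in a MemoryStream first. For large lists that is probably the main limitation of the in-memory approach.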

Hope this helps.