Running a U-SQL script from C# code using Azure Data Factory

Time: 2017-01-16 12:51:23

Tags: c# azure azure-data-factory azure-data-lake u-sql

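I am trying to run a U-SQL script on Azure from C# code. After the code executes, everything is created on Azure (the data factory, linked services, pipeline, and datasets), but ADF does not execute the U-SQL script. I think there is a problem with the start time and end time configured in the pipeline code.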

I followed this article to build the console application: Create, monitor, and manage Azure data factories using Data Factory .NET SDK

The complete C# project can be downloaded here: https://1drv.ms/u/s!AltdTyVEmoG2ijOupx-EjCM-8Zk4

Can someone please help me find my mistake?

C# code used to configure the pipeline:

        DateTime PipelineActivePeriodStartTime = new DateTime(2017, 1, 12, 0, 0, 0, 0, DateTimeKind.Utc);
        DateTime PipelineActivePeriodEndTime = PipelineActivePeriodStartTime.AddMinutes(60);
        string PipelineName = "ComputeEventsByRegionPipeline";

        var usqlparams = new Dictionary<string, string>();
        usqlparams.Add("in", "/Samples/Data/SearchLog.tsv");
        usqlparams.Add("out", "/Output/testdemo1.tsv");

        client.Pipelines.CreateOrUpdate(resourceGroupName, dataFactoryName,
        new PipelineCreateOrUpdateParameters()
        {
            Pipeline = new Pipeline()
            {
                Name = PipelineName,
                Properties = new PipelineProperties()
                {
                    Description = "This is a demo pipe line.",

                    // Initial value for pipeline's active period. With this, you won't need to set slice status
                    Start = PipelineActivePeriodStartTime,
                    End = PipelineActivePeriodEndTime,
                    IsPaused = false,

                    Activities = new List<Activity>()
                    {
                        new Activity()
                        {
                            TypeProperties = new DataLakeAnalyticsUSQLActivity("@searchlog = EXTRACT UserId int, Start DateTime, Region string, Query string, Duration int?, Urls string, ClickedUrls string FROM @in USING Extractors.Tsv(nullEscape:\"#NULL#\"); @rs1 = SELECT Start, Region, Duration FROM @searchlog; OUTPUT @rs1 TO @out USING Outputters.Tsv(quoting:false);")
                            {
                                DegreeOfParallelism = 3,
                                Priority = 100,
                                Parameters = usqlparams
                            },
                            Inputs = new List<ActivityInput>()
                            {
                                new ActivityInput(Dataset_Source)
                            },
                            Outputs = new List<ActivityOutput>()
                            {
                                new ActivityOutput(Dataset_Destination)
                            },
                            Policy = new ActivityPolicy()
                            {
                                Timeout = new TimeSpan(6,0,0),
                                Concurrency = 1,
                                ExecutionPriorityOrder = ExecutionPriorityOrder.NewestFirst,
                                Retry = 1
                            },
                            Scheduler = new Scheduler()
                            {
                                Frequency = "Day",
                                Interval = 1
                            },
                            Name = "EventsByRegion",
                            LinkedServiceName = "AzureDataLakeAnalyticsLinkedService"
                        }
                    }
                }
            }
        });

I just noticed something in the Azure Data Factory "Monitor and Manage" view: the pipeline's status is Waiting: DatasetDependencies. Do I need to change something in my code?

1 answer:

Answer 0 (score: 2)

If no other activity in the data factory produces the source dataset, you need to add the following property to that dataset:

"external": true

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-faq

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-create-datasets
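
For context, here is a rough sketch of where the "external": true property would sit in a (version 1) dataset JSON definition. The dataset type and availability are assumptions chosen to match the question (a Data Lake Store file used as the U-SQL input, with a daily schedule like the activity's Scheduler), and the values in angle brackets are placeholders, not names taken from the asker's project. In the Data Factory .NET SDK, the same setting should correspond to the External property on DatasetProperties for the source dataset.

```json
{
    "name": "<source-dataset-name>",
    "properties": {
        "type": "AzureDataLakeStore",
        "linkedServiceName": "<data-lake-store-linked-service-name>",
        "typeProperties": {
            "folderPath": "Samples/Data/",
            "fileName": "SearchLog.tsv"
        },
        "external": true,
        "availability": {
            "frequency": "Day",
            "interval": 1
        }
    }
}
```

With the source marked as external, Data Factory treats its slices as produced outside the factory instead of waiting for an upstream activity to generate them, which is consistent with the Waiting: DatasetDependencies status described in the question.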