Input dataset is not valid

Date: 2016-12-22 07:04:09

Tags: azure-data-factory azure-data-lake u-sql

I have created an Azure Data Factory that uses the "DataLakeAnalyticsU-SQL" activity to schedule a U-SQL script. Please see the code below:

InputDataset:
{
"name": "InputDataLakeTable",
"properties": {
    "published": false,
    "type": "AzureDataLakeStore",
    "linkedServiceName": "LinkedServiceSource",
    "typeProperties": {
        "fileName": "SearchLog.txt",
        "folderPath": "demo/",
        "format": {
            "type": "TextFormat",
            "rowDelimiter": "\n",
            "columnDelimiter": "|",
            "quoteChar": "\""
        }
    },
    "availability": {
        "frequency": "Hour",
        "interval": 1
    }
}

}

OutputDataset:
{
"name": "OutputDataLakeTable",
"properties": {
    "published": false,
    "type": "AzureDataLakeStore",
    "linkedServiceName": "LinkedServiceDestination",
    "typeProperties": {
        "folderPath": "scripts/"
    },
    "availability": {
        "frequency": "Hour",
        "interval": 1
    }
}

}

Pipeline:
{
"name": "ComputeEventsByRegionPipeline",
"properties": {
    "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.",
    "activities": [
        {
            "type": "DataLakeAnalyticsU-SQL",
            "typeProperties": {
                "scriptPath": "scripts\\SearchLogProcessing.txt",
                "degreeOfParallelism": 3,
                "priority": 100,
                "parameters": {
                    "in": "/demo/SearchLog.txt",
                    "out": "/scripts/Result.txt"
                }
            },
            "inputs": [
                {
                    "name": "InputDataLakeTable"
                }
            ],
            "outputs": [
                {
                    "name": "OutputDataLakeTable"
                }
            ],
            "policy": {
                "timeout": "06:00:00",
                "concurrency": 1,
                "executionPriorityOrder": "NewestFirst",
                "retry": 1
            },
            "scheduler": {
                "frequency": "Hour",
                "interval": 1
            },
            "name": "CopybyU-SQL",
            "linkedServiceName": "AzureDataLakeAnalyticsLinkedService"
        }
    ],
    "start": "2016-12-21T17:44:13.557Z",
    "end": "2016-12-22T17:44:13.557Z",
    "isPaused": false,
    "hubName": "denojaidbfactory_hub",
    "pipelineMode": "Scheduled"
}

}

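I have successfully created all the required linked services. However, after deploying the pipeline, no time slices are created for the input dataset. See the image below: [Image: No time slice created for input dataset]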

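The output dataset, meanwhile, expects a time slice from the upstream input dataset. As a result, the output dataset's time slices stay in the pending execution state and my Azure Data Factory pipeline does not run. See the image below: [Image: Output dataset is expecting a time slice from input dataset and remains in pending state]

Any suggestions for solving this problem?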

1 Answer:

Answer 0 (score: 2)

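If no other activity in your data factory produces InputDataLakeTable, you need to add the following property to its definition, which marks the dataset as produced outside the data factory: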

"external": true

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-faq

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-create-datasets
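
To illustrate the answer, here is a sketch of the input dataset with the property added. It assumes the same dataset schema the question uses, where `external` sits directly under `properties`, alongside `type` and `availability`. Everything else is copied unchanged from the question's InputDataLakeTable; only the `"external": true` line is new.

```json
{
    "name": "InputDataLakeTable",
    "properties": {
        "published": false,
        "type": "AzureDataLakeStore",
        "linkedServiceName": "LinkedServiceSource",
        "typeProperties": {
            "fileName": "SearchLog.txt",
            "folderPath": "demo/",
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": "|",
                "quoteChar": "\""
            }
        },
        "external": true,
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}
```

After redeploying the dataset, Data Factory should stop waiting for an upstream activity to produce the input slices, so the output dataset's pending slices can be picked up by the U-SQL activity.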