Executing a stored procedure from Azure Data Factory

Time: 2017-12-20 10:07:15

Tags: azure-sql-database azure-data-factory

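I am trying to execute a stored procedure in an Azure SQL DB from Azure DataFactory V2. The procedure takes data from a flat table and performs inserts into several other tables. According to the documentation, you need a table-valued parameter to do this, but that couples the pipeline activity to the procedure and to all the models. Is there a way to define the dataset and the copy activity so that it only executes the stored procedure?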

The following JSON snippets are from the ARM template:

Dataset:

    {
      "type": "datasets",
      "name": "AzureSQLProcedureDS",
      "dependsOn": [
        "[parameters('dataFactoryName')]",
        "[parameters('destinationLinkedServiceName')]"
      ],
      "apiVersion": "[variables('apiVersion')]",
      "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
          "referenceName": "[parameters('destinationLinkedServiceName')]",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "tableName": "storeProcedureExecutions"
        }
      }
    }

Activity:

    {
      "name": "ExecuteHarmonizationProcedure",
      "description": "Executes the procedure that Harmonizes the Data",
      "type": "Copy",
      "inputs": [
        {
          "referenceName": "[parameters('destinationDataSetName')]",
          "type": "DatasetReference"
        }
      ],
      "outputs": [
        {
          "referenceName": "AzureSQLProcedureDS",
          "type": "DatasetReference"
        }
      ],
      "typeProperties": {
        "source": {
          "type": "SqlSink"
        },
        "sink": {
          "type": "SqlSink",
          //"SqlWriterTableType": "storeProcedureExecutionsType",
          "SqlWriterStoredProcedureName": "@Pipeline().parameters.procedureName",
          "storedProcedureParameters": {
            "param1": {
              "value": "call from adf"
            }
          }
        }
      }
    }

Any help would be appreciated, since Microsoft does not offer much guidance on this topic.

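2 answers:

Answer 0 (score: 2)

I'm not sure I understood the question correctly: do you just want to call a stored procedure from a copy activity?

That is quite simple. In the copy activity you can define the sqlReaderQuery property in the source. This property lets you enter a T-SQL command, so you can do something like this: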

    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "EXEC sp_Name; select 1 as test"
        },
    . . .

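The copy activity always expects a result set from the query, so a bare call to the stored procedure is not enough; that is why the query includes the second part (select 1 as test).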

Replace the parameters with the ones you want to use, and that's it.
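As a rough illustration of substituting a parameter, the fragment below passes a literal argument to the procedure inside sqlReaderQuery. It is only a sketch: the parameter name @param1 is an assumption borrowed from the question, sp_Name is the placeholder used above, and the empty SqlSink is assumed for completeness. The `//` comment follows the question's ARM-template snippet; remove it if your tooling insists on strict JSON.

```json
"typeProperties": {
    "source": {
        "type": "SqlSource",
        // Assumption: sp_Name accepts one string parameter called @param1.
        // The trailing SELECT only exists to hand the copy activity a result set.
        "sqlReaderQuery": "EXEC sp_Name @param1 = 'call from adf'; SELECT 1 AS test"
    },
    "sink": {
        "type": "SqlSink"
    }
}
```

Whatever the trailing SELECT returns is what the copy activity writes to the sink dataset, so it is worth keeping that result set as small as possible.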

Answer 1 (score: 1)

Following @Martin's suggestion, we managed to get the execution working. Here is what we did:

  1. Create a dummy table in SQL:

    CREATE TABLE [dbo].[dummyTable] (
        [col1] [nvarchar](100) NULL
    )

  2. Create the stored procedure in SQL:

    CREATE PROCEDURE [dbo].[sp_testHarmonize]
        @param1 NVARCHAR(200)
    AS
    BEGIN
        INSERT INTO storeProcedureExecutions VALUES (@param1, getdate());
    END

  3. Dataset for the stored procedure:

    {
      "type": "datasets",
      "name": "[parameters('dummySQLTableDataSet')]",
      "dependsOn": [
        "[parameters('dataFactoryName')]",
        "[parameters('datalakeLinkedServiceName')]"
      ],
      "apiVersion": "[variables('apiVersion')]",
      "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
          "referenceName": "[parameters('databaseLinkedServiceName')]",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "tableName": "dummyTable"
        }
      }
    }

  4. Pipeline activity:

    {
      "name": "ExecuteHarmonizationProcedure",
      "dependsOn": [
        {
          "activity": "CopyCSV2SQL",
          "dependencyConditions": ["Succeeded"]
        }
      ],
      "description": "Executes the procedure that Harmonizes the Data",
      "type": "Copy",
      "inputs": [
        {
          "referenceName": "[parameters('dummySQLTableDataSet')]",
          "type": "DatasetReference"
        }
      ],
      "outputs": [
        {
          "referenceName": "[parameters('dummySQLTableDataSet')]",
          "type": "DatasetReference"
        }
      ],
      "typeProperties": {
        "source": {
          "type": "SqlSource",
          "sqlReaderQuery": "@Pipeline().parameters.SQLCommand"
        },
        "sink": {
          "type": "SqlSink"
        }
      }
    }

  5. Run the pipeline with the following SQL command as the parameter:

    $"EXEC sp_testHarmonize 'call from ADF at {DateTime.Now}'; select top 1 * from dummyTable;"

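  6. This makes it work, but since it inserts a row into the dummy table, it looks more like a workaround than a direct solution. If there is no more direct solution, this is the simplest way.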
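The activity in step 4 reads its query from `@Pipeline().parameters.SQLCommand`, so the enclosing pipeline needs to declare a parameter with that name. The answer does not show that part; the following is a minimal sketch of what the declaration might look like in the same ARM template. The pipeline name HarmonizationPipeline is made up for illustration, and the activities array is left for the step 4 activity.

```json
{
  "type": "pipelines",
  // Hypothetical pipeline name, not taken from the answer.
  "name": "HarmonizationPipeline",
  "dependsOn": [
    "[parameters('dataFactoryName')]",
    "[parameters('dummySQLTableDataSet')]"
  ],
  "apiVersion": "[variables('apiVersion')]",
  "properties": {
    "parameters": {
      // Must match the name used in @Pipeline().parameters.SQLCommand.
      "SQLCommand": {
        "type": "String"
      }
    },
    "activities": [
      // The ExecuteHarmonizationProcedure activity from step 4 goes here.
    ]
  }
}
```

The value supplied for SQLCommand at run time is then the full T-SQL string from step 5, including the trailing select that gives the copy activity something to read.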