数据工厂-复制操作仅将30条记录从Cosmo dB查询输出复制到Azure SQL数据仓库

时间:2019-01-17 17:24:05

标签: azure-cosmosdb azure-data-factory sql-data-warehouse

Azure数据工厂的复制数据操作利用 Cosmo dB SQL API作为带有查询的源,用于将数据复制到目标Azure SQL DWH在被触发时仅复制30个文档。

cosmo db中有20,000多个文档。我在SQL DWH的Sink中利用带有临时区域的poly base。

我需要做些什么设置才能将所有数据从Cosmo dB SQL API复制到Azure SQL DWH?

设置:

  • 仅具有复制操作的Azure Data Factory管道
  • 使用SQL API和SQL查询提取数据的cosmo db的源数据集;
  • SQL DWH的目标数据集
  • Cosmo dB收集-分区收集
  • SQL DWH表

我有一个工作正常的ADF管道,该管道仅具有复制操作,该复制操作使用对源数据的查询将数据从Cosmo DB复制到SQL DWH表。从源到目标的映射没有错误。 Monitor在Azure数据工厂中没有错误。

但是无论我提供什么设置,它只会将30个文档复制到SQL DWH表中。

{
"name": "cmctodwh",
"properties": {
    "activities": [
        {
            "name": "Copy_ie8",
            "type": "Copy",
            "policy": {
                "timeout": "7.00:00:00",
                "retry": 0,
                "retryIntervalInSeconds": 30,
                "secureOutput": false,
                "secureInput": false
            },
            "userProperties": [
                {
                    "name": "Source",
                    "value": "cmc"
                },
                {
                    "name": "Destination",
                    "value": "[dbo].[tb_cmc]"
                }
            ],
            "typeProperties": {
                "source": {
                    "type": "DocumentDbCollectionSource",
                    "query": "select \r\n    c.id, \r\n    c.owner.cid, \r\n    c.owner.role, \r\n    c.owner.name,\r\n    mc.createdDateTime,\r\n    mc.modifiedDateTime,\r\n    mc.cmcId from root c join mc in c.ownerAccess ",
                    "nestingSeparator": ""
                },
                "sink": {
                    "type": "SqlDWSink",
                    "allowPolyBase": true,
                    "writeBatchSize": 10000,
                    "polyBaseSettings": {
                        "rejectValue": 0,
                        "rejectType": "value",
                        "useTypeDefault": true
                    }
                },
                "enableStaging": true,
                "stagingSettings": {
                    "linkedServiceName": {
                        "referenceName": "StagingForPolyBase",
                        "type": "LinkedServiceReference"
                    },
                    "enableCompression": true
                },
                "translator": {
                    "type": "TabularTranslator",
                    "columnMappings": {
                        "id": "id",
                        "cid": "cid",
                        "role": "role",
                        "name": "name",
                        "createdDateTime": "createdDateTime",
                        "modifiedDateTime": "modifiedDateTime",
                        "cmcId": "cmcId"
                    }
                }
            },
            "inputs": [
                {
                    "referenceName": "SourceDataset_ie8",
                    "type": "DatasetReference"
                }
            ],
            "outputs": [
                {
                    "referenceName": "DestinationDataset_ie8",
                    "type": "DatasetReference"
                }
            ]
        }
    ]
},
"type": "Microsoft.DataFactory/factories/pipelines"
}

我希望复制活动将所有文档从cosmo db复制到SQL DWH,而不是30个。

我怀疑由于cosmo db上的分页,它只能复制30个吗?

0 个答案:

没有答案