The Copy Data activity in Azure Data Factory, using the Cosmos DB SQL API as the source with a query, copies only 30 documents to the target Azure SQL Data Warehouse (SQL DWH) when triggered.
There are more than 20,000 documents in Cosmos DB. In the SQL DWH sink I am using PolyBase with a staging area.
What settings do I need to change so that all of the data is copied from the Cosmos DB SQL API to Azure SQL DWH?
Setup:
I have a working ADF pipeline that contains only a Copy activity, which copies data from Cosmos DB to a SQL DWH table using a query against the source data. The source-to-sink mapping has no errors, and Monitor in Azure Data Factory reports no errors either.
However, no matter what settings I provide, it copies only 30 documents into the SQL DWH table.
{
    "name": "cmctodwh",
    "properties": {
        "activities": [
            {
                "name": "Copy_ie8",
                "type": "Copy",
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [
                    {
                        "name": "Source",
                        "value": "cmc"
                    },
                    {
                        "name": "Destination",
                        "value": "[dbo].[tb_cmc]"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "DocumentDbCollectionSource",
                        "query": "select \r\n c.id, \r\n c.owner.cid, \r\n c.owner.role, \r\n c.owner.name,\r\n mc.createdDateTime,\r\n mc.modifiedDateTime,\r\n mc.cmcId from root c join mc in c.ownerAccess ",
                        "nestingSeparator": ""
                    },
                    "sink": {
                        "type": "SqlDWSink",
                        "allowPolyBase": true,
                        "writeBatchSize": 10000,
                        "polyBaseSettings": {
                            "rejectValue": 0,
                            "rejectType": "value",
                            "useTypeDefault": true
                        }
                    },
                    "enableStaging": true,
                    "stagingSettings": {
                        "linkedServiceName": {
                            "referenceName": "StagingForPolyBase",
                            "type": "LinkedServiceReference"
                        },
                        "enableCompression": true
                    },
                    "translator": {
                        "type": "TabularTranslator",
                        "columnMappings": {
                            "id": "id",
                            "cid": "cid",
                            "role": "role",
                            "name": "name",
                            "createdDateTime": "createdDateTime",
                            "modifiedDateTime": "modifiedDateTime",
                            "cmcId": "cmcId"
                        }
                    }
                },
                "inputs": [
                    {
                        "referenceName": "SourceDataset_ie8",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "DestinationDataset_ie8",
                        "type": "DatasetReference"
                    }
                ]
            }
        ]
    },
    "type": "Microsoft.DataFactory/factories/pipelines"
}
I expect the Copy activity to copy all documents from Cosmos DB to SQL DWH, not just 30.
I suspect it is only copying 30 because of paging on the Cosmos DB side. Could that be the cause?
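To rule out the source query itself, one check I can think of is to run the same JOIN query against the collection outside of ADF and count the results; the Python SDK for Cosmos DB follows continuation tokens (paging) automatically, so the count it returns reflects the full result set. The sketch below is a minimal, hypothetical example only: the endpoint, key, database, and container names are placeholders, not values from my actual setup.

# Minimal sketch: count documents returned by the same JOIN query, outside of ADF.
# Endpoint, key, database, and container names below are placeholders.
from azure.cosmos import CosmosClient

ENDPOINT = "https://<your-account>.documents.azure.com:443/"  # placeholder
KEY = "<your-account-key>"                                    # placeholder

client = CosmosClient(ENDPOINT, credential=KEY)
container = (
    client.get_database_client("<database>")      # placeholder database name
          .get_container_client("<container>")    # placeholder container name
)

query = (
    "SELECT c.id, c.owner.cid, c.owner.role, c.owner.name, "
    "mc.createdDateTime, mc.modifiedDateTime, mc.cmcId "
    "FROM root c JOIN mc IN c.ownerAccess"
)

# The SDK handles continuation tokens transparently, so this iterates over
# every matching document rather than only the first page of results.
count = sum(
    1 for _ in container.query_items(query=query, enable_cross_partition_query=True)
)
print(f"Documents returned by the source query: {count}")

If this count comes back as 20,000+, the query is fine and the limitation is somewhere in the copy activity configuration rather than in Cosmos DB paging.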