https://docs.microsoft.com/en-us/azure/data-factory/data-factory-load-sql-data-warehouse. According to that link, with 1000 DWU and PolyBase I should be getting around 200 MBps of throughput, but I am getting 4.66 MBps. I have added the user to the xlargerc resource class to get the best throughput out of Azure SQL Data Warehouse.
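For reference, this is roughly what the resource-class assignment looks like on the SQL Data Warehouse side (a minimal sketch; the user name loaduser is hypothetical and should be whatever login the linked service connects with):

-- Resource classes in Azure SQL Data Warehouse are database roles,
-- so membership is granted with sp_addrolemember (run inside the DW database).
-- "loaduser" is a placeholder for the actual loading user.
EXEC sp_addrolemember 'xlargerc', 'loaduser';

-- Optional: verify which resource classes users are currently in.
SELECT r.name AS resource_class, m.name AS member
FROM sys.database_role_members rm
JOIN sys.database_principals r ON r.principal_id = rm.role_principal_id
JOIN sys.database_principals m ON m.principal_id = rm.member_principal_id
WHERE r.name LIKE '%rc';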
Here is the pipeline JSON:
{
  "name": "UCBPipeline-Copy",
  "properties": {
    "description": "pipeline with copy activity",
    "activities": [
      {
        "type": "Copy",
        "typeProperties": {
          "source": {
            "type": "BlobSource"
          },
          "sink": {
            "type": "SqlDWSink",
            "allowPolyBase": true,
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
          },
          "cloudDataMovementUnits": 4
        },
        "inputs": [
          {
            "name": "AzureBlob_Concept"
          }
        ],
        "outputs": [
          {
            "name": "AzureDW_Concept"
          }
        ],
        "policy": {
          "timeout": "01:00:00",
          "concurrency": 1
        },
        "scheduler": {
          "frequency": "Day",
          "interval": 1
        },
        "name": "AzureBlobtoSQLDW_Concept",
        "description": "Copy Activity"
      }
    ],
    "start": "2017-02-28T18:00:00Z",
    "end": "2017-03-01T19:00:00Z",
    "isPaused": false,
    "hubName": "sampledf1_hub",
    "pipelineMode": "Scheduled"
  }
}
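One way to confirm PolyBase is actually being used (a hedged check, assuming you can query the warehouse while the copy runs): when allowPolyBase takes effect, the load shows up in the DW as a statement against an external table rather than row-by-row inserts. Note also that, as far as I know, cloudDataMovementUnits mainly affects non-PolyBase copies, so it is unlikely to be the lever here.

-- Look for the ADF-issued load among recent requests; with PolyBase
-- the command text references an external table.
SELECT request_id, status, submit_time, command
FROM sys.dm_pdw_exec_requests
ORDER BY submit_time DESC;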
Input dataset:
{
  "name": "AzureBlob_Concept",
  "properties": {
    "published": false,
    "type": "AzureBlob",
    "linkedServiceName": "AzureZRSStorageLinkedService",
    "typeProperties": {
      "fileName": "conceptTab.txt",
      "folderPath": "source/",
      "format": {
        "type": "TextFormat",
        "columnDelimiter": "\t"
      }
    },
    "availability": {
      "frequency": "Day",
      "interval": 1
    },
    "external": true,
    "policy": {}
  }
}
Output dataset:
{
  "name": "AzureDW_Concept",
  "properties": {
    "published": false,
    "type": "AzureSqlDWTable",
    "linkedServiceName": "AzureSqlDWLinkedService",
    "typeProperties": {
      "tableName": "concept"
    },
    "availability": {
      "frequency": "Day",
      "interval": 1
    }
  }
}
Is anything missing from this configuration?
Answer 0 (score: 2)
I took a look at runId "e98ac557-a507-4a6e-8833-978eff1723c3", which should belong to your copy activity. From our service logs, the source file is not large enough (270 MB in your case), so service-call latency dominates and keeps the observed throughput low. You can try loading bigger files to get better throughput.
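To put rough numbers on that (back-of-envelope, using only the figures above): at the observed 4.66 MBps, copying 270 MB takes about 270 / 4.66 ≈ 58 seconds end to end. At the advertised 200 MBps, the data transfer itself would take only about 1.4 seconds, so nearly all of those 58 seconds is fixed per-run overhead (service calls, PolyBase setup) rather than data movement. With a much larger file, say tens of GB, the same fixed overhead would be amortized over far more data and the measured throughput would approach the advertised rate.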