I am looking at the link below.
It says we should be able to use wildcards in the folder path and in the file name. If you click on the activity and then on "Source", you see this view.
I want to loop through every day over several months, so the view should look something like this.
Of course, this does not actually work. I get the following error: ErrorCode: "PathNotFound". Message: "The specified path does not exist.". Given a specific string pattern in the file path and file name, how can I get the tool to recursively go through all the files in all the folders? Thanks.
Answer 0: (score: 1)
I want to loop through every day over several months
Let's take it one step at a time:
1. Create two pipeline parameters, month and day (both of type string).
Note: if needed, these parameters can also be passed in from the output of another activity, as sketched just below. Reference: Parameters in ADF
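As a sketch of that option (not part of the original answer; the parent pipeline, the Lookup activity name LookupDates, and its output fields are assumptions), an Execute Pipeline activity in a parent pipeline could feed month and day from a Lookup's output:
{
    "name": "RunPipeline2",
    "type": "ExecutePipeline",
    "dependsOn": [
        {
            "activity": "LookupDates",
            "dependencyConditions": [ "Succeeded" ]
        }
    ],
    "typeProperties": {
        "pipeline": {
            "referenceName": "pipeline2",
            "type": "PipelineReference"
        },
        "waitOnCompletion": true,
        "parameters": {
            "month": "@activity('LookupDates').output.firstRow.month",
            "day": "@activity('LookupDates').output.firstRow.day"
        }
    }
}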
2. Create two datasets.
2.1 Sink dataset - Blob storage here. Link it to your linked service and provide the container name (make sure the container exists). Again, this can be passed in as a parameter if needed.
2.2 Source dataset - again Blob storage here, or whatever fits your case. Link it to your linked service and provide the container name (make sure the container exists). Again, this can be passed in as a parameter if needed.
Note:
1. The folder path decides the path the data is copied to. If it does not exist, the activity will create it for you, and if the file already exists it will be overwritten by default.
2. To build the output path dynamically, pass parameters to the dataset. Here I created two parameters on the sink dataset, named monthcopy and datacopy.
3. Create the Copy activity in the pipeline.
Wildcard folder path:
@{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}
where:
The path resolves to: <current year yyyy>/<month passed>/<day passed>/* (the trailing * matches any folder one level down).
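For example (illustrative values, not from the original answer), running the pipeline on some day in 2021 with month = 06 and day = 15, the wildcard folder path resolves to:
2021/06/15/*
Together with the wildcard file name *.csv and recursive set to true, the Copy activity then reads every CSV inside any folder directly under 2021/06/15/. The full pipeline JSON of the Copy activity follows.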
{
    "name": "pipeline2",
    "properties": {
        "activities": [
            {
                "name": "Copy Data1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "DelimitedTextSource",
                        "storeSettings": {
                            "type": "AzureBlobStorageReadSettings",
                            "recursive": true,
                            "wildcardFolderPath": {
                                "value": "@{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}",
                                "type": "Expression"
                            },
                            "wildcardFileName": "*.csv",
                            "enablePartitionDiscovery": false
                        },
                        "formatSettings": {
                            "type": "DelimitedTextReadSettings"
                        }
                    },
                    "sink": {
                        "type": "DelimitedTextSink",
                        "storeSettings": {
                            "type": "AzureBlobStorageWriteSettings"
                        },
                        "formatSettings": {
                            "type": "DelimitedTextWriteSettings",
                            "quoteAllText": true,
                            "fileExtension": ".csv"
                        }
                    },
                    "enableStaging": false
                },
                "inputs": [
                    {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "DelimitedText2",
                        "type": "DatasetReference",
                        "parameters": {
                            "monthcopy": {
                                "value": "@pipeline().parameters.month",
                                "type": "Expression"
                            },
                            "datacopy": {
                                "value": "@pipeline().parameters.day",
                                "type": "Expression"
                            }
                        }
                    }
                ]
            }
        ],
        "parameters": {
            "month": {
                "type": "string"
            },
            "day": {
                "type": "string"
            }
        },
        "annotations": []
    }
}
Source dataset JSON (DelimitedText1):
{
    "name": "DelimitedText1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference"
        },
        "annotations": [],
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "corpdata"
            },
            "columnDelimiter": ",",
            "escapeChar": "\\",
            "quoteChar": "\""
        },
        "schema": []
    }
}
Sink dataset JSON (DelimitedText2):
{
    "name": "DelimitedText2",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "monthcopy": {
                "type": "string"
            },
            "datacopy": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "folderPath": {
                    "value": "@concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',dataset().monthcopy,'/',dataset().datacopy)",
                    "type": "Expression"
                },
                "container": "copycorpdata"
            },
            "columnDelimiter": ",",
            "escapeChar": "\\",
            "quoteChar": "\""
        },
        "schema": []
    }
}
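As a follow-up usage sketch (not part of the original answer; the trigger name, start time, and recurrence are assumptions), the month and day parameters could also be supplied by a daily schedule trigger instead of being typed in by hand:
{
    "name": "DailyCopyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2021-01-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "pipeline2",
                    "type": "PipelineReference"
                },
                "parameters": {
                    "month": "@{formatDateTime(adddays(trigger().scheduledTime,-1),'MM')}",
                    "day": "@{formatDateTime(adddays(trigger().scheduledTime,-1),'dd')}"
                }
            }
        ]
    }
}
Here adddays(..., -1) mirrors the adddays(utcnow(),-1) used in the wildcard folder path, so each daily run copies the previous day's folder.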