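I have a pipeline with two tasks. Task 2 depends on task 1, and maxActiveInstances is set to 1 for both tasks. Despite this dependency, there are cases where task 2 runs at the same time as task 1. For example, if task 2 takes too long and is still running when the pipeline's next scheduled start time arrives, task 1 starts running concurrently. The same thing happens during backfills.
Because the two tasks interfere with each other, I don't want them to run at the same time under any circumstances. Ideally, I want only one instance of the pipeline (rather than of an individual task) to run at a time, but I can't figure out how to do that.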
Here is what the pipeline looks like, with the uninteresting parts replaced by ...:
{
"objects": [
{
"period": "15 Minutes",
"name": "Every 15 minutes",
"id": "DefaultSchedule",
"type": "Schedule",
"startAt": "FIRST_ACTIVATION_DATE_TIME"
},
{
"failureAndRerunMode": "CASCADE",
"resourceRole": "...",
"role": "...",
"pipelineLogUri": "...",
"scheduleType": "cron",
"schedule": {
"ref": "DefaultSchedule"
},
"maxActiveInstances": "1",
"name": "Default",
"id": "Default"
},
{
"name": "CopyTablesActivity",
"id": "CopyTablesActivity",
"workerGroup": "dp01",
"type": "ShellCommandActivity",
"command": "..."
},
{
"name": "CreateReportsActivity",
"id": "CreateReportsActivity",
"workerGroup": "dp01",
"type": "ShellCommandActivity",
"command": "...",
"dependsOn": {
"ref": "CopyTablesActivity"
}
}
],
"parameters": [...]
}
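Answer 0 (score: 0)
On CopyTablesActivity, you can set the lateAfterTimeout attribute to 5 minutes or so, then add an attribute named onLateAction and point it at a Terminate action. The idea is that if CopyTablesActivity has not finished after 5 minutes, the pipeline is terminated. For example, the CopyTablesActivity object could look like this: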
{
"name": "CopyTablesActivity",
"id": "CopyTablesActivity",
"workerGroup": "dp01",
"lateAfterTimeout" : "5 minutes",
"type": "ShellCommandActivity",
"onLateAction" : {
"ref" : "DefaultAction1"
},
"command": "..."
}
Then you can define DefaultAction1 like this:
{
"name" : "TerminateTasks",
"id" : "DefaultAction1",
"type" : "Terminate"
}
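If the pipeline definition is kept as a JSON file, the two changes above can also be applied programmatically. The following is only a minimal sketch: the helper name add_late_termination and its defaults are made up for illustration, and it only edits the definition as a plain dictionary. It does not call any AWS API and does not check that the service accepts the result.

```python
import copy
import json


def add_late_termination(definition, activity_id, timeout="5 minutes",
                         action_id="DefaultAction1"):
    """Return a copy of `definition` whose activity references a Terminate action.

    Hypothetical helper: it only rewrites the dictionary as described in the
    answer above; the original definition is left untouched.
    """
    result = copy.deepcopy(definition)
    objects = result["objects"]

    # Find the activity by id; raises StopIteration if it is not present.
    activity = next(obj for obj in objects if obj.get("id") == activity_id)
    activity["lateAfterTimeout"] = timeout
    activity["onLateAction"] = {"ref": action_id}

    # Add the Terminate object only once.
    if not any(obj.get("id") == action_id for obj in objects):
        objects.append(
            {"name": "TerminateTasks", "id": action_id, "type": "Terminate"}
        )
    return result


# Trimmed-down definition containing only the activity from the question.
pipeline = {
    "objects": [
        {
            "name": "CopyTablesActivity",
            "id": "CopyTablesActivity",
            "workerGroup": "dp01",
            "type": "ShellCommandActivity",
            "command": "...",
        }
    ]
}

patched = add_late_termination(pipeline, "CopyTablesActivity")
print(json.dumps(patched, indent=2))
```

Serializing through json.dumps also guarantees that the output is syntactically valid JSON, which avoids slips such as a missing comma between attributes.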
For more information, see this link: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-terminate.html