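I created a state machine to run several Glue / ETL jobs in parallel. I'm experimenting with the Map state to take advantage of dynamic parallelism. Here is the step function definition: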
{
  "StartAt": "Map",
  "States": {
    "Map": {
      "Type": "Map",
      "InputPath": "$.data",
      "ItemsPath": "$.array",
      "MaxConcurrency": 2,
      "Iterator": {
        "StartAt": "glue Job",
        "States": {
          "glue Job": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "End": true,
            "Parameters": {
              "JobName": "glue-etl-job",
              "Arguments": {
                "--db": "db-dev",
                "--file": "$.file",
                "--bucket": "$.bucket"
              }
            }
          }
        }
      },
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "Next": "NotifyError"
        }
      ],
      "Next": "NotifySuccess"
    },
  }
}
The input passed to the step function has the following format:
{
  "data": {
    "array": [
      {"file": "path-to-file1", "bucket": "bucket-name1"},
      {"file": "path-to-file2", "bucket": "bucket-name2"}
    ]
  }
}
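The problem is that the file and bucket job arguments are not resolved: they are passed to the Glue job as the literal strings $.file and $.bucket. How can I pass the actual values from the input as arguments?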
Answer 0 (score: 1)
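When a parameter's value should be taken from the state input, you need to append ".$" to the end of the parameter name. Without that suffix, the value is treated as a literal string.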
"--file.$": "$.file",
"--bucket.$": "$.bucket"
For the complete guide, see the spec: https://states-language.net/spec.html#parameters
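Applied to the task state from the question, the change might look like the sketch below. It assumes the rest of the state machine stays as posted and that each Map iteration receives one element of $.data.array as its input; only the two argument keys differ from the original.

```json
{
  "glue Job": {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun.sync",
    "Parameters": {
      "JobName": "glue-etl-job",
      "Arguments": {
        "--db": "db-dev",
        "--file.$": "$.file",
        "--bucket.$": "$.bucket"
      }
    },
    "End": true
  }
}
```

With the sample input, the first iteration receives {"file": "path-to-file1", "bucket": "bucket-name1"}, so the Glue job should get --file set to path-to-file1 and --bucket set to bucket-name1. The "--db" key has no ".$" suffix, so it stays the literal string db-dev.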