How to pass variables to EMR addStep in AWS Step Functions

Time: 2019-12-19 21:42:45

Tags: amazon-emr aws-step-functions

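AWS Step Functions recently added EMR integration, which is cool, but I can't find a way to pass a variable from the step function into the addStep args. For example, I want to pass the "$.dayid" variable into "Parameters" > "Step" > "HadoopJarStep" > "Args", similar to "ClusterId.$": "$.ClusterId" (this cluster ID variable works).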

{
    "Step_One": {
        "Type": "Task",
        "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
        "Parameters": {
            "ClusterId.$": "$.ClusterId",
            "Step": {
                "Name": "The first step",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "hive-script",
                        "--run-hive-script",
                        "--args",
                        "-f",
                        "s3://<region>.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
                        "-d",
                        "INPUT=s3://<region>.elasticmapreduce.samples",
                        "-d",
                        "OUTPUT=s3://<mybucket>/MyHiveQueryResults/$.dayid"
                    ]
                }
            }
        },
        "End": true
    }
}

1 answer:

Answer 0 (score: 2)

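Parameters lets you define key-value pairs. Because the value of the "Args" key is an array, you cannot dynamically reference a specific element inside that array; you need to reference the whole array instead, for example "Args.$": "$.Input.ArgsArray". That said, you also cannot reference a substitution value inside a string, as you are trying to do in "OUTPUT=s3://<mybucket>/MyHiveQueryResults/$.dayid".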

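So for your use case, the best way to achieve this is to add a pre-processing state before this state is invoked. In the pre-processing state, I suggest calling a Lambda function that builds the string "OUTPUT=s3://<mybucket>/MyHiveQueryResults/$.dayid" (with the actual dayid value substituted) along with the complete array to send to Args.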

{
    "StartAt": "Pre-Process",
    "States": {
        "Pre-Process": {
            "Type": "Task",
            "Resource": "<Lambda function to generate the string OUTPUT=s3://<mybucket>/MyHiveQueryResults/$.dayid and output the Args array>",
            "Next": "Step_One"
        },
        "Step_One": {
            "Type": "Task",
            "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
            "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "Step": {
                    "Name": "The first step",
                    "ActionOnFailure": "CONTINUE",
                    "HadoopJarStep": {
                        "Jar": "command-runner.jar",
                        "Args.$": "$.ArgsGeneratedByPreProcessingState"
                    }
                }
            },
            "End": true
        }
    }
}
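
The Pre-Process Lambda could look roughly like the Python sketch below. It is only an illustration under a few assumptions: the state input is assumed to carry "ClusterId" and "dayid" keys, the names build_args, BUCKET and REGION are hypothetical, and the <mybucket> and <region> placeholders are kept from the question and must be replaced with real values. Because a Task state's result replaces the state input by default, the handler also passes ClusterId through so that Step_One can still read "$.ClusterId".

```python
# Placeholders carried over from the question; replace with real values.
BUCKET = "<mybucket>"
REGION = "<region>"


def build_args(dayid, bucket=BUCKET, region=REGION):
    """Build the full Args array for the Hive step, with dayid substituted."""
    return [
        "hive-script",
        "--run-hive-script",
        "--args",
        "-f",
        f"s3://{region}.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
        "-d",
        f"INPUT=s3://{region}.elasticmapreduce.samples",
        "-d",
        f"OUTPUT=s3://{bucket}/MyHiveQueryResults/{dayid}",
    ]


def lambda_handler(event, context):
    """Return the input for Step_One: the cluster ID plus the generated Args."""
    return {
        # Passed through unchanged so "ClusterId.$": "$.ClusterId" still resolves.
        "ClusterId": event["ClusterId"],
        # Key name matches "Args.$": "$.ArgsGeneratedByPreProcessingState".
        "ArgsGeneratedByPreProcessingState": build_args(event["dayid"]),
    }
```

With an input such as {"ClusterId": "j-123", "dayid": "20191219"} (made-up values), the last element of the generated array would be "OUTPUT=s3://<mybucket>/MyHiveQueryResults/20191219".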