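In my DAG, the EMRStepSensor fails while watching the step. This is the DAG:
dag >> initializer >> create_cluster >> emr_step >> watch_step >> cluster_remover
The DAG initializes, the EMR cluster is created, and the EMR step is added successfully. The EMRStepSensor then fails with this error: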
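botocore.exceptions.ClientError: An error occurred (500) when calling the DescribeStep operation (reached max retries: 4): Internal Server Error
I checked the Python code.
emr_step_sensor.py calls client.py, which invokes the DescribeStep operation, and that call fails.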
initializer = InitializerOperator(
    task_id='initializer',
    dag=dag
)
job_flow = EmrJobFlowOperator(
    task_id='create_cluster',
    aws_conn_id='worflow_aws',
    config_file='dev_loader.cfg',
    initializer_task_id='initializer',
    core_instance_count=20,
    dag=dag
)
step_add = EmrAddStepOperator(
    task_id='emr_step',
    aws_conn_id='worflow_aws',
    config_file='dev_loader_step.cfg',
    job_flow_id="{{ task_instance.xcom_pull('create_cluster', key='return_value') }}",
    step_id='cassandraloader',
    initializer_task_id='initializer',
    dag=dag
)
step_checker = EmrStepCheckSensor(
    task_id='watch_step',
    aws_conn_id='worflow_aws',
    poke_interval=300,
    job_flow_id="{{ task_instance.xcom_pull('create_cluster', key='return_value') }}",
    step_id="{{ task_instance.xcom_pull('emr_step', key='return_value')[0] }}",
    dag=dag
)
cluster_remover = EmrTerminateJobFlowOperator(
    task_id='remove_cluster',
    aws_conn_id='worflow_aws',
    job_flow_id="{{ task_instance.xcom_pull('create_cluster', key='return_value') }}",
    dag=dag
)
I expect the EMRStepSensor to keep monitoring the status of the step.
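
To make the expected behavior concrete, here is a rough sketch of the polling logic I have in mind. It is not the actual sensor code. `poke_step`, `is_transient_server_error`, `FakeEmrClient` and `FakeClientError` are hypothetical names made up for this illustration. The sketch assumes the client exposes a boto3-style `describe_step(ClusterId=..., StepId=...)` call and that the raised error carries a botocore-style `response` dict with `ResponseMetadata.HTTPStatusCode`. A 5xx answer is treated as "no result yet", so the next poke tries again.

```python
# Sketch only: all names below are hypothetical, not part of Airflow or boto3.

DONE_STATES = {"COMPLETED"}
FAILED_STATES = {"CANCELLED", "FAILED", "INTERRUPTED"}


def is_transient_server_error(exc):
    """Return True if exc carries a botocore-style response with a 5xx status."""
    response = getattr(exc, "response", None) or {}
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    return isinstance(status, int) and 500 <= status < 600


def poke_step(client, job_flow_id, step_id):
    """Return True once the step has completed, False to keep polling."""
    try:
        response = client.describe_step(ClusterId=job_flow_id, StepId=step_id)
    except Exception as exc:
        if is_transient_server_error(exc):
            # Assumption: a 5xx from DescribeStep is temporary, so poll again later.
            return False
        raise
    state = response["Step"]["Status"]["State"]
    if state in FAILED_STATES:
        raise RuntimeError("EMR step %s ended in state %s" % (step_id, state))
    return state in DONE_STATES


class FakeClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError, for the demo only."""

    def __init__(self, status):
        super().__init__("HTTP %d" % status)
        self.response = {"ResponseMetadata": {"HTTPStatusCode": status}}


class FakeEmrClient:
    """Replays a scripted list of step states or exceptions."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)

    def describe_step(self, ClusterId, StepId):
        outcome = self._outcomes.pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return {"Step": {"Status": {"State": outcome}}}


# One failed call, one RUNNING answer, then COMPLETED.
client = FakeEmrClient([FakeClientError(500), "RUNNING", "COMPLETED"])
results = [poke_step(client, "j-EXAMPLE", "s-EXAMPLE") for _ in range(3)]
print(results)  # [False, False, True]
```

With this logic a single Internal Server Error would only delay the sensor by one `poke_interval` instead of failing the task. Errors outside the 5xx range and failed step states still raise.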