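Answer 0 (score: 4)
Boto3 has no built-in function for this, but you can write your own waiter.
See: describe_step
Call describe_step with cluster_id and step_id. The response is a dictionary containing details about the step. The step's state is under the 'State' key, at response['Step']['Status']['State']. If the state is not COMPLETED, wait a few seconds and try again, until it is COMPLETED (or another final state such as FAILED or CANCELLED) or the wait time exceeds your limit.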
'State': 'PENDING'|'CANCEL_PENDING'|'RUNNING'|'COMPLETED'|'CANCELLED'|'FAILED'|'INTERRUPTED'
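The polling loop described above could look roughly like the sketch below. The function name `wait_for_step`, its parameters and defaults, and the `TERMINAL_STATES` set are my own assumptions for illustration; only `describe_step` and the state names come from boto3. The `sleep` parameter is there just so the loop can be exercised without actually waiting.

```python
import time

# States after which an EMR step does not change any further
# (taken from the 'State' values listed above).
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED", "INTERRUPTED"}


def wait_for_step(emr_client, cluster_id, step_id,
                  delay_seconds=30, max_attempts=60, sleep=time.sleep):
    """Poll describe_step until the step reaches a terminal state.

    Returns the final state string. Raises TimeoutError if the step is
    still active after max_attempts polls. (Hypothetical helper, not part
    of boto3.)
    """
    for attempt in range(max_attempts):
        response = emr_client.describe_step(ClusterId=cluster_id, StepId=step_id)
        state = response["Step"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        # Do not sleep after the final attempt; just fall through and time out.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(
        f"Step {step_id} still not finished after {max_attempts} attempts"
    )
```

The caller still has to check the returned state, since FAILED and CANCELLED end the loop just like COMPLETED does.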
Answer 1 (score: 4)
I came up with the following code (if you set max_attempts to 0 or less, it simply waits until there are no running/pending steps):
import json
import time

def wait_for_steps_completion(emr_client, emr_cluster_id, max_attempts=0):
    sleep_seconds = 30
    num_attempts = 0
    while True:
        # Ask only for steps that are still active.
        response = emr_client.list_steps(
            ClusterId=emr_cluster_id,
            StepStates=['PENDING', 'CANCEL_PENDING', 'RUNNING']
        )
        num_attempts += 1
        active_aws_emr_steps = response['Steps']
        if active_aws_emr_steps:
            # max_attempts <= 0 disables the limit.
            if 0 < max_attempts <= num_attempts:
                raise Exception(
                    'Max attempts exceeded while waiting for AWS EMR steps completion. Last response:\n'
                    + json.dumps(response, indent=3, default=str)
                )
            time.sleep(sleep_seconds)
        else:
            return
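Note that the function above returns as soon as nothing is pending or running; it does not say whether the steps actually succeeded. A possible follow-up check, as a sketch: `find_unsuccessful_steps` is a hypothetical helper name of mine, and it assumes the usual `list_steps` response shape (`Steps` entries with `Id`, `Name`, `Status.State`, plus an optional `Marker` for further pages).

```python
def find_unsuccessful_steps(emr_client, emr_cluster_id):
    """Return (id, name, state) tuples for steps that ended without completing.

    Hypothetical helper; follows the 'Marker' field to read every page.
    """
    unsuccessful = []
    kwargs = {
        "ClusterId": emr_cluster_id,
        "StepStates": ["CANCELLED", "FAILED", "INTERRUPTED"],
    }
    while True:
        response = emr_client.list_steps(**kwargs)
        for step in response["Steps"]:
            unsuccessful.append(
                (step["Id"], step["Name"], step["Status"]["State"])
            )
        marker = response.get("Marker")
        if not marker:
            return unsuccessful
        # More results are available; request the next page.
        kwargs["Marker"] = marker
```

Calling it right after the wait returns, and raising if the list is non-empty, would turn a silent failure into an error.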
Answer 2 (score: 4)
There is now a waiter available for the step-complete event. It was added in a recent boto3 release.
http://boto3.readthedocs.io/en/latest/reference/services/emr.html#EMR.Waiter.StepComplete
Sample code:
import boto3

client = boto3.client("emr")
waiter = client.get_waiter("step_complete")
waiter.wait(
    ClusterId='the-cluster-id',
    StepId='the-step-id',
    WaiterConfig={
        "Delay": 30,
        "MaxAttempts": 10
    }
)
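With the values above the waiter gives up after roughly Delay × MaxAttempts = 30 × 10 = 300 seconds, which is short for many EMR steps. A small sketch for deriving the config from a time budget instead; `waiter_config_for_timeout` is a hypothetical helper of mine, not something boto3 provides.

```python
import math


def waiter_config_for_timeout(timeout_seconds, delay_seconds=30):
    """Build a WaiterConfig dict that polls for about timeout_seconds.

    Hypothetical helper: rounds the number of attempts up so the total
    wait is not shorter than the requested budget.
    """
    if timeout_seconds <= 0 or delay_seconds <= 0:
        raise ValueError("timeout_seconds and delay_seconds must be positive")
    attempts = math.ceil(timeout_seconds / delay_seconds)
    return {"Delay": delay_seconds, "MaxAttempts": attempts}
```

You would then pass `WaiterConfig=waiter_config_for_timeout(3600)` to `waiter.wait(...)`. If the step has not completed by the last attempt, botocore raises `botocore.exceptions.WaiterError`, so it is worth wrapping the call in a try/except.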
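Answer 3 (score: 0)
I wrote a generic status_poller function on GitHub as part of an interactive demo for EMR.
The status_poller function loops, calling a function and printing '.' or the new status, until the specified status is returned: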
import sys
import time

def status_poller(intro, done_status, func):
    """
    Polls a function for status, sleeping for 10 seconds between each query,
    until the specified status is returned.
    :param intro: An introductory sentence that informs the reader what we're
                  waiting for.
    :param done_status: The status we're waiting for. This function polls the status
                        function until it returns the specified status.
    :param func: The function to poll for status. This function must eventually
                 return the expected done_status or polling will continue indefinitely.
    """
    status = None
    print(intro)
    print("Current status: ", end='')
    while status != done_status:
        prev_status = status
        status = func()
        if prev_status == status:
            print('.', end='')
        else:
            print(status, end='')
        sys.stdout.flush()
        time.sleep(10)
    print()
To check whether a step has completed, you can call it like this:
status_poller(
    "Waiting for step to complete...",
    'COMPLETED',
    lambda: emr_basics.describe_step(cluster_id, step_id, emr_client)['Status']['State'])
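One caveat, as the docstring warns: if the step ends up FAILED or CANCELLED, func never returns 'COMPLETED' and status_poller loops forever. A possible variant that also stops on failure states is sketched below. The name `poll_until_terminal` and its `failure_statuses`, `max_polls`, and `sleep` parameters are my own additions for illustration, not part of the demo.

```python
import time


def poll_until_terminal(func, done_status, failure_statuses,
                        delay_seconds=10, max_polls=None, sleep=time.sleep):
    """Poll func() until it returns done_status.

    Raises RuntimeError if a status in failure_statuses is seen, and
    TimeoutError if max_polls is set and reached first.
    (Hypothetical variant of status_poller.)
    """
    polls = 0
    while True:
        status = func()
        polls += 1
        if status == done_status:
            return status
        if status in failure_statuses:
            raise RuntimeError(f"Ended in status {status!r} instead of {done_status!r}")
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"Still {status!r} after {polls} polls")
        sleep(delay_seconds)
```

For an EMR step you would pass something like `failure_statuses={'FAILED', 'CANCELLED', 'INTERRUPTED'}` together with the same lambda as above.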