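Every time I start a SageMaker notebook, I want to launch an EMR cluster. However, I found that a lifecycle configuration script cannot run for more than 5 minutes. Unfortunately, my EMR cluster takes more than 5 minutes to start. This is a problem because I need to wait for the cluster to start before I can retrieve the master node's IP address (and then use that IP address to configure the connection between the SageMaker notebook and the cluster).
Below is an excerpt of the code that runs in the lifecycle configuration script.
Has anyone run into a similar problem and found a solution?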
```
import json
import boto3

# Assumed: `client` is an EMR client; it and CLUSTER_CONFIG are defined earlier in the full script
client = boto3.client('emr')

job_flow_id = client.run_job_flow(**CLUSTER_CONFIG)['JobFlowId']
...
...
# Retrieve the private IP of the master node for later use
master_instance = client.list_instances(ClusterId=job_flow_id, InstanceGroupTypes=['MASTER'])['Instances'][0]
master_private_ip = master_instance['PrivateIpAddress']
# Write the sparkmagic config file that tells SageMaker how to communicate with Spark
s3 = boto3.client('s3')
file_object = s3.get_object(Bucket='dataengine', Key='emr/example_config.json')
data = json.loads(file_object['Body'].read().decode('utf-8'))
data['kernel_python_credentials']['url'] = 'http://{}:8998'.format(master_private_ip)
data['kernel_scala_credentials']['url'] = 'http://{}:8998'.format(master_private_ip)
data['kernel_r_credentials']['url'] = 'http://{}:8998'.format(master_private_ip)
with open('./sparkmagic/config.json', 'w') as outfile:
    json.dump(data, outfile)
```
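For reference, the wait that the elided part of the script has to perform before the master IP can be read might look roughly like the sketch below. It is only an illustration: `wait_for_master_ip`, its `timeout` and `poll_interval` defaults, and the injectable `sleep` argument are hypothetical names, and `client` is assumed to behave like `boto3.client('emr')`. Run inline in a lifecycle script, this loop is exactly the part that would exceed the 5-minute limit.

```python
import time

# Cluster states in which provisioning has finished.
READY_STATES = {"RUNNING", "WAITING"}
# Cluster states from which the cluster will never become ready.
FAILED_STATES = {"TERMINATING", "TERMINATED", "TERMINATED_WITH_ERRORS"}


def wait_for_master_ip(client, cluster_id, timeout=1200, poll_interval=30,
                       sleep=time.sleep):
    """Poll the cluster state until it is ready, then return the master's private IP.

    `client` is assumed to behave like boto3.client('emr'); the helper name and
    its defaults are hypothetical, not part of the original script.
    """
    waited = 0
    while True:
        response = client.describe_cluster(ClusterId=cluster_id)
        state = response["Cluster"]["Status"]["State"]
        if state in READY_STATES:
            break
        if state in FAILED_STATES:
            raise RuntimeError(
                "cluster {} ended in state {}".format(cluster_id, state))
        if waited >= timeout:
            raise TimeoutError(
                "cluster {} not ready after {} seconds".format(cluster_id, waited))
        sleep(poll_interval)
        waited += poll_interval

    # Same lookup as in the excerpt above: the single MASTER instance.
    instances = client.list_instances(
        ClusterId=cluster_id, InstanceGroupTypes=["MASTER"])["Instances"]
    return instances[0]["PrivateIpAddress"]
```

With a real client this would be called as `wait_for_master_ip(client, job_flow_id)` right after `run_job_flow`.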
Answer 0 (score: 1)
Answer 1 (score: 0)
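AWS CloudFormation will automate all of this for you and also let you pass the IP address along.
You seem to like Python, so I suggest using Troposphere: https://github.com/cloudtools/troposphere. Write Python code, generate a CloudFormation template, then run it :)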
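As a rough idea of what such a generated template contains, here is a minimal sketch that builds one with plain dictionaries and the standard `json` module instead of Troposphere, so it has no dependencies. Everything specific in it is an assumption for illustration: the function name `build_emr_template`, the cluster name, release label, instance type and counts, and the default EMR role names. It exposes the cluster's `MasterPublicDNS` attribute as a stack output; whether that address is the one your notebook should use depends on your network setup.

```python
import json


def build_emr_template(cluster_name="notebook-emr",
                       release_label="emr-5.30.0",
                       instance_type="m5.xlarge"):
    """Return a CloudFormation template (as a dict) with one EMR cluster.

    All parameter defaults are illustrative assumptions, not values from
    the original question.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "EmrCluster": {
                "Type": "AWS::EMR::Cluster",
                "Properties": {
                    "Name": cluster_name,
                    "ReleaseLabel": release_label,
                    # Livy serves the endpoint on port 8998 that sparkmagic talks to.
                    "Applications": [{"Name": "Spark"}, {"Name": "Livy"}],
                    # Assumed: the default EMR roles already exist in the account.
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "ServiceRole": "EMR_DefaultRole",
                    "Instances": {
                        "MasterInstanceGroup": {
                            "InstanceCount": 1,
                            "InstanceType": instance_type,
                        },
                        "CoreInstanceGroup": {
                            "InstanceCount": 2,
                            "InstanceType": instance_type,
                        },
                    },
                },
            }
        },
        "Outputs": {
            # Stack output that other resources or scripts can read.
            "MasterPublicDNS": {
                "Value": {"Fn::GetAtt": ["EmrCluster", "MasterPublicDNS"]}
            }
        },
    }


if __name__ == "__main__":
    # Print the template so it can be saved and passed to CloudFormation.
    print(json.dumps(build_emr_template(), indent=2))
```

The printed JSON can be saved to a file and used to create a stack; the output value is then available once CloudFormation reports the cluster as created.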
Julien (AWS)