Environment installation on AWS Deep Learning AMI stalls when run via userdata.txt

Time: 2019-03-18 02:53:33

Tags: amazon-web-services tensorflow boto3

I am trying to install/activate a virtual environment on the AWS Deep Learning AMI through userdata.txt, but the process seems to get stuck. Can anyone help?

import boto3

user_data = u"""#!/bin/bash

cd /home/ec2-user

echo cd /home/ec2-user > test.sh
echo source activate tensorflow_p36 >> test.sh
echo "Made it 1" >> test.sh

su ec2-user -c "bash test.sh" -l

echo "Made it 2"

"""

# Spec throwaway instance
instance_spec = {
    'ImageId' : 'ami-055ab192b68ca4d2f',      # DLAMI Conda
    'InstanceType' : 't2.small',
    'KeyName': 'XXX',
    'SecurityGroupIds': ['XXX'],
    'Placement': {'AvailabilityZone': 'eu-central-1b'},
    'UserData' : user_data
    }

resource = boto3.resource('ec2')
instance = resource.create_instances(**instance_spec, MinCount=1, MaxCount=1)[0]

When I run this code (and similarly for a p3.2xlarge instance) and check the system log, the process appears to get stuck at "Uninstalling tensorflow"...

...  
Starting crond: [  OK  ]  
Starting atd: [  OK  ]  
Starting cgconfig service: [  OK  ]  
Starting docker:        .[  OK  ]  
Starting cloud-init: [  123.181381] bridge: filtering via arp/ip/ip6tables is no  longer available by default. Update your scripts to load br_netfilter if you need this.  
[  123.209201] Bridge firewalling registered
Cloud-init v. 0.7.6 running 'modules:final' at Mon, 18 Mar 2019 02:13:37 +0000. Up 123.24 seconds.  
[  123.499776] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)  
[  124.310613] ip_tables: (C) 2000-2006 Netfilter Core Team  
[  125.029313] Initializing XFRM netlink socket  
[  125.038833] Netfilter messages via NETLINK v0.30.  
[  125.045449] ctnetlink v0.93: registering with nfnetlink.  
[  125.102911] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready  
WARNING: First activation might take some time (1+ min).  
Installing TensorFlow optimized for your Amazon EC2 instance......  
Env where framework will be re-installed: tensorflow_p36  
Uninstalling tensorflow-gpu-1.12.0:  
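
For reference, the system log shown above can also be fetched from Python instead of the EC2 console, which makes it easier to see whether the last line changes over time. This is only a minimal sketch and not part of the original script: `fetch_console_output` and `last_log_line` are hypothetical helper names, and the region `eu-central-1` is an assumption taken from the availability zone used in the instance spec.

```python
def fetch_console_output(instance_id, region_name="eu-central-1"):
    """Return the instance's system log as text (may be empty shortly after launch)."""
    # Imported here so that last_log_line() can be used without boto3 installed.
    import boto3

    client = boto3.client("ec2", region_name=region_name)
    response = client.get_console_output(InstanceId=instance_id)
    # "Output" can be missing while the instance is still booting.
    return response.get("Output", "")


def last_log_line(console_text):
    """Return the last non-blank line of the log, or None if the log is empty."""
    lines = [line.strip() for line in console_text.splitlines() if line.strip()]
    return lines[-1] if lines else None
```

With the instance created above, `last_log_line(fetch_console_output(instance.id))` could be called a few minutes apart. Keep in mind that EC2 refreshes the console output only periodically, so an unchanged last line does not by itself prove the script has hung.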

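I confirmed in a PuTTY session that the environment activates without problems when the same command is run from the console.

Thanks for your help!

B.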

0 answers:

No answers yet