I am trying to install/activate a virtual environment in the AWS Deep Learning AMI via userdata.txt, but the process seems to get stuck. Can anyone help?
import boto3
user_data = u"""#!/bin/bash
cd /home/ec2-user
echo cd /home/ec2-user > test.sh
echo source activate tensorflow_p36 >> test.sh
echo 'echo "Made it 1"' >> test.sh
su ec2-user -c "bash test.sh" -l
echo "Made it 2"
"""
# Spec throwaway instance
instance_spec = {
    'ImageId': 'ami-055ab192b68ca4d2f',  # DLAMI Conda
    'InstanceType': 't2.small',
    'KeyName': 'XXX',
    'SecurityGroupIds': ['XXX'],
    'Placement': {'AvailabilityZone': 'eu-central-1b'},
    'UserData': user_data,
}
resource = boto3.resource('ec2')
instance = resource.create_instances(**instance_spec, MinCount=1, MaxCount=1)[0]
When I run this code (and similarly for a p3.2xlarge instance) and check the system log, the process seems to hang at "Uninstalling tensorflow"...
...
Starting crond: [ OK ]
Starting atd: [ OK ]
Starting cgconfig service: [ OK ]
Starting docker: .[ OK ]
Starting cloud-init: [ 123.181381] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 123.209201] Bridge firewalling registered
Cloud-init v. 0.7.6 running 'modules:final' at Mon, 18 Mar 2019 02:13:37 +0000. Up 123.24 seconds.
[ 123.499776] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 124.310613] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 125.029313] Initializing XFRM netlink socket
[ 125.038833] Netfilter messages via NETLINK v0.30.
[ 125.045449] ctnetlink v0.93: registering with nfnetlink.
[ 125.102911] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
WARNING: First activation might take some time (1+ min).
Installing TensorFlow optimized for your Amazon EC2 instance......
Env where framework will be re-installed: tensorflow_p36
Uninstalling tensorflow-gpu-1.12.0:
I confirmed in a PuTTY session that the environment installs and activates fine when I run the same commands from the console.
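To narrow down where the user-data run stalls, one option might be to capture the output of the activation step in a file that can be read over SSH afterwards. Below is a minimal sketch of that idea, not a fix. The helper name `build_debug_user_data` and the log path `/home/ec2-user/activate.log` are assumptions made up for illustration; the shell commands themselves mirror the script above.

```python
def build_debug_user_data(env_name="tensorflow_p36",
                          log_path="/home/ec2-user/activate.log"):
    """Build a user-data script that logs the activation output to a file.

    Hypothetical helper: the function name and the default log path are
    illustrative assumptions, not part of the original script.
    """
    return "\n".join([
        "#!/bin/bash",
        "cd /home/ec2-user",
        # Write the small helper script that activates the Conda environment.
        "echo cd /home/ec2-user > test.sh",
        f"echo source activate {env_name} >> test.sh",
        "echo 'echo \"Made it 1\"' >> test.sh",
        # Run it as ec2-user in a login shell and capture stdout and stderr,
        # so the last lines written before the stall can be inspected later.
        f'su ec2-user -c "bash test.sh" -l > {log_path} 2>&1',
        'echo "Made it 2"',
        "",
    ])


user_data = build_debug_user_data()
```

The resulting string can be passed as `'UserData'` in `instance_spec` exactly as before. Once the instance is reachable, the log file should show how far the activation got before it stopped.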
Thanks for your help!
B.