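With the release of AMI 3.3.0, AWS supports Hue as an installable application in EMR, like Hive and Pig. Creating a cluster with Hue through the EMR web UI works fine for me, but when I add the Hue installation bootstrap action through Boto, I get a non-deterministic error (the cluster crashes intermittently). I have tested the same configuration 4 times, with a crash rate of 50%.
In Boto, I add one extra bootstrap action, the same one that is added automatically when a cluster is created from the web UI with Hue enabled: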
BootstrapAction('Install Hue', 's3://elasticmapreduce/libs/hue/install-hue', [])
The cluster then terminates with:
Terminated with errors: On the master instance (i-c6b7582a),
bootstrap action 2 returned a non-zero return code
In the bootstrap action log:
Existing lock /var/run/yum.pid: another copy is running as pid 2007.
Another app is currently holding the yum lock; waiting for it to exit...
The other application is: yum
Memory : 22 M RSS (305 MB VSZ)
Started: Tue Nov 11 21:00:12 2014 - 00:19 ago
State : Sleeping, pid: 2007
Another app is currently holding the yum lock; waiting for it to exit...
These messages repeat many times, followed at the end by a large stack trace:
Trying other mirror.
http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: [Errno 12] Timeout on http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: (28, 'Connection timed out after 10000 milliseconds')
Trying other mirror.
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 355, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 174, in main
    result, resultmsgs = base.doCommands()
  File "/usr/share/yum-cli/cli.py", line 572, in doCommands
    return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
  File "/usr/share/yum-cli/yumcommands.py", line 432, in doCommand
    return base.installPkgs(extcmds, basecmd=basecmd)
  File "/usr/share/yum-cli/cli.py", line 968, in installPkgs
    txmbrs = self.install(pattern=arg)
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 4721, in install
    mypkgs = self.pkgSack.returnPackages(patterns=pats,
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 1069, in <lambda>
    pkgSack = property(fget=lambda self: self._getSacks(),
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 774, in _getSacks
    self.repos.populateSack(which=repos)
  File "/usr/lib/python2.6/site-packages/yum/repos.py", line 383, in populateSack
    sack.populate(repo, mdtype, callback, cacheonly)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 250, in populate
    if self._check_db_version(repo, mydbtype):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 342, in _check_db_version
    return repo._check_db_version(mdtype)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1520, in _check_db_version
    repoXML = self.repoXML
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1706, in <lambda>
    repoXML = property(fget=lambda self: self._getRepoXML(),
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1702, in _getRepoXML
    self._loadRepoXML(text=self.ui_id)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1693, in _loadRepoXML
    return self._groupLoadRepoXML(text, self._mdpolicy2mdtypes())
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1667, in _groupLoadRepoXML
    if self._commonLoadRepoXML(text):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1495, in _commonLoadRepoXML
    self._revertOldRepoXML()
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1345, in _revertOldRepoXML
    os.rename(old_data['old_local'], old_data['local'])
OSError: [Errno 2] No such file or directory
By contrast, when the bootstrap action succeeds, the log contains a single line:
Warning: RPMDB altered outside of yum.
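To compare failed and successful runs more systematically, the symptoms in the log above can be tallied with a small script. This is only a sketch: the function name `summarize_bootstrap_log` and the dictionary keys are made up for illustration, and the patterns are copied from the log excerpts in this question, so they may not match the output of other yum versions.

```python
import re

# Patterns taken from the log excerpts quoted in this question.
LOCK_RE = re.compile(
    r"Existing lock /var/run/yum\.pid: another copy is running as pid (\d+)")
WAIT_MSG = "Another app is currently holding the yum lock"
TIMEOUT_MSG = "[Errno 12] Timeout on"
ERROR_RE = re.compile(r"^(\w+Error): (.*)$")


def summarize_bootstrap_log(text):
    """Count yum lock waits and mirror timeouts, and keep the last Python error line."""
    summary = {
        "lock_pids": set(),      # pids reported as holding /var/run/yum.pid
        "lock_waits": 0,         # number of "waiting for it to exit" messages
        "mirror_timeouts": 0,    # number of mirror timeout messages
        "final_error": None,     # last line that looks like "SomeError: ..."
    }
    for raw_line in text.splitlines():
        line = raw_line.strip()
        lock = LOCK_RE.search(line)
        if lock:
            summary["lock_pids"].add(int(lock.group(1)))
        if WAIT_MSG in line:
            summary["lock_waits"] += 1
        if TIMEOUT_MSG in line:
            summary["mirror_timeouts"] += 1
        if ERROR_RE.match(line):
            summary["final_error"] = line
    return summary
```

Running it over the logs of a failed and a successful cluster should show whether the yum lock contention appears only in the failing runs.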
Answer 0 (score: 0)
An example of installing and running Hue on EMR AMI 3.3:
import boto.emr
from boto.emr.emrobject import InstanceGroup
from boto.emr.bootstrap_action import BootstrapAction
from boto.emr.step import ScriptRunnerStep


def get_bootstrap_actions():
    # Bootstrap action that installs Hue on the cluster.
    install_hue_action = BootstrapAction("Install Hue",
        "s3n://us-east-1.elasticmapreduce/libs/hue/install-hue",
        bootstrap_action_args=None)
    return [install_hue_action]


def get_startup_steps():
    # Step that starts Hue once the cluster is up.
    run_hue_step = ScriptRunnerStep(name="Run Hue",
        step_args=["s3n://us-east-1.elasticmapreduce/libs/hue/run-hue"])
    return [run_hue_step]


def get_instance_groups():
    # This is just an example. An actual implementation will have core and task
    # instance groups as well. Choose the instance type, count, and bid price
    # carefully, as the cluster can get expensive quickly.
    spot_instance_group = InstanceGroup()
    spot_instance_group.name = "Spot Instance Group Master"
    spot_instance_group.bidprice = "0.20"
    spot_instance_group.num_instances = 1
    spot_instance_group.market = "SPOT"
    spot_instance_group.type = "c3.2xlarge"
    spot_instance_group.role = "MASTER"
    return [spot_instance_group]


# The helper functions are defined above so that they exist when run_jobflow is called.
conn = boto.emr.EmrConnection()
jobid = conn.run_jobflow(name="Hue Example", ami_version="3.3.0",
    log_uri="s3n://your-log-path-here",
    instance_groups=get_instance_groups(),
    bootstrap_actions=get_bootstrap_actions(),
    ec2_keyname="your-ec2-key-name",
    steps=get_startup_steps()
)
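
Since the failure in the question is intermittent, it can help to check the cluster state after launching it. The following is a minimal sketch, not part of boto: `wait_for_jobflow` and its parameters are hypothetical names, and the state lookup is passed in as a callable so the helper itself does not depend on boto. It assumes the state strings follow the usual EMR job flow states (`STARTING`, `BOOTSTRAPPING`, `RUNNING`, `WAITING`, `TERMINATED`, and so on).

```python
import time

# Assumed EMR job flow state names; adjust if your API version reports others.
READY_STATES = {"RUNNING", "WAITING"}
TERMINAL_STATES = {"COMPLETED", "FAILED", "TERMINATED"}


def wait_for_jobflow(get_state, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_state() until the job flow is ready or has ended, then return the state."""
    for _ in range(max_polls):
        state = get_state()
        if state in READY_STATES or state in TERMINAL_STATES:
            return state
        # Still starting or bootstrapping: wait before asking again.
        sleep(poll_seconds)
    raise RuntimeError("job flow did not settle after %d polls" % max_polls)
```

With the connection from the example above, it could be called as `wait_for_jobflow(lambda: conn.describe_jobflow(jobid).state)`, assuming `describe_jobflow` exposes the state that way in your boto version. A result of `TERMINATED` or `FAILED` shortly after launch suggests the bootstrap action did not succeed.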