Bootstrap error when installing Hue on AWS EMR with Boto

Time: 2014-11-11 21:30:53

Tags: amazon-web-services boto emr hue

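With the release of AMI 3.3.0, AWS supports Hue as an installable "application" in EMR, like Hive or Pig. Creating a cluster with Hue through the EMR web UI works fine for me, but when I add the Hue install bootstrap action through Boto, I get a non-deterministic error (it crashes intermittently). I have tested the same configuration 4 times, with a 50% crash rate.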

In Boto, I add one extra bootstrap action, the same one that is added automatically when a cluster is created from the web UI with Hue enabled:

BootstrapAction('Install Hue', 's3://elasticmapreduce/libs/hue/install-hue', [])

The cluster then terminates with:

Terminated with errors: On the master instance (i-c6b7582a), 
bootstrap action 2 returned a non-zero return code

The bootstrap action log contains:

Existing lock /var/run/yum.pid: another copy is running as pid 2007.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory : 22 M RSS (305 MB VSZ)
    Started: Tue Nov 11 21:00:12 2014 - 00:19 ago
    State : Sleeping, pid: 2007
Another app is currently holding the yum lock; waiting for it to exit...

This is repeated many times, followed at the end by a large stack trace:

Trying other mirror.
http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: [Errno 12] Timeout on http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: (28, 'Connection timed out after 10000 milliseconds')
Trying other mirror.
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 355, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 174, in main
    result, resultmsgs = base.doCommands()
  File "/usr/share/yum-cli/cli.py", line 572, in doCommands
    return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
  File "/usr/share/yum-cli/yumcommands.py", line 432, in doCommand
    return base.installPkgs(extcmds, basecmd=basecmd)
  File "/usr/share/yum-cli/cli.py", line 968, in installPkgs
    txmbrs = self.install(pattern=arg)
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 4721, in install
    mypkgs = self.pkgSack.returnPackages(patterns=pats,
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 1069, in <lambda>
    pkgSack = property(fget=lambda self: self._getSacks(),
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 774, in _getSacks
    self.repos.populateSack(which=repos)
  File "/usr/lib/python2.6/site-packages/yum/repos.py", line 383, in populateSack
    sack.populate(repo, mdtype, callback, cacheonly)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 250, in populate
    if self._check_db_version(repo, mydbtype):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 342, in _check_db_version
    return repo._check_db_version(mdtype)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1520, in _check_db_version
    repoXML = self.repoXML
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1706, in <lambda>
    repoXML = property(fget=lambda self: self._getRepoXML(),
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1702, in _getRepoXML
    self._loadRepoXML(text=self.ui_id)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1693, in _loadRepoXML
    return self._groupLoadRepoXML(text, self._mdpolicy2mdtypes())
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1667, in _groupLoadRepoXML
    if self._commonLoadRepoXML(text):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1495, in _commonLoadRepoXML
    self._revertOldRepoXML()
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1345, in _revertOldRepoXML
    os.rename(old_data['old_local'], old_data['local'])
OSError: [Errno 2] No such file or directory

By contrast, when the bootstrap succeeds, the log shows a single line:

Warning: RPMDB altered outside of yum.
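
When comparing several failed and successful runs, it may help to scan each bootstrap log for the three symptoms quoted above. The following is only a minimal sketch: the patterns are copied from the log excerpts in this question, while the category names and the function `classify_bootstrap_log` are hypothetical and not part of EMR or Boto.

```python
import re

# Substrings taken from the log excerpts quoted above; the category
# names on the left are made up for this sketch.
PATTERNS = {
    "yum_lock": re.compile(r"Another app is currently holding the yum lock"),
    "mirror_timeout": re.compile(r"\[Errno 12\] Timeout on "),
    "repo_metadata_missing": re.compile(
        r"OSError: \[Errno 2\] No such file or directory"
    ),
}


def classify_bootstrap_log(text):
    """Count how often each known failure signature occurs in a log."""
    return {name: len(rx.findall(text)) for name, rx in PATTERNS.items()}


# A shortened stand-in for a failed bootstrap log.
sample = (
    "Another app is currently holding the yum lock; waiting for it to exit...\n"
    "Another app is currently holding the yum lock; waiting for it to exit...\n"
    "OSError: [Errno 2] No such file or directory\n"
)
print(classify_bootstrap_log(sample))
```

A log from a successful run, which contains only the RPMDB warning, would yield zero for every category.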

1 answer:

Answer 0: (score: 0)

An example of installing and running Hue on EMR AMI 3.3:

import boto.emr
from boto.emr.emrobject import InstanceGroup
from boto.emr.bootstrap_action import BootstrapAction
from boto.emr.step import ScriptRunnerStep


def get_bootstrap_actions():
    # Bootstrap action that installs Hue on the cluster.
    install_hue_action = BootstrapAction("Install Hue",
                                "s3n://us-east-1.elasticmapreduce/libs/hue/install-hue",
                                bootstrap_action_args=None)
    return [install_hue_action]


def get_startup_steps():
    # Step that starts Hue after the bootstrap actions have run.
    run_hue_step = ScriptRunnerStep(name="Run Hue",
                                    step_args=["s3n://us-east-1.elasticmapreduce/libs/hue/run-hue"])
    return [run_hue_step]


def get_instance_groups():
    # This is just an example. An actual implementation will have core and
    # task instance groups as well. Choose your instance type, count, and
    # bid price wisely, as it might get too expensive too quickly.
    spot_instance_group = InstanceGroup()
    spot_instance_group.name = "Spot Instance Group Master"
    spot_instance_group.bidprice = "0.20"
    spot_instance_group.num_instances = 1
    spot_instance_group.market = "SPOT"
    spot_instance_group.type = "c3.2xlarge"
    spot_instance_group.role = "MASTER"
    # The original snippet built the group but never returned it.
    return [spot_instance_group]


# The helper functions above must be defined before this call runs.
conn = boto.emr.EmrConnection()

jobid = conn.run_jobflow(name="Hue Example", ami_version="3.3.0",
                         log_uri="s3n://your-log-path-here",
                         instance_groups=get_instance_groups(),
                         bootstrap_actions=get_bootstrap_actions(),
                         ec2_keyname="your-ec2-key-name",
                         steps=get_startup_steps())
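
Because the failure in the question only shows up once the cluster has terminated, it can be useful to poll the job flow state after launching it. The following is a rough sketch, not part of the answer above: `wait_for_jobflow` and its `get_state` argument are hypothetical names, and `get_state` is assumed to be any zero-argument callable that returns the current state string. With Boto one would presumably pass something like `lambda: conn.describe_jobflow(jobid).state`, but that wiring is an assumption and is not exercised here.

```python
import time

# Job flow states after which polling can stop. WAITING means the cluster
# is up and idle; the others mean it has ended.
DONE_STATES = {"WAITING", "COMPLETED", "TERMINATED", "FAILED"}


def wait_for_jobflow(get_state, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_state() until a state in DONE_STATES is seen.

    get_state is a hypothetical zero-argument callable returning the
    current state string. Returns the last state seen, which may not be
    a done state if max_polls is exhausted.
    """
    state = None
    for _ in range(max_polls):
        state = get_state()
        if state in DONE_STATES:
            break
        sleep(poll_seconds)
    return state


# Demonstration with a fake state sequence and no real sleeping.
fake_states = iter(["STARTING", "BOOTSTRAPPING", "TERMINATED"])
print(wait_for_jobflow(lambda: next(fake_states), sleep=lambda seconds: None))
```

Passing the sleep function in as a parameter keeps the loop easy to try out without waiting; in real use the default `time.sleep` applies.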