Invalid SSH key when running an mrjob script on EMR

Date: 2014-06-03 18:24:28

Tags: amazon-web-services ssh emr mrjob

I am following the guide on getting mrjob to work on EMR. I followed all the steps, but when I run the example script I get this error:

matthew@WinterMute:~/work/projects/mrjob_examples$ python word_count.py -r emr moby.txt
using configs in /etc/mrjob.conf
using existing scratch bucket mrjob-4db6342a70e021ad
using s3://mrjob-4db6342a70e021ad/tmp/ as our scratch dir on S3
creating tmp directory /tmp/word_count.matthew.20140603.181541.006786
writing master bootstrap script to /tmp/word_count.matthew.20140603.181541.006786/b.py
Copying non-input files into s3://mrjob-4db6342a70e021ad/tmp/word_count.matthew.20140603.181541.006786/files/
Waiting 5.0s for S3 eventual consistency
Creating Elastic MapReduce job flow
Job flow created with ID: j-3DCN7LULSRILW
Created new job flow j-3DCN7LULSRILW
Job on job flow j-3DCN7LULSRILW failed with status FAILED: The given SSH key name was invalid
Logs are in s3://mrjob-4db6342a70e021ad/tmp/logs/j-3DCN7LULSRILW/
Scanning S3 logs for probable cause of failure
Waiting 5.0s for S3 eventual consistency
Terminating job flow: j-3DCN7LULSRILW
Traceback (most recent call last):
  File "word_count.py", line 16, in <module>
    MRWordFrequencyCount.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 809, in _run
    self._wait_for_job_to_complete()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 1599, in _wait_for_job_to_complete
    raise Exception(msg)
Exception: Job on job flow j-3DCN7LULSRILW failed with status FAILED: The given SSH key name was invalid

2 answers:

Answer 0 (score: 0)

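Your job seems to start fine, but mrjob fails to ssh to the master node to monitor its status. It is hard to pin down the exact misconfiguration without seeing your config file (mainly the ec2_key_pair_file and ec2_key_pair options). Be sure to follow the Configuring AWS credentials guide. You have to specify a valid key pair name (check the "Key Pairs" section of the EC2 management dashboard) and the path to the corresponding .pem file.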
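As a rough local sanity check of those two options, something like the following could be run before launching a job. This is only a sketch: check_emr_key_options is a hypothetical helper, not part of mrjob, and it assumes the emr runner options have already been loaded into a plain dict. It cannot tell whether the key pair name actually exists in EC2; it only catches a missing option or a .pem path that does not point at a file.

```python
import os


def check_emr_key_options(emr_opts):
    """Return a list of problems found in the emr runner options.

    emr_opts is assumed to be a dict holding the contents of the
    runners -> emr section of mrjob.conf (hypothetical input shape).
    """
    problems = []

    # The key pair name as shown in the EC2 dashboard under "Key Pairs".
    if not emr_opts.get("ec2_key_pair"):
        problems.append("ec2_key_pair is not set")

    # The local path to the matching .pem file.
    key_file = emr_opts.get("ec2_key_pair_file")
    if not key_file:
        problems.append("ec2_key_pair_file is not set")
    elif not os.path.isfile(os.path.expanduser(key_file)):
        problems.append("ec2_key_pair_file does not exist: %s" % key_file)

    return problems
```

An empty list means both options are present and the .pem file was found. The name itself still has to be checked against the EC2 dashboard.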

Answer 1 (score: 0)

I found this question while searching for the same error myself.

I managed to solve it: SSH keys are region-specific, so you need to set the region in your mrjob.conf file to the region the SSH key belongs to:

runners:
    emr:
        aws_access_key_id: HADOOPHADOOPBOBADOOP
        aws_region: us-west-1
        aws_secret_access_key: MEMIMOMADOOPBANANAFANAFOFADOOPHADOOP

See here: https://pythonhosted.org/mrjob/guides/configs-basics.html
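To confirm that a key pair name is registered in the region you set, one option is to list the key pairs EC2 reports for that region. The sketch below assumes the boto3 library is installed and AWS credentials are configured; it is not something mrjob does for you. The helper names and the key name "EMR" are made up for illustration.

```python
def make_ec2_client(region):
    """Create an EC2 client for one region (assumes boto3 is installed)."""
    import boto3  # imported here so the helper below works without boto3
    return boto3.client("ec2", region_name=region)


def key_pair_exists(ec2_client, key_name):
    """Return True if key_name is among the key pairs visible to this client.

    Key pairs are per-region, so the answer depends on the region the
    client was created for.
    """
    response = ec2_client.describe_key_pairs()
    names = [pair["KeyName"] for pair in response.get("KeyPairs", [])]
    return key_name in names
```

For example, key_pair_exists(make_ec2_client("us-west-1"), "EMR") returning False would suggest that the key was created in a different region than the one in mrjob.conf.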