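I am trying to run an mrjob script on Amazon EMR. It runs fine when I use a c1.medium instance, but it fails when I change the instance type to t2.micro. The full error message is shown below.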

    C:\Users\Administrator\MyIpython>python word_count.py -r emr 111.txt
    using configs in C:\Users\Administrator.mrjob.conf
    creating new scratch bucket mrjob-875a948553aab9e8
    using s3://mrjob-875a948553aab9e8/tmp/ as our scratch dir on S3
    creating tmp directory c:\users\admini~1\appdata\local\temp\word_count.Administrator.20150731.013007.592000
    writing master bootstrap script to c:\users\admini~1\appdata\local\temp\word_count.Administrator.20150731.013007.592000\b.py

    PLEASE NOTE: Starting in mrjob v0.5.0, protocols will be strict by default. It's recommended you run your job with --strict-protocols or set up mrjob.conf as described at https://pythonhosted.org/mrjob/whats-new.html#ready-for-strict-protocols

    creating S3 bucket 'mrjob-875a948553aab9e8' to use as scratch space
    Copying non-input files into s3://mrjob-875a948553aab9e8/tmp/word_count.Administrator.20150731.013007.592000/files/
    Waiting 5.0s for S3 eventual consistency
    Creating Elastic MapReduce job flow
    Traceback (most recent call last):
      File "word_count.py", line 16, in <module>
        MRWordFrequencyCount.run()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\job.py", line 461, in run
        mr_job.execute()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\job.py", line 479, in execute
        super(MRJob, self).execute()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\launch.py", line 153, in execute
        self.run_job()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\launch.py", line 216, in run_job
        runner.run()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\runner.py", line 470, in run
        self._run()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 881, in _run
        self._launch()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 886, in _launch
        self._launch_emr_job()
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 1593, in _launch_emr_job
        persistent=False)
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 1327, in _create_job_flow
        self._job_name, self._opts['s3_log_uri'], **args)
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\retry.py", line 149, in call_and_maybe_retry
        return f(*args, **kwargs)
      File "F:\Program Files\Anaconda\lib\site-packages\mrjob\retry.py", line 71, in call_and_maybe_retry
        result = getattr(alternative, name)(*args, **kwargs)
      File "F:\Program Files\Anaconda\lib\site-packages\boto\emr\connection.py", line 581, in run_jobflow
        'RunJobFlow', params, RunJobFlowResponse, verb='POST')
      File "F:\Program Files\Anaconda\lib\site-packages\boto\connection.py", line 1208, in get_object
        raise self.ResponseError(response.status, response.reason, body)
    boto.exception.EmrResponseError: EmrResponseError: 400 Bad Request
    Sender
    ValidationError
    Instance type 't2.micro' is not supported
    c3ee1107-3723-11e5-8d8e-f1011298229d

Here are the details of my config file:

    runners:
      emr:
        aws_access_key_id: xxxxxxxxxxx
        aws_secret_access_key: xxxxxxxxxxxxx
        aws_region: us-east-1
        ec2_key_pair: EMR
        ec2_key_pair_file: C:\Users\Administrator\EMR.pem
        ssh_tunnel_to_job_tracker: false
        ec2_instance_type: t2.micro
        num_ec2_instances: 2
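
For comparison, the only setting that differs from the run that worked is the instance type. A minimal sketch of that fragment, assuming every other key in the file stays exactly as shown above:

```yaml
runners:
  emr:
    # Assumption: all other keys (credentials, region, key pair, ...) are
    # unchanged from the config above and are omitted here for brevity.
    ec2_instance_type: c1.medium   # the value the job ran with successfully
    num_ec2_instances: 2
```

With `t2.micro` in that position, the `RunJobFlow` request is rejected with the `ValidationError` shown in the traceback.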