The following code produces this error message:
EmrResponseError: EmrResponseError: 400 Bad Request <ErrorResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31"> <Error>
<Type>Sender</Type>
<Code>ValidationError</Code>
<Message>Log Uri is not in the required format</Message> </Error> <RequestId>1c3d0221-4420-11e4-a09e-5113f30a0036</RequestId> </ErrorResponse>
This happens no matter which S3 URI I use. I have tried a trailing slash, s3n://, s3://, and every other combination, with no luck.
Here is the code:
import boto.emr
conn = boto.emr.connect_to_region('us-east-1')
job_parameters = {"log_uri":"s3n://shadoop/logs/new-log",
"ec2_keyname":"XXXXXX",
"availability_zone":"us-east-1e",
"master_instance_type":"m1-medium",
"slave_instance_type":"m1-medium",
"num_instances":"2",
"keep_alive":"True",
"enable_debugging":"True",
"hadoop_version":"2.4.0",
"ami_version":"3.1.0",
"visible_to_all_users":"True"
}
jobid = conn.run_jobflow("test_cluster",job_parameters)
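Answer 0 (score: 0)
To get it to run, I had to pass the arguments individually as keyword arguments instead of as a dictionary. The following code works: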
# cluster_name, instance_groups, num_instances, hive_step and
# bootstrap_impala are defined elsewhere (not shown here).
jobid = conn.run_jobflow(
    cluster_name,
    log_uri="s3n://shadoop/logs/new-log",
    ec2_keyname="XXXXX",
    availability_zone="us-east-1e",
    instance_groups=instance_groups,
    num_instances=str(num_instances),
    keep_alive="True",
    enable_debugging="True",
    hadoop_version="2.4.0",
    ami_version="3.1.0",
    visible_to_all_users="True",
    steps=[hive_step],
    bootstrap_actions=[bootstrap_impala],
)
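
A likely reason the dictionary version failed: in boto 2, the second positional parameter of run_jobflow is log_uri, so conn.run_jobflow("test_cluster", job_parameters) would bind the whole dictionary to log_uri, which would explain the "Log Uri is not in the required format" error regardless of the URI inside it. If that is the cause, unpacking the dictionary with ** should also work. The sketch below uses a hypothetical stand-in function (not the real boto call, and with only a few of its parameters) just to show the difference in argument binding:

```python
# Hypothetical stand-in that mimics only the first few parameters of
# boto's run_jobflow(name, log_uri=None, ...). It is NOT the real API
# and makes no network calls; it just reports how arguments were bound.
def run_jobflow(name, log_uri=None, ec2_keyname=None, num_instances=1, **other):
    return {
        "name": name,
        "log_uri": log_uri,
        "ec2_keyname": ec2_keyname,
        "num_instances": num_instances,
    }


job_parameters = {
    "log_uri": "s3n://shadoop/logs/new-log",
    "ec2_keyname": "XXXXXX",
    "num_instances": 2,
}

# Positional: the entire dict lands in the log_uri parameter.
wrong = run_jobflow("test_cluster", job_parameters)

# Unpacked: each key is matched to the keyword argument of the same name.
right = run_jobflow("test_cluster", **job_parameters)

print(type(wrong["log_uri"]).__name__)
print(right["log_uri"])
```

With the real connection object, the equivalent call would be conn.run_jobflow("test_cluster", **job_parameters), assuming every key in the dictionary is a valid run_jobflow keyword.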