Running the EMR example gives a 301 error

Time: 2016-08-02 01:51:27

Tags: amazon-web-services hadoop emr amazon-emr

I am trying to run the sample hadoop-streaming command:

hadoop-streaming -files streamingCode/wordSplitter.py \
-mapper wordSplitter.py \
-input s3://elasticmapreduce/samples/wordcount/input \
-output streamingCode/wordCountOut \
-reducer aggregate

But I keep getting this error:

Exception in thread "main" com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Moved Permanently (Service: Amazon S3; Status Code: 301; Error Code: 301 Moved Permanently; Request ID: 98038E504E150CEC), S3 Extended Request ID: IW1x5otBSepAnPgW/RKELCUI9dhADQvrXqU2Ase1CLIa0SWDFnBbTscXihrvHvNm2ZRxjjSJZ1Q=

I think this is because my cluster is in us-west-2, but I can't figure out how to format the S3 URL correctly (or whether that is even the problem).

Edit: After changing the input to the following URL:

s3://s3-us-west-2.amazonaws.com/elasticmapreduce/samples/wordcount/input

I now get the following error:

Exception in thread "main" com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: BC8DB415C780DF84), S3 Extended Request ID: sx8W/+gvND2ssqQce9ZQsZTiqxmSJYZs8OiXgrjwL3dm0JRPaC7ceHor+yrHsPuKTjM2LUwkRAw=

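Edit: I have confirmed that the error really is caused by my cluster being in us-west-2. I created a cluster in us-east-1 and the same command worked. So the question is: how do I access an S3 bucket from a different region? Is that even possible?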

1 answer:

Answer 0: (score: 1)

Amazon changed the default behavior starting with emr-4.7.0, which caused this same error for us when we upgraded our EMR release.

The fix is simple. Add this setting to core-site: fs.s3n.endpoint = s3.amazonaws.com
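For reference, here is a sketch of how that setting could be supplied when the cluster is created, assuming an EMR 4.x or later release where settings are passed as a list of configuration classifications. Only the property name and value come from the answer above. Where you save the JSON and how you pass it to the cluster (for example the Configurations field in the console, or a file given to the CLI at cluster creation) depends on your setup.

```json
[
  {
    "Classification": "core-site",
    "Properties": {
      "fs.s3n.endpoint": "s3.amazonaws.com"
    }
  }
]
```

This only applies to clusters launched with the configuration. An already running cluster would need core-site.xml edited on its nodes instead, and I have not verified that route here.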