s3distcp fails to copy from HDFS to S3

Date: 2017-06-27 15:50:31

Tags: amazon-web-services amazon-s3 emr amazon-emr s3distcp

I am trying to copy a CSV file from HDFS to S3, but the job fails with these errors:

Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Exception in thread "main" java.lang.RuntimeException: Error running job
Caused by: java.io.IOException: Job failed!

I have tried increasing the memory and setting the number of workers to 1. My arguments are as follows:

-D s3DistCp.copyfiles.mapper.numWorkers=1 -D mapred.child.java.opts=-Xmx1024m --src=hdfs:///output/data.csv/ --dest=s3://<bucket>/<directory>/data.csv/
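For reference, here is a minimal sketch of how those arguments could be assembled into one command line. It assumes they are passed to the `s3-dist-cp` command on the EMR master node, which the post does not state (they could equally be the arguments of an EMR step). The `SRC`, `DEST` and `CMD` variable names are only illustrative, and `<bucket>` and `<directory>` remain placeholders.

```shell
#!/bin/sh
# Source and destination exactly as given in the arguments above.
# <bucket> and <directory> are placeholders, not real names.
SRC="hdfs:///output/data.csv/"
DEST="s3://<bucket>/<directory>/data.csv/"

# Assumption: the arguments are handed to the s3-dist-cp command-line tool.
CMD="s3-dist-cp -D s3DistCp.copyfiles.mapper.numWorkers=1 -D mapred.child.java.opts=-Xmx1024m --src=${SRC} --dest=${DEST}"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

The script only prints the command, so it can be checked anywhere; actually running the printed line requires an EMR cluster and real bucket and directory names.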

I have also made sure that the EMR role has full S3 access. Any suggestions on how to resolve this error?

0 answers:

No answers