Hello, I am trying to move my data from a cluster running CDH4.3 to a cluster running CDH4.5. I am running the following command:
hadoop distcp -update hftp://server1:50070/hbase/test/x hdfs://server2:8020/copy/
After running it, I get the following error:
14/01/28 19:42:43 INFO tools.DistCp: srcPaths=[hftp://server1:50070/hbase/test/x]
14/01/28 19:42:43 INFO tools.DistCp: destPath=hdfs://server2:8020/copy
14/01/28 19:42:45 INFO tools.DistCp: sourcePathsCount=1
14/01/28 19:42:45 INFO tools.DistCp: filesToCopyCount=1
14/01/28 19:42:45 INFO tools.DistCp: bytesToCopyCount=1
14/01/28 19:42:46 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/28 19:42:47 INFO mapred.JobClient: Running job: job_201401101918_0008
14/01/28 19:42:48 INFO mapred.JobClient: map 0% reduce 0%
14/01/28 19:43:05 INFO mapred.JobClient: map 100% reduce 0%
14/01/28 19:43:07 INFO mapred.JobClient: Task Id : attempt_201401101918_0008_m_000000_0, Status : FAILED
14/01/28 19:43:08 INFO mapred.JobClient: map 0% reduce 0%
14/01/28 19:43:19 INFO mapred.JobClient: map 100% reduce 0%
14/01/28 19:43:22 INFO mapred.JobClient: Task Id : attempt_201401101918_0008_m_000000_1, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
14/01/28 19:43:23 INFO mapred.JobClient: map 0% reduce 0%
14/01/28 19:43:33 INFO mapred.JobClient: map 100% reduce 0%
14/01/28 19:43:35 INFO mapred.JobClient: Task Id : attempt_201401101918_0008_m_000000_2, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
14/01/28 19:43:36 INFO mapred.JobClient: map 0% reduce 0%
14/01/28 19:43:46 INFO mapred.JobClient: map 100% reduce 0%
14/01/28 19:43:50 INFO mapred.JobClient: map 0% reduce 0%
14/01/28 19:43:53 INFO mapred.JobClient: Job complete: job_201401101918_0008
14/01/28 19:43:53 INFO mapred.JobClient: Counters: 6
14/01/28 19:43:53 INFO mapred.JobClient: Job Counters
14/01/28 19:43:53 INFO mapred.JobClient: Failed map tasks=1
14/01/28 19:43:53 INFO mapred.JobClient: Launched map tasks=4
14/01/28 19:43:53 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=64095
14/01/28 19:43:53 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
14/01/28 19:43:53 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/01/28 19:43:53 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/01/28 19:43:53 INFO mapred.JobClient: Job Failed: NA
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1388)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:667)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
You have new mail in /var/spool/mail/root
[hdfs@sdl1039 root]$ hadoop distcp -update hftp://server1:50070/hbase/test/x hdfs://server2:8020/copy hadoop distcp -update hftp://server1:50070/hbase/test/x hdfs://server2:8020/copy
14/01/28 19:46:09 INFO tools.DistCp: srcPaths=[hftp://server1:50070/hbase/test/x, hdfs://server2:8020/copy, hadoop, distcp, hftp://server1:50070/hbase/test/x]
14/01/28 19:46:09 INFO tools.DistCp: destPath=hdfs://server2:8020/copy
With failures, global counters are inaccurate; consider running with -i
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source hadoop does not exist.
Input source distcp does not exist.
at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:641)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Please tell me where I went wrong.
Answer 0: (score: 1)
I have a working solution now:
hadoop distcp -update hdfs://server1:8020/hbase/test/x hdfs://server2:8020/copy/
But I would really like to know why the hftp source did not work for me.
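For anyone scripting this, here is a minimal sketch that wraps the command above in a small shell helper. The function name `build_distcp_cmd` and the `RUN_DISTCP` switch are my own illustrative names, not part of Hadoop, and port 8020 is simply taken from the command above. By default the script only prints the command, so it can be checked before anything is copied.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop tool): compose the hdfs-to-hdfs distcp
# command from this answer. Port 8020 is assumed to be the NameNode RPC port
# on both clusters, as in the command above.
build_distcp_cmd() {
    src_nn="$1"     # source NameNode host, e.g. server1
    dst_nn="$2"     # destination NameNode host, e.g. server2
    src_path="$3"   # absolute path on the source cluster
    dst_path="$4"   # absolute path on the destination cluster
    printf 'hadoop distcp -update hdfs://%s:8020%s hdfs://%s:8020%s\n' \
        "$src_nn" "$src_path" "$dst_nn" "$dst_path"
}

cmd=$(build_distcp_cmd server1 server2 /hbase/test/x /copy/)
echo "$cmd"

# Run the copy only when explicitly requested; otherwise this is a dry run.
# Unquoted expansion is intentional and assumes the paths contain no spaces.
if [ "${RUN_DISTCP:-0}" = "1" ]; then
    $cmd
fi
```

Run it as is to see the command, or with `RUN_DISTCP=1` on a machine that has the `hadoop` client configured to actually start the copy.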
Answer 1: (score: -1)
I think the port number for hftp is wrong. 50070 is the default port for the NameNode web UI.
Try:
hadoop distcp -update hftp://server1/hbase/test/x hdfs://server2:8020/copy/
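To make the port question concrete, here is a small sketch. It assumes, as far as I know, that hftp talks to the NameNode HTTP port and that this port defaults to 50070; please verify this against `dfs.http.address` (or `dfs.namenode.http-address`) on the source cluster. `with_default_port` is a hypothetical helper written for this example, not a Hadoop command. It only shows which endpoint a port-less hftp URI would resolve to under that assumption.

```shell
#!/bin/sh
# Hypothetical helper: print an hftp URI with an explicit port.
# ASSUMPTION: when no port is given, hftp falls back to the NameNode HTTP
# port, taken here to be 50070 (override with the second argument).
with_default_port() {
    uri="$1"
    default_port="${2:-50070}"
    rest="${uri#hftp://}"         # drop the scheme
    authority="${rest%%/*}"       # host or host:port
    path="${rest#"$authority"}"   # remainder, starting with "/"
    case "$authority" in
        *:*) printf 'hftp://%s%s\n' "$authority" "$path" ;;
        *)   printf 'hftp://%s:%s%s\n' "$authority" "$default_port" "$path" ;;
    esac
}

# Compare the URI from the question with the one suggested in this answer.
with_default_port hftp://server1:50070/hbase/test/x
with_default_port hftp://server1/hbase/test/x
```

If both lines print the same URI on your setup, the port is probably not what makes the copy fail, and the task log of one of the failed map attempts is the next place to look.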