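I wrote a Hadoop 1.0.4 application that runs fine locally in pseudo-distributed mode. I have also installed Cloudera Hadoop 4 (CDH4) on my cluster. I assumed CDH4 runs Hadoop 1.0.4, since that version is listed as stable on the Hadoop site, but that does not appear to be the case. When I run the application on my cluster, I get the following error: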
12/11/27 16:14:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/11/27 16:14:38 INFO input.FileInputFormat: Total input paths to process : 16
12/11/27 16:14:39 INFO mapred.JobClient: Running job: job_201211271520_0004
12/11/27 16:14:40 INFO mapred.JobClient: map 0% reduce 0%
12/11/27 16:14:50 INFO mapred.JobClient: Task Id : attempt_201211271520_0004_m_000013_0, Status : FAILED
Error: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
12/11/27 16:14:50 INFO mapred.JobClient: Task Id : attempt_201211271520_0004_m_000000_0, Status : FAILED
... and so on...
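Am I right in thinking this is because CDH4 is not compatible with Hadoop 1.0.4? If so, does anyone know which version is compatible with Hadoop 1.0.4? I would rather switch Cloudera software than rewrite my application.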
Answer 0 (score: 3)
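You are right; CDH3 uses version 0.20.2, and CDH4 uses version 2.0.0. The naming of Hadoop versions is a mess, and I won't pretend to understand it. However, based on the following passage from this blog post by Cloudera, it looks like you may be able to use CDH3: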
"The CDH3 distribution incorporated the 0.20.2 Apache Hadoop release plus the features of the 0.20.append and 0.20.security branches that collectively are now known as “1.0.” The Apache Hadoop in CDH3 has been the equivalent of the recently announced Apache Hadoop 1.0 for approximately a year now."
If that is the case, I would try CDH3. If that doesn't work, you may just have to look for something other than a Cloudera installation.
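The error in the question ("Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected") suggests the job was compiled against jars where `TaskAttemptContext` is a concrete class, as in Hadoop 1.x, and then run against jars where it is an interface, as in the 2.x line. One way to check which flavor a given machine has on its classpath is a small reflection probe. The following is only a rough sketch: the class name `ApiShapeCheck` and the method `describe` are made up for illustration, and it assumes you run it with the cluster's Hadoop jars on the classpath.

```java
// Hypothetical helper, not part of Hadoop or CDH: reports whether a type
// visible on the current classpath is a class or an interface.
public class ApiShapeCheck {

    // Returns "interface", "class", or "not found" for the given type name.
    static String describe(String className) {
        try {
            // Load without running static initializers; we only need the shape.
            Class<?> c = Class.forName(className, false,
                    ApiShapeCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        } catch (LinkageError e) {
            // The type exists but a dependency is missing or incompatible.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the error message from the question.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

If the probe reports "interface" on the cluster and "class" on the development machine, the two environments are on different API lines, which matches the failure shown in the task logs. A result of "not found" most likely means the Hadoop jars were not on the classpath when the probe ran.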