Question

I am trying to run the basic WordCount example for Apache MapReduce 2.7 from here:

https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0

I placed the input file in /user/hadoopLearning/WordCount/input/ and the output path is /user/hadoopLearning/WordCount/output/

Then I ran the following command:

 hadoop jar wc.jar WordCount /user/hadoopLearning/WordCount/input/file01  /user/hadoopLearning/WordCount/output

But when it runs I get the following error:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: **Output directory** hdfs://sandbox.hortonworks.com:8020/user/hadoopLearning/WordCount/**input**/file01 already exists

I have not written any code myself; I copied everything above from the Apache website.

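I understand the error, but if you look at it closely, it says the output directory already exists, yet the path it gives in the stack trace is the input path.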

Can anyone help me? I am a beginner with Hadoop. Thanks in advance.

Answer 1

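You are trying to create an output path that already exists.

Hadoop does not allow this: the job fails instead of overwriting an existing output directory.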

Replace your output path ('/user/hadoopLearning/WordCount/output') with something else.

Try this command:

       hadoop jar wc.jar WordCount /user/hadoopLearning/WordCount/input/file01  /user/hadoopLearning/WordCount/new_output_path
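
If you would rather not think up a new name for every run, something along these lines might help. It is only a rough sketch: the timestamp suffix and the variable names INPUT, OUT_BASE and OUT_DIR are my own choices for illustration, not anything from the tutorial. It checks with hdfs dfs -test -e whether the path exists before submitting the job, and only prints the command if no Hadoop client is installed.

```shell
#!/bin/sh
# Sketch: pick an output directory that does not exist yet, then submit the job.
# The paths come from the question; the timestamp suffix is an assumption for illustration.
INPUT=/user/hadoopLearning/WordCount/input/file01
OUT_BASE=/user/hadoopLearning/WordCount
OUT_DIR="$OUT_BASE/output_$(date +%Y%m%d_%H%M%S)"

if ! command -v hdfs >/dev/null 2>&1; then
    # No Hadoop client on this machine: only print the command that would run
    echo "hadoop jar wc.jar WordCount $INPUT $OUT_DIR"
elif hdfs dfs -test -e "$OUT_DIR"; then
    # -test -e exits with 0 when the path already exists
    echo "Output path $OUT_DIR already exists, choose another one" >&2
else
    hadoop jar wc.jar WordCount "$INPUT" "$OUT_DIR"
fi
```

To reuse the old output path instead, you can delete it first with hdfs dfs -rm -r /user/hadoopLearning/WordCount/output, but make sure nothing you still need is stored there.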
