Question

I am writing map and reduce functions in C# .NET. I basically followed the example given here.

Final command

Hadoop jar hadoop-streaming.jar -files "hdfs:///example/apps/map.exe,hdfs:///example/apps/reduce.exe" -input "/example/apps/data.csv" -output "/example/apps/output.txt" -mapper "map.exe" -reducer "reduce.exe"

The job ran successfully.

Now, from the Interactive JS mode, if I write

js> #cat /example/apps/output.txt

cat: File does not exist: /example/apps/output.txt

Whereas:

js> #ls /example/apps/output.txt

Found 3 items

-rw-r--r-- 3 xxxx supergroup 0 2013-02-22 10:23 /example/apps/output.txt/_SUCCESS

drwxr-xr-x - xxxx supergroup 0 2013-02-22 10:22 /example/apps/output.txt/_logs

-rw-r--r-- 3 xxxx supergroup 0 2013-02-22 10:23 /example/apps/output.txt/part-00000

What mistake am I making, and how can I see the output?

Answer 1

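The -output flag specifies an output folder, not a file. Because a job can have multiple reducers, each reducer writes its own file into this folder.

In this case you have one reducer, and it produced one file: part-00000. If there were more reducers, their files would be named part-00001, part-00002, and so on.

The command cat /example/apps/output.txt/part-00000 will display your output. In future, please don't name the output folder something.txt, because that will confuse you and others :)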
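If the job ever runs with more than one reducer, you will want to read all the part files together. The following is only a minimal sketch of that idea in Python, and it assumes the output folder has first been copied to the local filesystem (for example with hadoop fs -get). The helper name read_job_output is made up for illustration and is not part of Hadoop.

```python
from pathlib import Path


def read_job_output(output_dir):
    """Concatenate every part-* file in a job output folder.

    Hypothetical helper: assumes output_dir is a LOCAL copy of the
    folder that was passed to -output, not a path inside HDFS.
    """
    folder = Path(output_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{output_dir} is not a folder of part files")
    # Sorting by name yields part-00000, part-00001, ... in reducer order.
    # _SUCCESS and _logs do not match the pattern, so they are skipped.
    parts = sorted(p for p in folder.glob("part-*") if p.is_file())
    return "".join(p.read_text(encoding="utf-8") for p in parts)
```

Calling read_job_output("output.txt") on such a local copy would return the text of part-00000 followed by any further part files, which is roughly what you were expecting cat on the folder name to show.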
