Even a very basic Hadoop Streaming job submitted from my AWS edge node:
hadoop jar /usr/share/hadoop/contrib/streaming/hadoop-streaming-1.0.4.jar \
-D mapred.job.name=MarksClusterTest \
-file /mnt/home/mpundurs/Clustering/passthrough.py \
-mapper /mnt/home/mpundurs/Clustering/passthrough.py \
-file /mnt/home/mpundurs/Clustering/passthrough.py \
-reducer /mnt/home/mpundurs/Clustering/passthrough.py \
-input /user/mpundurs/Clustering/input.csv \
-output /user/mpundurs/Clustering/output
sends only the following to my screen before returning me to the command prompt:
packageJobJar: [/mnt/home/mpundurs/Clustering/passthrough.py,
/mnt/home/mpundurs/Clustering/passthrough.py,
/mnt/tmp/hadoop-mpundurs/hadoop-unjar523304178423152265/] []
/tmp/streamjob8344547788966317309.jar tmpDir=null
There is no record at http://HeadNodeIP:9100/jobtracker.jsp of any map-reduce job having been submitted.
Does Hadoop Streaming itself (as opposed to the jobs it submits, or fails to submit) create a log? If so, where?
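
For context, passthrough.py is used as both the mapper and the reducer. The actual file is not shown above, so the following is only a minimal sketch of what such an identity script might look like, assuming it simply copies standard input to standard output unchanged (the function name `passthrough` is hypothetical):

```python
#!/usr/bin/env python
# Hypothetical sketch of an identity mapper/reducer for Hadoop Streaming.
# Assumption: the real passthrough.py just echoes its input line by line.
import sys


def passthrough(lines, out):
    """Write every input line to `out` unchanged."""
    for line in lines:
        out.write(line)


if __name__ == "__main__":
    # Hadoop Streaming feeds records on stdin and reads results from stdout.
    passthrough(sys.stdin, sys.stdout)
```

A script like this can be checked outside Hadoop with `cat input.csv | python passthrough.py`, which helps rule out the script itself as the reason no job appears.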