As stated in the title, when I execute my Hadoop program (and debug it in local mode), the following happens:
1. All 10 csv rows of my test data are processed correctly in the Mapper, in the Partitioner, and in the RawComparator (the OutputKeyComparatorClass) that is invoked after the map step. But afterwards the functions of the OutputValueGroupingComparatorClass and of the ReduceClass are never executed.
2. My application looks like this (due to space restrictions I have omitted the implementations of the classes I use as configuration parameters, until somebody has an idea that involves them):
public class RetweetApplication {

    public static int DEBUG = 1;
    static String INPUT = "/home/ema/INPUT-H";
    static String OUTPUT = "/home/ema/OUTPUT-H " + (new Date()).toString();

    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(RetweetApplication.class);

        if (DEBUG > 0) {
            conf.set("mapred.job.tracker", "local");
            conf.set("fs.default.name", "file:///");
            conf.set("dfs.replication", "1");
        }

        FileInputFormat.setInputPaths(conf, new Path(INPUT));
        FileOutputFormat.setOutputPath(conf, new Path(OUTPUT));

        //conf.setOutputKeyClass(Text.class);
        //conf.setOutputValueClass(Text.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);

        conf.setMapperClass(RetweetMapper.class);
        conf.setPartitionerClass(TweetPartitioner.class);
        conf.setOutputKeyComparatorClass(TwitterValueGroupingComparator.class);
        conf.setOutputValueGroupingComparator(TwitterKeyGroupingComparator.class);
        conf.setReducerClass(RetweetReducer.class);
        conf.setOutputFormat(TextOutputFormat.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
3. I get the following console output (sorry for the formatting, but somehow this log did not come out properly formatted):
12/05/22 03:51:05 INFO mapred.MapTask: io.sort.mb = 100
12/05/22 03:51:05 INFO mapred.MapTask: data buffer = 79691776/99614720
12/05/22 03:51:05 INFO mapred.MapTask: record buffer = 262144/327680
12/05/22 03:51:06 INFO mapred.JobClient: map 0% reduce 0%
12/05/22 03:51:11 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets:0+967
12/05/22 03:51:12 INFO mapred.JobClient: map 39% reduce 0%
12/05/22 03:51:14 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets:0+967
12/05/22 03:51:15 INFO mapred.MapTask: Starting flush of map output
12/05/22 03:51:15 INFO mapred.MapTask: Finished spill 0
12/05/22 03:51:15 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of committing
12/05/22 03:51:15 INFO mapred.JobClient: map 79% reduce 0%
12/05/22 03:51:17 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets:0+967
12/05/22 03:51:17 INFO mapred.LocalJobRunner: file:/home/ema/INPUT-H/tweets:0+967
12/05/22 03:51:17 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/05/22 03:51:17 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@35eed0
12/05/22 03:51:17 INFO mapred.ReduceTask: ShuffleRamManager: MemoryLimit=709551680, MaxSingleShuffleLimit=177387920
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging on-disk files
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread waiting: Thread for merging on-disk files
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging in memory files
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Need another 1 map output(s) where 0 is already in progress
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and 0 dup hosts)
12/05/22 03:51:17 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for polling Map Completion Events
12/05/22 03:51:18 INFO mapred.JobClient: map 100% reduce 0%
12/05/22 03:51:23 INFO mapred.LocalJobRunner: reduce > copy >
From this point on, the last two lines ('Need another 1 map output(s) where 0 is already in progress' and 'reduce > copy >') repeat endlessly.
4. After the mapper has seen every tuple, a lot of open processes/threads are still active:
RetweetApplication (1) [Remote Java Application]
OpenJDK Client VM[localhost:5002]
Thread [main] (Running)
Thread [Thread-2] (Running)
Daemon Thread [communication thread] (Running)
Thread [MapOutputCopier attempt_local_0001_r_000000_0.0] (Running)
Thread [MapOutputCopier attempt_local_0001_r_000000_0.1] (Running)
Thread [MapOutputCopier attempt_local_0001_r_000000_0.2] (Running)
Thread [MapOutputCopier attempt_local_0001_r_000000_0.4] (Running)
Thread [MapOutputCopier attempt_local_0001_r_000000_0.3] (Running)
Daemon Thread [Thread for merging on-disk files] (Running)
Daemon Thread [Thread for merging in memory files] (Running)
Daemon Thread [Thread for polling Map Completion Events] (Running)
Is there any reason why Hadoop expects more outputs from the mapper (see the repeating lines in the log above) than there are inputs in the input directory? As stated before, I verified in the debugger that all inputs are processed correctly in the mapper/partitioner/etc.
UPDATE
With Chris's help (see the comments) I found out that my program does not start in local mode as I expected: the isLocal variable in the ReduceTask class is set to false, although it should be true.
It is absolutely unclear to me why this happens, since the three options that have to be set to enable standalone mode are all correct. Most surprisingly, the local setting is ignored while the "read from normal disk" setting is not, which is very strange imho, because I thought local mode and the file:/// protocol were coupled.
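To see which values the job actually runs with, I could dump the effective configuration right before submitting it. This is only a debugging sketch: the property names are the ones from my driver above, and the isLocal test is my assumption of how Hadoop 1.x decides on the reduce side, not a verbatim copy of the ReduceTask source:

import org.apache.hadoop.mapred.JobConf;

public class LocalModeCheck {
    // Debugging sketch: dump the settings the job will actually run with.
    // Call this right before JobClient.runJob(conf) in the driver above.
    public static void dump(JobConf conf) {
        System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
        System.out.println("fs.default.name    = " + conf.get("fs.default.name"));
        // Roughly the test used to decide whether the reduce runs locally
        // (an assumption based on Hadoop 1.x behaviour, not the exact source line):
        boolean isLocal = "local".equals(conf.get("mapred.job.tracker", "local"));
        System.out.println("isLocal would be   = " + isLocal);
    }
}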
While debugging ReduceTask I set the isLocal variable to true by evaluating isLocal=true in the debug view and then tried to execute the rest of the program. It did not succeed; this is the stack trace:
12/05/22 14:28:28 INFO mapred.LocalJobRunner:
12/05/22 14:28:28 INFO mapred.Merger: Merging 1 sorted segments
12/05/22 14:28:28 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1956 bytes
12/05/22 14:28:28 INFO mapred.LocalJobRunner:
12/05/22 14:28:29 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: fs.default.name; Ignoring.
12/05/22 14:28:29 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
12/05/22 14:28:30 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 0 time(s).
12/05/22 14:28:31 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 1 time(s).
12/05/22 14:28:32 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 2 time(s).
12/05/22 14:28:33 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 3 time(s).
12/05/22 14:28:34 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 4 time(s).
12/05/22 14:28:35 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 5 time(s).
12/05/22 14:28:36 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 6 time(s).
12/05/22 14:28:37 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 7 time(s).
12/05/22 14:28:38 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 8 time(s).
12/05/22 14:28:39 INFO ipc.Client: Retrying connect to server: master/127.0.0.1:9001. Already tried 9 time(s).
12/05/22 14:28:39 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: fs.default.name; Ignoring.
12/05/22 14:28:39 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
12/05/22 14:28:39 WARN mapred.LocalJobRunner: job_local_0001
java.net.ConnectException: Call to master/127.0.0.1:9001 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
at org.apache.hadoop.ipc.Client.call(Client.java:1071)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:446)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
at org.apache.hadoop.ipc.Client.call(Client.java:1046)
... 17 more
12/05/22 14:28:39 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: fs.default.name; Ignoring.
12/05/22 14:28:39 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
12/05/22 14:28:39 INFO mapred.JobClient: Job complete: job_local_0001
12/05/22 14:28:39 INFO mapred.JobClient: Counters: 20
12/05/22 14:28:39 INFO mapred.JobClient: File Input Format Counters
12/05/22 14:28:39 INFO mapred.JobClient: Bytes Read=967
12/05/22 14:28:39 INFO mapred.JobClient: FileSystemCounters
12/05/22 14:28:39 INFO mapred.JobClient: FILE_BYTES_READ=14093
12/05/22 14:28:39 INFO mapred.JobClient: FILE_BYTES_WRITTEN=47859
12/05/22 14:28:39 INFO mapred.JobClient: Map-Reduce Framework
12/05/22 14:28:39 INFO mapred.JobClient: Map output materialized bytes=1960
12/05/22 14:28:39 INFO mapred.JobClient: Map input records=10
12/05/22 14:28:39 INFO mapred.JobClient: Reduce shuffle bytes=0
12/05/22 14:28:39 INFO mapred.JobClient: Spilled Records=10
12/05/22 14:28:39 INFO mapred.JobClient: Map output bytes=1934
12/05/22 14:28:39 INFO mapred.JobClient: Total committed heap usage (bytes)=115937280
12/05/22 14:28:39 INFO mapred.JobClient: CPU time spent (ms)=0
12/05/22 14:28:39 INFO mapred.JobClient: Map input bytes=967
12/05/22 14:28:39 INFO mapred.JobClient: SPLIT_RAW_BYTES=82
12/05/22 14:28:39 INFO mapred.JobClient: Combine input records=0
12/05/22 14:28:39 INFO mapred.JobClient: Reduce input records=0
12/05/22 14:28:39 INFO mapred.JobClient: Reduce input groups=0
12/05/22 14:28:39 INFO mapred.JobClient: Combine output records=0
12/05/22 14:28:39 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
12/05/22 14:28:39 INFO mapred.JobClient: Reduce output records=0
12/05/22 14:28:39 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
12/05/22 14:28:39 INFO mapred.JobClient: Map output records=10
12/05/22 14:28:39 INFO mapred.JobClient: Job Failed: NA
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
at uni.kassel.macek.rtprep.RetweetApplication.main(RetweetApplication.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Since this stack trace now tells me that port 9001 is used during execution, I suppose the xml configuration files somehow override the settings made in Java (which I use for testing). This is strange, because I have read over and over again on the internet that the Java settings override the xml configuration. If nobody knows how to correct this, I will try to simply erase all the configuration xmls. Maybe that solves the problem...
NEW UPDATE
Renaming Hadoop's conf folder solved the problem of the endlessly waiting copiers, and the program now runs through to the end. Sadly, the execution no longer waits for my debugger, although HADOOP_OPTS is set correctly.
Summary: it was just a configuration problem: the XML may (for some configuration parameters) override the Java settings. If anyone knows how to get debugging running again, that would be perfect, but for now I am just happy that I never see this stack trace again! ;)
Thanks, Chris, for your time and effort!
Answer 0 (score: 2)
Sorry I didn't spot this earlier, but it looks like you have two important configuration properties marked as final in your conf xml files, as shown by the following log statements:
12/05/22 14:28:29 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: fs.default.name; Ignoring.
12/05/22 14:28:29 WARN conf.Configuration: file:/tmp/hadoop-ema/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
This means your job cannot actually run in local mode: it starts up in local mode, but the reducer reads the serialized job configuration, determines that it is not in local mode, and tries to fetch the map outputs via the task trackers.
You say your fix was to rename the conf folder - this drops Hadoop back to its default configuration, in which these two properties are not marked as 'final'.
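For reference, a final property in one of the site files would look roughly like the following. This is a hypothetical snippet (your actual conf file may differ); the <final>true</final> flag is what makes the later JobConf.set("mapred.job.tracker", "local") in your driver get ignored:

<!-- hypothetical entry in conf/mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
  <final>true</final>
</property>

Removing the <final> flag (or the whole entry) from the site file should let the values you set programmatically win again.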