I am writing a simple Hadoop sort example and have the following code.
I am using the built-in InverseMapper and IdentityReducer.
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
conf.setInputFormat(TextInputFormat.class);
conf.setOutputKeyClass(LongWritable.class);
conf.setOutputValueClass(LongWritable.class);
conf.setMapOutputKeyClass(LongWritable.class);
conf.setMapOutputValueClass(LongWritable.class);
conf.setMapperClass(InverseMapper.class);
conf.setReducerClass(IdentityReducer.class);
conf.setNumReduceTasks(1);
This is my input file data:
432
532
5234
43
65
524
15
56
96
25
3251
369845
58
249
354
When I run this code, I get the following error. Can anyone help?
15/10/28 15:25:09 INFO mapred.LocalJobRunner: map task executor complete.
15/10/28 15:25:09 WARN mapred.LocalJobRunner: job_local2001686703_0001
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:607)
at org.apache.hadoop.mapred.lib.InverseMapper.map(InverseMapper.java:42)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Answer 0 (score: 1)
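I think the problem is that TextInputFormat extends FileInputFormat<LongWritable, Text>, so the mapper is not receiving records in the format you expect. Each record arrives as a (LongWritable, Text) pair, and InverseMapper swaps key and value, so it emits a Text key while the job declares LongWritable as the map output key class. That is the type mismatch reported in the stack trace.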
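From the documentation:
An InputFormat for plain text files. Files are broken into lines. Either linefeed or carriage-return are used to signal end of line. Keys are the position in the file, and values are the line of text.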
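To make the record flow concrete, here is a minimal plain-Java sketch with no Hadoop dependency. It only imitates the behaviour described above: `readLines` stands in for TextInputFormat, `inverse` stands in for InverseMapper, and `parseToNumericKey` is a hypothetical replacement mapper step. All three names are made up for illustration, and `String`/`Long` stand in for `Text`/`LongWritable`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RecordFlowSketch {

    // Imitates TextInputFormat: key = position of the line in the file,
    // value = the line text. Assumes '\n' line endings and single-byte characters.
    static List<Map.Entry<Long, String>> readLines(String fileContents) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            records.add(new SimpleEntry<>(offset, line));
            offset += line.length() + 1; // +1 for the newline character
        }
        return records;
    }

    // Imitates InverseMapper: swaps key and value, so the output key is the
    // line text (a Text in Hadoop), not a number.
    static Map.Entry<String, Long> inverse(Map.Entry<Long, String> record) {
        return new SimpleEntry<>(record.getValue(), record.getKey());
    }

    // Hypothetical replacement step: parse the line into a numeric key so the
    // output key type matches what the job declares (LongWritable in Hadoop).
    static Map.Entry<Long, Long> parseToNumericKey(Map.Entry<Long, String> record) {
        long number = Long.parseLong(record.getValue().trim());
        return new SimpleEntry<>(number, record.getKey());
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> record : readLines("432\n532\n5234\n")) {
            System.out.println("input:   " + record);
            System.out.println("inverse: " + inverse(record) + " (key is text)");
            System.out.println("parsed:  " + parseToNumericKey(record) + " (key is numeric)");
        }
    }
}
```

In a real job, one way to apply the same idea would be a custom mapper typed as Mapper<LongWritable, Text, LongWritable, LongWritable> that parses the Text value into a LongWritable key, in place of InverseMapper. Another option would be to keep InverseMapper and declare Text as the map output key class, but Text keys sort lexicographically rather than numerically, which is probably not what a numeric sort wants.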