I have two pairs of Map/Reduce classes, MyMapper1/MyReducer1 and MyMapper2/MyReducer2, and I want to use the output of MyReducer1 as the input of MyMapper2 by setting the input path of job2 to the output path of job1.
The types are as follows:
public class MyMapper1 extends Mapper<LongWritable, Text, IntWritable, IntArrayWritable>
public class MyReducer1 extends Reducer<IntWritable, IntArrayWritable, IntWritable, IntArrayWritable>
public class MyMapper2 extends Mapper<IntWritable, IntArrayWritable, IntWritable, IntArrayWritable>
public class MyReducer2 extends Reducer<IntWritable, IntArrayWritable, IntWritable, IntWritable>
public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }
}
The code that sets the input/output paths is:
Path temppath = new Path("temp-dir-" + temp_time);
FileOutputFormat.setOutputPath(job1, temppath);
...........
FileInputFormat.addInputPath(job2, temppath);
The code that sets the input/output formats is:
job1.setOutputFormatClass(TextOutputFormat.class);
..........
job2.setInputFormatClass(KeyValueTextInputFormat.class);
However, I always get this exception when running job2:
11/04/16 12:34:09 WARN mapred.LocalJobRunner: job_local_0002
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
at ligon.MyMapper2.map(MyMapper2.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
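I tried changing the InputFormat and OutputFormat, but without success: a similar (though not identical) exception occurred in job2.
My complete code package is at: http://dl.dropbox.com/u/7361939/HW2_Q1.zip
Thanks a lot!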
Answer 0: (score: 0)
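The problem is that in job 2, KeyValueTextInputFormat produces key-value pairs of type <Text, Text>, while you are trying to process them with a Mapper that accepts <IntWritable, IntArrayWritable>, which causes the ClassCastException. Your best bet is to change the mapper to accept <Text, Text> and convert from Text to integers.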
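A minimal sketch of that Text-to-integer conversion, written as plain Java so it runs without Hadoop on the classpath. The class name TextRecordParser and its methods are hypothetical, and the sketch assumes that IntArrayWritable overrides toString() to print its elements separated by spaces, which the class shown in the question does not define itself. Inside a Mapper<Text, Text, IntWritable, IntArrayWritable> you would call these helpers on key.toString() and value.toString() and wrap the results in IntWritable and IntArrayWritable.

```java
import java.util.Arrays;

// Hypothetical helper: parses the two strings that KeyValueTextInputFormat
// hands to the mapper (the text before and after the first tab of a line).
class TextRecordParser {

    // Parses the key column, e.g. "7" -> 7.
    static int parseKey(String keyText) {
        return Integer.parseInt(keyText.trim());
    }

    // Parses the value column, ASSUMED to hold whitespace-separated integers,
    // e.g. "1 2 3" -> [1, 2, 3]. A blank column yields an empty array.
    static int[] parseValues(String valueText) {
        String trimmed = valueText.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        String[] parts = trimmed.split("\\s+");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseKey("7"));
        System.out.println(Arrays.toString(parseValues("1 2 3")));
    }
}
```

Malformed input makes Integer.parseInt throw NumberFormatException; in a real job you would probably want to catch it and skip or count the bad record.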
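Answer 1: (score: 0)
I ran into the same problem and found a solution a while ago. Since you use IntArrayWritable as the reducer output, it is easier to write the data as a binary file and read it back later.
First job: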
job1.setOutputFormatClass(SequenceFileOutputFormat.class);
job1.setOutputKeyClass(IntWritable.class);
job1.setOutputValueClass(IntArrayWritable.class);
Second job:
job2.setInputFormatClass(SequenceFileInputFormat.class);
This should work in your case.
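To illustrate why the binary route avoids the cast problem, here is a rough plain-Java sketch of the round trip. It mimics the layout ArrayWritable uses when serializing IntWritable elements (an element count followed by each 4-byte integer) using java.io streams. The class IntArrayRoundTrip and its methods are hypothetical stand-ins, not Hadoop code, and the SequenceFile container itself is not modeled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for the write/readFields pair of a Writable.
class IntArrayRoundTrip {

    // Writes the element count, then each element, as 4-byte big-endian ints.
    static byte[] write(int[] values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the count first, then exactly that many ints; no text parsing needed.
    static int[] read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int[] values = new int[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] restored = read(write(new int[] {4, 8, 15}));
        System.out.println(Arrays.toString(restored));
    }
}
```

Because the reader gets typed values back directly, the second mapper can keep its IntWritable and IntArrayWritable input types, whereas the text formats always hand it Text.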