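I have defined 2 mappers, each of which processes one of 2 input files.
Output of the first mapper - Output of the second mapper -
In the driver code, I register both mappers with addInputPath() of the MultipleInputs class.
When I run the jar, I get a type mismatch error.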
16/04/24 18:40:28 INFO mapreduce.Job: Task Id : attempt_1461435780053_0008_m_000001_0, Status : FAILED
Error: java.io.IOException: Type mismatch in value from map: expected hadoop.StationObj, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1077)
Here is the code:
    public static class customerMapper extends Mapper<LongWritable, Text, IntWritable, StationObj>
    {
        IntWritable outkey = new IntWritable();
        StationObj outvalue = new StationObj();

        // 2,Russia,Jhonson,10000
        public void map(LongWritable key, Text values, Context context) throws IOException, InterruptedException
        {
            String[] cols = values.toString().split(",");
            outkey.set(Integer.parseInt(cols[0]));
            outvalue.setAmount(Integer.parseInt(cols[3]));
            outvalue.setCountry(cols[1]);
            outvalue.setProduct(cols[2]);
            context.write(outkey, outvalue);
        }
    }

    public static class countryMapper extends Mapper<LongWritable, Text, IntWritable, Text>
    {
        IntWritable outkey = new IntWritable();
        Text outvalue = new Text();

        public void map(LongWritable key, Text values, Context context) throws IOException, InterruptedException
        {
            String[] cols = values.toString().split(",");
            outkey.set(Integer.parseInt(cols[0]));
            outvalue.set(cols[1]);
            context.write(outkey, outvalue);
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "dsddd");
        job.setJarByClass(stationRedJoin.class);
        job.setMapOutputKeyClass(IntWritable.class);
        //job.setMaxMapAttempts(1);
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, customerMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, countryMapper.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 1 : 0);
    }
}
Answer 0 (score: 0)
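It is best to declare all of the output types for the mapper and the reducer (if there is one) explicitly in the driver class. A job has only one map output key class and one map output value class, so every mapper registered through MultipleInputs has to emit those same types. For example: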
// output types for the mapper
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class); // <--- not set in the question's driver
// output types for the reducer (if there is one)
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
// use MultipleInputs to give each input path its own input format and mapper class
MultipleInputs.addInputPath(job, fPath, FirstInputFormat.class, MyFirstMap.class);
MultipleInputs.addInputPath(job, sPath, SecondInputFormat.class, MySecondMap.class);
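
Because the two mappers in the question emit different value types (StationObj and Text), setting the value class alone will still fail for one of them. One common way around this, sketched below without any Hadoop dependency, is to have both mappers emit the same type (for example Text) and prefix each value with a tag naming its source, so the reducer can tell the records apart. The class name TaggedJoinSketch, the tags "C" and "N", the "|" separator, and the sample country line "2,Europe" are all assumptions made for illustration; they do not come from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hadoop-free sketch of the "one value type, tagged records" idea.
// In a real job both mappers would emit Text values built this way, and
// job.setMapOutputValueClass(Text.class) would then match both of them.
class TaggedJoinSketch {

    // Hypothetical tags marking which input file a value came from.
    static final String CUSTOMER_TAG = "C";
    static final String COUNTRY_TAG = "N";

    // Join key shared by both inputs: the first comma-separated column.
    static int keyOf(String line) {
        return Integer.parseInt(line.split(",")[0].trim());
    }

    // customerMapper analogue: "2,Russia,Jhonson,10000" -> "C|Russia|Jhonson|10000"
    static String tagCustomer(String line) {
        String[] cols = line.split(",");
        return CUSTOMER_TAG + "|" + cols[1] + "|" + cols[2] + "|" + cols[3];
    }

    // countryMapper analogue: "2,Europe" -> "N|Europe"
    static String tagCountry(String line) {
        String[] cols = line.split(",");
        return COUNTRY_TAG + "|" + cols[1];
    }

    // Reducer analogue: receives every tagged value for one key and appends
    // the country-file value to each customer record (inner join).
    static List<String> join(List<String> values) {
        String country = null;
        List<String> customers = new ArrayList<>();
        for (String value : values) {
            String[] parts = value.split("\\|", 2); // tag, then payload
            if (COUNTRY_TAG.equals(parts[0])) {
                country = parts[1];
            } else {
                customers.add(parts[1]);
            }
        }
        List<String> joined = new ArrayList<>();
        if (country == null) {
            return joined; // no matching record in the country file
        }
        for (String customer : customers) {
            joined.add(customer + "|" + country);
        }
        return joined;
    }

    public static void main(String[] args) {
        String customerLine = "2,Russia,Jhonson,10000"; // sample from the question
        String countryLine = "2,Europe";                // hypothetical sample
        List<String> valuesForKey2 = Arrays.asList(
                tagCountry(countryLine), tagCustomer(customerLine));
        System.out.println(keyOf(customerLine) + " -> " + join(valuesForKey2));
    }
}
```

An alternative with the same effect would be a single custom Writable that wraps either record kind; the tagged Text approach is shown here only because it needs no extra class.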