My Map class is:
public static class MapClass extends Mapper<LongWritable, Text, Text, LongWritable> {
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // your map code goes here
        String[] fields = value.toString().split(",");
        String year = fields[1];
        String claims = fields[8];
        if (claims.length() > 0 && (!claims.startsWith("\""))) {
            context.write(new Text(year), new LongWritable(Long.parseLong(claims)));
        }
    }
}
My Reduce class looks like:
public static class Reduce extends Reducer<Text, LongWritable, Text, Text> {
    public void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
        // your reduce function goes here
        context.write(key, new Text("hello"));
    }
}
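The reduce body above is only a placeholder. The aggregation it would typically perform on the mapper's (year, claims) pairs, summing the claim counts per year, can be sketched in plain Java independent of Hadoop (a hypothetical completion for illustration, not code from the question):

```java
import java.util.List;

public class SumDemo {
    // Sum the values grouped under one key, as a reduce over
    // (year, claims) pairs would do for a single year
    static long sum(Iterable<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three hypothetical claim counts for one year
        System.out.println(sum(List.of(3L, 5L, 7L))); // prints 15
    }
}
```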
The dataset looks like:
3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,,
3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,,
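For reference, applying the mapper's split(",") to one of these records shows which fields the code reads: fields[1] is the year, and fields[8] (the claims field) happens to be empty in both sample rows, which is exactly what the mapper's length() > 0 guard filters out:

```java
public class FieldDemo {
    public static void main(String[] args) {
        // One record from the dataset shown in the question
        String line = "3070801,1963,1096,,\"BE\",\"\",,1,,269,6,69,,1,,0,,,,,,,";
        String[] fields = line.split(",");
        System.out.println(fields[1]);           // prints 1963 (the year)
        System.out.println(fields[8].isEmpty()); // prints true: the claims field is
                                                 // empty here, so the mapper skips it
    }
}
```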
When I run the program with this configuration:
Job job = new Job();
job.setJarByClass(TopKRecords.class);
job.setMapperClass(MapClass.class);
job.setReducerClass(Reduce.class);
FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setJobName("TopKRecords");
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
I see this error:
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1019)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at com.hadoop.programs.TopKRecords$MapClass.map(TopKRecords.java:35)
    at com.hadoop.programs.TopKRecords$MapClass.map(TopKRecords.java:26)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
What is going wrong here? I don't see any reason for a mismatch, given:
Mapper<LongWritable, Text, Text, LongWritable>
Reducer<Text, LongWritable, Text, Text>
Update: things started working after I set the following:
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(LongWritable.class);
Answer 0 (score: 1):
You also need the following line in your job setup:
job.setMapOutputValueClass(LongWritable.class);
From the Hadoop 0.20.2 Javadoc:
This allows the user to specify the map output value class to be different than the final output value class.
For clarity, you can also add:
job.setMapOutputKeyClass(Text.class);
but it is not necessary in this case.
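Putting the fix together, a minimal driver sketch with the map-output types declared explicitly (class and argument names taken from the question; the old-style `new Job()` API follows the question's code, and the final `waitForCompletion` call is an assumed standard addition):

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

Job job = new Job();
job.setJarByClass(TopKRecords.class);
job.setJobName("TopKRecords");
job.setMapperClass(MapClass.class);
job.setReducerClass(Reduce.class);

// Without these two lines, the map output types default to the
// (Text, Text) final output types below, and the framework rejects
// the LongWritable values the mapper actually emits.
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(LongWritable.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);

FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
```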
Answer 1 (score: 0):
Isn't this obviously wrong?
context.write(new Text(year), new LongWritable(Long.parseLong(claims)));
while your mapper is declared as
Mapper<LongWritable, Text, Text, LongWritable>
You have swapped the key and value types here.