I am running this code through an Oozie workflow and it fails with a type mismatch error:
public static class mapClass extends Mapper<Object, Text, LongWritable, LongWritable> {
    public void map(Object key, Text value, Context context) {
        ..
        context.write(<LongWritable type>, <LongWritable type>);
    }
}

public static class reduceClass extends Reducer<LongWritable, LongWritable, LongWritable, LongWritable> {
    public void reduce(LongWritable key, Iterable<LongWritable> values, Context context) {
        ..
        context.write(<LongWritable type>, <LongWritable type>);
    }
}
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text

I am using the new API in my workflow. The same code works fine when run without Oozie.
Any help would be appreciated. Thanks.
----- Code sample -----
package org.apache.hadoop;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapperLong extends Mapper<LongWritable, Text, LongWritable, LongWritable> {

    public final static int COL_ZERO = 0;
    public final static int COL_ONE = 1;
    public final static int COL_TWO = 2;
    public final static int COL_THREE = 3;

    @Override
    public void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Split the space-delimited line and emit (column one, column two) as longs.
        String[] parts = line.toString().split(" ");
        LongWritable one = new LongWritable(Integer.parseInt(parts[COL_ONE]));
        LongWritable two = new LongWritable(Integer.parseInt(parts[COL_TWO]));
        context.write(one, two);
    }
}
package org.apache.hadoop;

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class ReducerLong extends Reducer<LongWritable, LongWritable, LongWritable, LongWritable> {

    @Override
    public void reduce(LongWritable colOneKey, Iterable<LongWritable> values,
            Context context) throws IOException, InterruptedException {
        // Count the distinct column-two values seen for this key.
        Set<Integer> colTwo = new HashSet<Integer>();
        for (LongWritable val : values) {
            colTwo.add(Integer.valueOf((int) val.get()));
        }
        context.write(colOneKey, new LongWritable(colTwo.size()));
    }
}
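For comparison, the standalone run that works uses a plain new-API driver along these lines. This is a minimal sketch; the class name DriverLong, the job name, and the argument handling are illustrative assumptions, not part of the original post:

package org.apache.hadoop;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver, not from the original post: shows how the same
// classes are registered when the job is submitted without Oozie.
public class DriverLong {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "distinct-col-two");
        job.setJarByClass(DriverLong.class);
        // New-API mapper/reducer classes are set directly on the Job.
        job.setMapperClass(MapperLong.class);
        job.setReducerClass(ReducerLong.class);
        // Both the map output and the final output are (LongWritable, LongWritable).
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}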
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:876)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Input:
34 342 1 1
45 23 0 1
..
..
Note: I changed the Object type to LongWritable; it made no difference. The exception above is thrown when the following properties are present in workflow.xml. Without these properties the job runs, but it just reproduces the input prefixed with the byte offset!
<property>
    <name>mapred.output.key.class</name>
    <value>org.apache.hadoop.io.LongWritable</value>
</property>
<property>
    <name>mapred.output.value.class</name>
    <value>org.apache.hadoop.io.LongWritable</value>
</property>
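Both symptoms are consistent with the stock identity mapper running instead of MapperLong: the stack trace shows the frame org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124), the base class, not MapperLong.map. The base Mapper's map() is an identity pass-through, roughly like this in the Hadoop source:

// Default map() in org.apache.hadoop.mapreduce.Mapper: passes the
// (LongWritable offset, Text line) input pair straight through.
protected void map(KEYIN key, VALUEIN value, Context context)
        throws IOException, InterruptedException {
    context.write((KEYOUT) key, (VALUEOUT) value);
}

An identity pass-through of (offset, line) would explain both the offset-prefixed output without the properties, and the Text-vs-LongWritable exception once mapred.output.value.class is declared as LongWritable.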
Answer 0 (score: 2)
OK, I figured it out. The problem was in the Oozie workflow I had defined.
I had
<name>mapreduce.mapper.class</name>
and
<name>mapreduce.reducer.class</name>
instead of
<name>mapreduce.map.class</name>
and
<name>mapreduce.reduce.class</name>
[Reference: https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases, MR API section.] For some reason it failed to catch my eye :-( because I was working from a modified version of a workflow that was not written for the new API! A corrected action configuration is sketched below.
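A minimal sketch of the relevant part of a new-API map-reduce action, assuming the standard Oozie conventions (the ${jobTracker}/${nameNode} placeholders are the usual Oozie EL variables; input/output directory properties and the rest of the action are omitted):

<map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <!-- Tell Hadoop/Oozie to use the new (mapreduce) API. -->
        <property>
            <name>mapred.mapper.new-api</name>
            <value>true</value>
        </property>
        <property>
            <name>mapred.reducer.new-api</name>
            <value>true</value>
        </property>
        <!-- New-API property names: map.class / reduce.class,
             not mapper.class / reducer.class. -->
        <property>
            <name>mapreduce.map.class</name>
            <value>org.apache.hadoop.MapperLong</value>
        </property>
        <property>
            <name>mapreduce.reduce.class</name>
            <value>org.apache.hadoop.ReducerLong</value>
        </property>
        <property>
            <name>mapred.output.key.class</name>
            <value>org.apache.hadoop.io.LongWritable</value>
        </property>
        <property>
            <name>mapred.output.value.class</name>
            <value>org.apache.hadoop.io.LongWritable</value>
        </property>
    </configuration>
</map-reduce>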
Thanks everyone for your time.
Answer 1 (score: 1)
You are most likely using the reducer as a combiner, which means it runs in the context of the map phase. See the similar question Wrong key class: Text is not IntWritable, and the illustration below.
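In driver terms the trap looks like this (a hypothetical line, not from the code above; SomeReducerEmittingText is a made-up class name):

// A reducer registered as a combiner runs inside the map task, so its
// output is checked against the *map* output classes. A combiner that
// emits Text values while the map output value class is LongWritable
// raises exactly this "Type mismatch in value from map" exception.
job.setCombinerClass(SomeReducerEmittingText.class);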