Question

I am trying to write a MapReduce program that builds an inverted index.

My mapper code is:

public class InvertdIdxMapper extends Mapper<LongWritable, Text, Text, Text> {

public void map(LongWritable ikey, Text ivalue, Context context,Reporter reporter)
        throws IOException, InterruptedException {

    Text word=new Text();
     Text location=new Text();

    FileSplit filespilt=(FileSplit)reporter.getInputSplit();
    String fileName=filespilt.getPath().getName();
    location.set(fileName);

    String line=ivalue.toString();
    StringTokenizer itr=new StringTokenizer(line.toLowerCase());
    while (itr.hasMoreTokens()){
        word.set(itr.nextToken());
        //System.out.println("Key is "+ word + "value is "+location);
        context.write(word, location);
            }
    }
}

My reducer code is:

public class InvertedIdxReducer extends Reducer<Text, Text, Text, Text> {

public void reduce(Text _key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {


    boolean first=true;
    StringBuilder toReturn=new StringBuilder();
    // process valuess
    Iterator<Text> itr =values.iterator();
    while(itr.hasNext()){
        if(!first)
            toReturn.append(", ");
        first=false;
        toReturn.append(itr.next().toString());

    }
    context.write(_key,new Text(toReturn.toString()));
   }
}

And the driver code:

public class InvertedIdxDriver {

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "JobName");
    job.setJarByClass(InvertedIdxDriver.class);
    // TODO: specify a mapper
    job.setMapperClass(InvertdIdxMapper.class);
    // TODO: specify a reducer
    job.setReducerClass(InvertedIdxReducer.class);

    // TODO: specify output types
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    /////
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    // TODO: specify input and output DIRECTORIES (not files)
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    if (!job.waitForCompletion(true))
            return;
    }

}

When I run the code above, I get the following error:

 15/08/18 13:27:04 INFO mapreduce.Job: Task Id : attempt_1439870445298_0019_m_000000_2, Status : FAILED
Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

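The input to this program is a simple text file with a few lines. I followed this and this post, but my problem persists. Am I missing something important about MapReduce programming?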

Please advise.

Thanks

Answer 1

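I think you have not overridden the map method correctly, so the default map method is being called instead, and that is why you get the error. Check that the signature of your map method is correct. I believe it should look like this: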

protected void map(LongWritable iKey, Text iValue, Context context) throws IOException, InterruptedException

In addition, you need to replace this line:

FileSplit filespilt=(FileSplit)reporter.getInputSplit();

with:

FileSplit filespilt=(FileSplit)context.getInputSplit();
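
To illustrate why the extra Reporter parameter matters, here is a minimal plain-Java sketch with no Hadoop dependency. BaseMapper, BrokenMapper, FixedMapper and OverrideDemo are hypothetical names made up for this example; BaseMapper only imitates the behaviour visible in the stack trace, where the framework's own Mapper.map runs and passes the input key straight through. A method with a different parameter list is an overload, not an override, so the framework never calls it. Adding @Override to the intended method lets the compiler reject a signature that does not match.

```java
import java.util.ArrayList;
import java.util.List;

public class OverrideDemo {

    // Hypothetical stand-in for the framework's Mapper: the default map()
    // is an identity function that forwards the input key unchanged.
    static class BaseMapper {
        protected void map(Long key, String value, List<Object> out) {
            out.add(key);
        }

        // The "framework" only ever calls the three-argument map().
        public final List<Object> run(Long key, String value) {
            List<Object> out = new ArrayList<>();
            map(key, value, out);
            return out;
        }
    }

    // Extra parameter: this is an overload, so run() never reaches it.
    static class BrokenMapper extends BaseMapper {
        public void map(Long key, String value, List<Object> out, Object reporter) {
            out.add(value.toLowerCase());
        }
    }

    // Same signature plus @Override: the compiler verifies the override.
    static class FixedMapper extends BaseMapper {
        @Override
        protected void map(Long key, String value, List<Object> out) {
            out.add(value.toLowerCase());
        }
    }

    public static void main(String[] args) {
        // The broken mapper emits the untouched Long key (the default behaviour).
        System.out.println(new BrokenMapper().run(0L, "Hello").get(0).getClass().getSimpleName());
        // The fixed mapper emits the lower-cased String produced by its own map().
        System.out.println(new FixedMapper().run(0L, "Hello").get(0).getClass().getSimpleName());
    }
}
```

In the sketch, the broken variant emits a Long where a String was intended, which mirrors the "expected Text, received LongWritable" message in the question.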
