ArrayIndexOutOfBoundsException in the Reducer function of a MapReduce job

Date: 2018-10-11 07:08:42

Tags: hadoop mapreduce

I do not understand what this error is. When I remove the line

job.setSortComparatorClass(LongWritable.DecreasingComparator.class);

the job runs and I get output, but with that line in place I get the exception below.

I am trying to get the output from the reducer in decreasing order by value, which is why I used setSortComparatorClass. Please help.

package topten.mostviewed.movies;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MostViewdReducer extends Reducer<Text, IntWritable, Text, LongWritable>
{
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
    {
        // Count one per incoming record; the IntWritable values themselves are ignored
        int sum = 0;
        for (IntWritable value : values)
        {
            sum = sum + 1;
        }
        context.write(key, new LongWritable(sum));
    }
}
package topten.mostviewed.movies;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class MostViewdDriver
{
    // @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2)
        {
            System.err.println("Usage: movie <input> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "Movie ");
        job.setJarByClass(MostViewdDriver.class);
        job.setMapperClass(MostviewdMapper.class);
        job.setReducerClass(MostViewdReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        // Removing this line lets the job finish; with it in place the reduce tasks fail (stack trace below)
        job.setSortComparatorClass(LongWritable.DecreasingComparator.class);
        // job.setSortComparatorClass((Class<? extends RawComparator>) LongWritable.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The exception I get is as follows:

18/10/11 11:35:05 INFO mapreduce.Job: Task Id : attempt_1539236679371_0004_r_000000_2, Status : FAILED
Error: java.lang.ArrayIndexOutOfBoundsException: 7
        at org.apache.hadoop.io.WritableComparator.readInt(WritableComparator.java:212)
        at org.apache.hadoop.io.WritableComparator.readLong(WritableComparator.java:226)
        at org.apache.hadoop.io.LongWritable$Comparator.compare(LongWritable.java:91)
        at org.apache.hadoop.io.LongWritable$DecreasingComparator.compare(LongWritable.java:106)
        at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:158)
        at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
        at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:307)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

1 Answer:

Answer 0 (score: 1):

Your map output keys are integers, but you are trying to use a comparator meant for longs. Replace LongWritable.DecreasingComparator.class with IntWritable.DecreasingComparator.class.
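If your Hadoop version does not ship a decreasing comparator for IntWritable, a small custom comparator achieves the same effect. Keep in mind that the sort comparator orders the map output keys, so this only applies once the count you want to sort by is the key (for example, in a second job that swaps key and value). The following is a minimal sketch under that assumption; the class name DescendingIntWritableComparator is illustrative and not from the original post:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Sorts IntWritable keys from highest to lowest by negating their natural order.
public class DescendingIntWritableComparator extends WritableComparator
{
    protected DescendingIntWritableComparator()
    {
        // 'true' asks WritableComparator to create key instances so deserialized keys can be compared
        super(IntWritable.class, true);
    }

    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable a, WritableComparable b)
    {
        return -a.compareTo(b);
    }
}

It would be registered in the driver with job.setSortComparatorClass(DescendingIntWritableComparator.class); in place of the LongWritable comparator.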