Java Hadoop wrong value class: class RatioCount$WritableArray is not class org.apache.hadoop.io.DoubleWritable

Asked: 2016-10-30 10:17:58

Tags: java hadoop mapreduce

I am trying to learn Hadoop. I have a text file in which every line holds one traffic flow record, with its fields separated by commas. I want my map function to output, as the key, a string I build to identify a flow, such as "123.124.32.6 14.23.64.21 80 tcp", and as the value a double (a single number). I want my reduce function to output the same string as the key and, as the value, collect all the values from identical keys into an array. So I want something like "123.124.32.6 14.23.64.21 80 tcp": [0.3 -0.1 1 -1 0.5] as my final output. When I run it, I get an error:


Error: java.io.IOException: wrong value class: class RatioCount$WritableArray is not class org.apache.hadoop.io.DoubleWritable

Can you point out my mistake and how to fix it?

Here is my code:

import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RatioCount {


public static class WritableArray extends ArrayWritable {

    public WritableArray(Class<? extends Writable> valueClass, Writable[] values) {
        super(valueClass, values);
    }
    public WritableArray(Class<? extends Writable> valueClass) {
        super(valueClass);
    }

    @Override
    public DoubleWritable[] get() {
        return (DoubleWritable[]) super.get();
    }

    @Override
    public void write(DataOutput arg0) throws IOException {
        System.out.println("write method called");
        super.write(arg0);
    }
    @Override
    public String toString() {
        return Arrays.toString(get());
    }

}



public static void main(String[] args) throws Exception {

    Configuration conf = new Configuration();

    Job job = Job.getInstance(conf, "ratio count");

    job.setJarByClass(RatioCount.class);
    job.setMapperClass(MyMapper.class);
    job.setCombinerClass(MyReducer.class);
    job.setReducerClass(MyReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(DoubleWritable.class);
    job.setOutputValueClass(WritableArray.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}



public static class MyReducer
        extends Reducer<Text, DoubleWritable, Text, WritableArray> {

    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        ArrayList<DoubleWritable> list = new ArrayList<DoubleWritable>();

        for(DoubleWritable value :values){
            list.add(value);
        }
        context.write(key, new WritableArray(DoubleWritable.class, list.toArray(new DoubleWritable[list.size()])));
    }


}




public static class MyMapper extends Mapper<Object, Text, Text, DoubleWritable> {

    private final Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        if (value.toString().contains("StartTime")) {
            return;
        }
        String[] tokens = value.toString().split(",");
        StringBuilder sb = new StringBuilder();
        sb.append(tokens[2]);
        sb.append(tokens[3]);
        sb.append(tokens[6]);
        sb.append(tokens[7]);
        System.out.println(sb.toString());
        word.set(sb.toString());
        double sappbytes = Double.parseDouble(tokens[13]);
        double totbytes = Double.parseDouble(tokens[14]);
        double dappbytes = totbytes - sappbytes;

        DoubleWritable ratio = new DoubleWritable((sappbytes - dappbytes) / totbytes);
        context.write(word, ratio);
    }
    }
}

1 Answer:

Answer 0 (score: 2)

Your problem is this line:

job.setCombinerClass(MyReducer.class);

A combiner must receive and emit the same key and value types. In your case you have:

Reducer<Text, DoubleWritable, Text, WritableArray> emits WritableArray values, but when it runs as a combiner, the reduce phase that follows still expects DoubleWritable.

You should remove the combiner, or rewrite it (as a class separate from your reducer) so that it receives Text, DoubleWritable and emits those same types.
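If you do want to keep a combiner, a minimal sketch of such a separate class could look like the following (the name MyCombiner is mine, not from the post). Because your final output is an array per key, the combiner cannot do any real pre-aggregation here; it can only pass each value through unchanged, which is why simply deleting the setCombinerClass line is the cleaner fix:

public static class MyCombiner
        extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {

    @Override
    public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        // Identity pass-through: same input and output types, no aggregation,
        // so the downstream reducer still sees Text/DoubleWritable pairs.
        for (DoubleWritable value : values) {
            context.write(key, value);
        }
    }
}

In the driver you would then register it with job.setCombinerClass(MyCombiner.class); instead of MyReducer.class, though since it performs no aggregation it only adds overhead.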