Object serialization in MapReduce

Time: 2013-12-12 18:12:02

Tags: hadoop mapreduce

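Hello friends,

I am trying to serialize an object so that it can be passed from the mapper to the reducer as the output value. When I run the program, I get this exception: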

java.lang.RuntimeException: java.lang.NoSuchMethodException: com.test.objectpass.SerObj.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:62)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)

Caused by: java.lang.NoSuchMethodException: com.test.objectpass.SerObj.<init>()
at java.lang.Class.getConstructor0(Class.java:2715)
at java.lang.Class.getDeclaredConstructor(Class.java:1987)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:109)

Here is the code of the class whose objects need to be serialized:

package com.test.objectpass;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class SerObj implements WritableComparable<Object> {

    private String name = null;
    private String surname = null;
    private Long number = null;

    public SerObj(String name, String surname, Long number) {
        super();
        setName(name);
        setNumber(number);
        setSurname(surname);
    }
    @Override
    public final String toString() {
        final StringBuilder string = new StringBuilder();
        string.append("SerObj [name=").append(name).append(", surname=")
                .append(surname).append(", number=").append(number).append("]");
        return string.toString();
    }

    public final String getName() { return name; }
    public final void setName(String name) { this.name = name; }
    public final String getSurname() { return surname; }
    public final void setSurname(String surname) { this.surname = surname; }
    public final Long getNumber() { return number; }
    public final void setNumber(Long number) { this.number = number; }

    @Override
    public void readFields(DataInput in) throws IOException {
        name =in.readLine();
        surname = in.readLine();
        number = in.readLong();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeBytes(name);
        out.writeBytes(surname);
        out.writeLong(number);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SerObj))
            return false;
        SerObj other = (SerObj) o;
        return this.number == other.number;
    }

    @Override
    public int compareTo(Object o) {
        long thisValue = this.number;
        long thatValue = ((SerObj)o).number;
        return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
    }
}

The code below is the MapReduce driver I use to submit the job:

package com.test.objectpass;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ObjectSerialization {
    public static class MyMapper extends Mapper<LongWritable, Text, Text, SerObj> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String[] values = value.toString().split(" ");
            SerObj obj = new SerObj(values[0], values[1], Long.parseLong(values[2]));
            context.write(new Text(values[0]), obj);
        }
    }

    public static class MyReducer extends Reducer<Text, SerObj, NullWritable, NullWritable> {
        @Override
        public void reduce(Text key, Iterable<SerObj> values, Context context) throws IOException, InterruptedException {
            for (SerObj valueObj : values) {
                System.out.println(valueObj);
            }
        }
    }

    public static void main(String[] args) throws Exception {

        final Configuration conf = new Configuration();
        Job job = new Job(conf, "TEST");
        job.setJarByClass(ObjectSerialization.class);
        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(SerObj.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputFormatClass(NullOutputFormat.class);

        TextInputFormat.addInputPath(job, new Path("/home/pankaj/test"));

        job.waitForCompletion(true);
        System.out.println("Done.");
    }
}

Please help me solve this problem.

Thanks in advance.

1 answer:

Answer 0 (score: 2)

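The error message should be clear enough: your SerObj class has no zero-argument constructor. Nearly every serialization framework these days requires your beans to have a no-arg constructor available, so that the framework can instantiate the object via reflection before reading its data off the wire, and Writable serialization is no different. The stack trace shows exactly this: ReflectionUtils.newInstance calls Class.getDeclaredConstructor with no parameters, and SerObj only declares the three-argument constructor.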
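
As a rough illustration of that point, here is a minimal sketch of the class with a no-arg constructor added. It is not the asker's exact code: to keep it runnable without Hadoop on the classpath it uses only java.io and leaves out `implements WritableComparable`, and `SerObjDemo` and `roundTrip` are hypothetical names made up for this example. The sketch also assumes one change the answer does not mention: it uses writeUTF/readUTF in place of writeBytes/readLine, because writeBytes emits no line terminator for readLine to stop at.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class SerObjDemo {

    // Stand-in for the question's SerObj. In a real job it would also
    // implement org.apache.hadoop.io.Writable (omitted here on purpose).
    static class SerObj {
        private String name;
        private String surname;
        private long number;

        // The no-arg constructor that reflective instantiation needs.
        public SerObj() { }

        public SerObj(String name, String surname, long number) {
            this.name = name;
            this.surname = surname;
            this.number = number;
        }

        public String getName() { return name; }
        public String getSurname() { return surname; }
        public long getNumber() { return number; }

        // writeUTF length-prefixes each string, so readUTF knows where it ends.
        public void write(DataOutput out) throws IOException {
            out.writeUTF(name);
            out.writeUTF(surname);
            out.writeLong(number);
        }

        // Fields are read back in the same order they were written.
        public void readFields(DataInput in) throws IOException {
            name = in.readUTF();
            surname = in.readUTF();
            number = in.readLong();
        }
    }

    // Hypothetical helper: serialize, create an empty instance through
    // getDeclaredConstructor() (the call seen in the stack trace), then
    // populate that instance from the serialized bytes.
    static SerObj roundTrip(SerObj original) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        SerObj copy = SerObj.class.getDeclaredConstructor().newInstance();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws Exception {
        SerObj copy = roundTrip(new SerObj("John", "Smith", 12345L));
        System.out.println(copy.getName() + " " + copy.getSurname()
                + " " + copy.getNumber());
    }
}
```

If the `public SerObj() { }` line is deleted, the `getDeclaredConstructor()` call in `roundTrip` throws the same NoSuchMethodException reported in the question.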