Custom object as the output value of a Mapper

Time: 2015-08-24 09:10:00

Tags: hadoop

My object is structured as follows:

class ObjExample {
     String s;
     Object[] objArray; // element in this array can be primitive type or array of primitive type.
}

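I know that to use it as a mapper or reducer output type, we have to implement WritableComparable for it.

But I'm really confused about how to write readFields(), write(), and compareTo() for a class like this.

1 Answer:

Answer 0 (score: 1)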

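You can wrap the field s in a Text and objArray in an ArrayWritable. Each element of objArray is then an array of primitives (itself an ArrayWritable); a single primitive is stored as a one-element array. Here is a possible implementation: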

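import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Declared as a static nested class of your job class; the imports above go at the top of that file.
public static final class ObjExample implements WritableComparable<ObjExample> {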
    public final Text s = new Text(); // wrapped String
    public final ArrayOfArrays objArray = new ArrayOfArrays();

    @Override
    public int compareTo(ObjExample o) {
        // your logic here, example:
        return s.compareTo(o.s);
    }

    @Override
    public void write(DataOutput dataOutput) throws IOException {
        s.write(dataOutput);
        objArray.write(dataOutput);
    }

    @Override
    public void readFields(DataInput dataInput) throws IOException {
        s.readFields(dataInput);
        objArray.readFields(dataInput);
    }

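    // set size of the objArray; fill every slot with setElement() before write(),
    // because ArrayWritable.write() calls write() on each element and fails on null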
    public void setSize(int n) {
        objArray.set(new IntArray[n]);
    }

    // set i-th element of the objArray to an array of elements
    public void setElement(int i, IntWritable... elements) {
        IntArray subArr = new IntArray();
        subArr.set(elements);
        objArray.get()[i] = subArr;
    }
}
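The compareTo above orders by s only, which is fine as a placeholder. If the object is ever used as a key, compareTo decides both sort order and which records are grouped together, so you may want the array contents as a tie-breaker (when the object is only a value, compareTo is not consulted). Below is a minimal sketch of one such ordering in plain Java. The class name ObjKeyOrdering and the use of int[][] in place of the ArrayWritable wrappers are assumptions made for illustration; they are not part of Hadoop or of the class above.

```java
// Hypothetical helper: orders (s, rows) pairs by s first, then by rows.
final class ObjKeyOrdering {

    // Lexicographic order on int arrays; a proper prefix sorts first.
    static int compareRow(int[] a, int[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Compare the strings first; fall back to the rows only on a tie.
    static int compare(String s1, int[][] rows1, String s2, int[][] rows2) {
        int c = s1.compareTo(s2);
        if (c != 0) {
            return c;
        }
        int n = Math.min(rows1.length, rows2.length);
        for (int i = 0; i < n; i++) {
            c = compareRow(rows1[i], rows2[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(rows1.length, rows2.length);
    }
}
```

Inside ObjExample.compareTo you would unpack the IntWritable values into the same comparison instead of using int[][] directly. Whatever ordering you choose, keep it consistent with equals() and hashCode() if you override them.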

You need two more classes to make it work:

// array of primitives
public static final class IntArray extends ArrayWritable {
    public IntArray() {
        // you can specify any other primitive wrapper (DoubleWritable, Text, ...)
        super(IntWritable.class);
    }
}

// array of arrays
public static final class ArrayOfArrays extends ArrayWritable {
    public ArrayOfArrays() {
        super(IntArray.class);
    }
}
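If it helps to see what these wrappers do for you, here is a rough sketch of the same pattern written by hand against java.io only, with no Hadoop on the classpath. ArrayWritable writes the element count and then each element in turn, and readFields() reads them back in the same order; the sketch follows that idea. PlainObjExample, rows, and roundTrip are hypothetical names for this illustration, and the byte layout is not identical to Hadoop's (Text encodes strings differently from writeUTF).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for ObjExample that does not depend on Hadoop.
final class PlainObjExample {
    String s = "";
    int[][] rows = new int[0][];

    // Write the fields in a fixed order, prefixing each array with its length.
    void write(DataOutput out) throws IOException {
        out.writeUTF(s);
        out.writeInt(rows.length);
        for (int[] row : rows) {
            out.writeInt(row.length);
            for (int v : row) {
                out.writeInt(v);
            }
        }
    }

    // Read the fields back in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        s = in.readUTF();
        rows = new int[in.readInt()][];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = new int[in.readInt()];
            for (int j = 0; j < rows[i].length; j++) {
                rows[i][j] = in.readInt();
            }
        }
    }

    // Serialize to a byte array and deserialize into a fresh instance.
    static PlainObjExample roundTrip(PlainObjExample src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bytes));
            PlainObjExample copy = new PlainObjExample();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point to carry over is the symmetry: whatever write() emits, readFields() must consume in the same order and with the same types. A round trip through a byte array like the one in roundTrip is a cheap way to check that before running a job.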

Example of constructing the object:

ObjExample o = new ObjExample();
o.s.set("hello");
o.setSize(2);
o.setElement(0, new IntWritable(0)); // single primitive
o.setElement(1, new IntWritable(1), new IntWritable(2)); // array of primitives