readFields with complex types in Hadoop

Time: 2013-06-03 16:03:25

Tags: hadoop mapreduce

I have this class:

public class Stripe implements WritableComparable<Stripe>{
    private List<Term> occorrenze = new ArrayList<Term>();

    public Stripe(){}

    @Override
    public void readFields(DataInput in) throws IOException {

    }
}


public class Term implements WritableComparable<Term> {

    // Initialized up front so readFields() never dereferences null.
    private Text key = new Text();
    private IntWritable frequency = new IntWritable();

    @Override
    public void readFields(DataInput in) throws IOException {
        this.key.readFields(in);
        this.frequency.readFields(in);
    }
}

A Stripe is a list of Term objects (Text and IntWritable pairs). How should I implement the readFields method so that it reads the complex type Stripe from a DataInput?

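2 answers:

Answer 0: (score: 1)

To serialize a list, you need to write out the length of the list, followed by the elements themselves. A simple readFields / write method pair for Stripe could be: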

@Override
public void readFields(DataInput in) throws IOException {
    occorrenze.clear();
    int cnt = in.readInt();
    for (int x = 0; x < cnt; x++) {
        Term term = new Term();
        term.readFields(in);
        occorrenze.add(term);
    }
}

@Override
public void write(DataOutput out) throws IOException {
    out.writeInt(occorrenze.size());
    for (Term term : occorrenze) {
        term.write(out);
    }
}
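
One way to check that such a write / readFields pair is symmetric is to round-trip an object through an in-memory byte buffer. The sketch below uses only java.io so it runs without Hadoop on the classpath; PlainTerm and PlainStripe are hypothetical stand-ins (a String and an int instead of Text and IntWritable), not the classes from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Term: String/int instead of Text/IntWritable.
class PlainTerm {
    String key = "";
    int frequency;

    void write(DataOutput out) throws IOException {
        out.writeUTF(key);
        out.writeInt(frequency);
    }

    void readFields(DataInput in) throws IOException {
        key = in.readUTF();
        frequency = in.readInt();
    }
}

// Hypothetical stand-in for Stripe with the same length-prefixed layout.
class PlainStripe {
    final List<PlainTerm> occorrenze = new ArrayList<PlainTerm>();

    void add(String key, int frequency) {
        PlainTerm term = new PlainTerm();
        term.key = key;
        term.frequency = frequency;
        occorrenze.add(term);
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(occorrenze.size());   // element count first
        for (PlainTerm term : occorrenze) {
            term.write(out);               // then each element in order
        }
    }

    void readFields(DataInput in) throws IOException {
        occorrenze.clear();                // the instance may be reused
        int cnt = in.readInt();
        for (int x = 0; x < cnt; x++) {
            PlainTerm term = new PlainTerm();
            term.readFields(in);
            occorrenze.add(term);
        }
    }

    // Serializes to a byte array and reads the bytes back into a new instance.
    static PlainStripe roundTrip(PlainStripe original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        PlainStripe copy = new PlainStripe();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        PlainStripe stripe = new PlainStripe();
        stripe.add("foo", 3);
        stripe.add("bar", 7);
        for (PlainTerm term : roundTrip(stripe).occorrenze) {
            System.out.println(term.key + " " + term.frequency);
        }
    }
}
```

The same check should carry over to the real Stripe and Term, since Hadoop's Writable methods take the same DataInput and DataOutput interfaces.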

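You could be more efficient by using a VInt rather than an int for the length, and by using a pool of Term objects that can be reused, which saves object creation and garbage collection in the readFields method.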
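
Both suggestions can be sketched roughly as follows, assuming the Hadoop client libraries are on the classpath. WritableUtils.writeVInt and WritableUtils.readVInt are existing Hadoop helpers; PooledStripe, PooledTerm, and nextTerm are hypothetical names introduced here for illustration, and the pooling scheme is only one possible way to reuse objects.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableUtils;

// Hypothetical variant of Stripe: VInt length prefix plus reusable terms.
class PooledStripe implements Writable {

    // Minimal stand-in for the question's Term class.
    static class PooledTerm implements Writable {
        final Text key = new Text();
        final IntWritable frequency = new IntWritable();

        @Override
        public void write(DataOutput out) throws IOException {
            key.write(out);
            frequency.write(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            key.readFields(in);
            frequency.readFields(in);
        }
    }

    // Every term ever allocated; only the first `size` entries are live.
    private final List<PooledTerm> pool = new ArrayList<PooledTerm>();
    private int size = 0;

    // Returns the next pooled term, allocating only when the pool runs out.
    private PooledTerm nextTerm() {
        if (size == pool.size()) {
            pool.add(new PooledTerm());
        }
        return pool.get(size++);
    }

    public void add(String key, int frequency) {
        PooledTerm term = nextTerm();
        term.key.set(key);
        term.frequency.set(frequency);
    }

    public int size() {
        return size;
    }

    public PooledTerm get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return pool.get(index);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        WritableUtils.writeVInt(out, size);   // variable-length count
        for (int i = 0; i < size; i++) {
            pool.get(i).write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        int cnt = WritableUtils.readVInt(in);
        size = 0;                             // logical clear, objects are kept
        for (int i = 0; i < cnt; i++) {
            nextTerm().readFields(in);        // overwrite a pooled term in place
        }
    }
}
```

Because the terms are overwritten on the next readFields call, a caller that needs to keep a term across calls would have to copy it first.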

Answer 1: (score: 0)

You could use ArrayWritable, which is a list of Writables that are all of the same type.
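
A rough sketch of that approach, assuming the Hadoop client libraries are on the classpath. ArrayWritable is usually subclassed with a no-argument constructor that fixes the element class, so the framework can instantiate it by reflection; TermArrayWritable and its nested SimpleTerm are hypothetical names used here for illustration.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

// Hypothetical ArrayWritable subclass that holds term objects.
class TermArrayWritable extends ArrayWritable {

    // Minimal stand-in for the question's Term class.
    static class SimpleTerm implements Writable {
        final Text key = new Text();
        final IntWritable frequency = new IntWritable();

        // ArrayWritable creates elements reflectively via this constructor.
        public SimpleTerm() {
        }

        public SimpleTerm(String key, int frequency) {
            this.key.set(key);
            this.frequency.set(frequency);
        }

        @Override
        public void write(DataOutput out) throws IOException {
            key.write(out);
            frequency.write(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            key.readFields(in);
            frequency.readFields(in);
        }
    }

    // No-argument constructor that pins the element type.
    public TermArrayWritable() {
        super(SimpleTerm.class);
    }

    public TermArrayWritable(SimpleTerm[] terms) {
        super(SimpleTerm.class, terms);
    }
}
```

ArrayWritable already writes the element count followed by each element, so no custom readFields or write is needed. As far as I know it implements Writable but not WritableComparable, so this fits a Stripe used as a value; a Stripe used as a key would still need its own compareTo.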