Storing the Iterator<Pojo> objects of a Reducer method in an array

Time: 2014-12-24 08:42:01

Tags: arrays hadoop mapreduce iterator

I want to store the Iterator values that are passed as an argument to the reduce method, i.e.

public void reduce(IntWritable key, Iterator<Pojo> values,
        OutputCollector<IntWritable, SubArrayWritable> output, Reporter reporter)

in an object of a custom ArrayWritable class, i.e. a PojoArrayWritable object. I have created the PojoArrayWritable class, and its code is:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;

public class SubArrayWritable extends ArrayWritable
{
public SubArrayWritable() {
    super(Sub.class);
}
public SubArrayWritable(Sub[] values) {
    super(Sub.class, values);
}
}

Sub.class has the following code:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;

public class Sub implements Writable {
private IntWritable id = new IntWritable();
private LongWritable pno = new LongWritable();
private DoubleWritable sal = new DoubleWritable();

public IntWritable getid() {
    return id;
}

public void setId(IntWritable id) {
    this.id = id;
}

public LongWritable getPno() {
    return pno;
}

public void setPno(LongWritable pno) {
    this.pno = pno;
}

public DoubleWritable getSal() {
    return sal;
}

public void setSal(DoubleWritable sal) {
    this.sal = sal;
}

@Override
public void readFields(DataInput in) throws IOException {
    id.readFields(in);
    pno.readFields(in);
    sal.readFields(in);

}

@Override
public void write(DataOutput out) throws IOException {
    id.write(out);
    pno.write(out);
    sal.write(out);
}


}

In the Reducer class:

public void reduce(IntWritable key, Iterator<Pojo> values,
        OutputCollector<IntWritable, SubArrayWritable> output, Reporter reporter)

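I want to store the Iterator's value objects in an array, i.e. in a SubArrayWritable object. How can I do this? I don't know the size/length of the Iterator's values, so what length should the SubArrayWritable object be created with?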

So basically, I have Iterator<Pojo> values and I need to convert it to an array of Pojo.

1 Answer:

Answer 0 (score: 0)

I found this solution:

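    // Requires java.util.ArrayList and java.util.List.
    List<Pojo> list = new ArrayList<Pojo>();

    // Loop with while, not if: an if would add only the first value.
    while (values.hasNext()) {
        // Note: Hadoop typically reuses the same value instance between
        // next() calls, so a copy may be needed here instead of the reference.
        Pojo pojo = values.next();
        list.add(pojo);
    }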

Now I can use this list of Pojo objects, which contains the values from the Iterator.
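Since the question asks for an array rather than a list, here is a minimal, Hadoop-free sketch of the full conversion. It assumes a hypothetical stand-in `Pojo` class with plain fields and a copy constructor (neither is in the original post), and the `IteratorToArray` class and `toArray` helper are illustrative names only. The idea is to collect into a growable list first, so the array length never has to be known in advance, and to store copies so the elements do not alias an instance the framework may reuse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IteratorToArray {

    /** Hypothetical stand-in for the question's Pojo/Sub value class. */
    static final class Pojo {
        int id;
        long pno;
        double sal;

        Pojo(int id, long pno, double sal) {
            this.id = id;
            this.pno = pno;
            this.sal = sal;
        }

        /** Copy constructor (an assumption), so stored elements do not alias a reused instance. */
        Pojo(Pojo other) {
            this(other.id, other.pno, other.sal);
        }
    }

    /** Drains the iterator into a growable list, then sizes the array from the list. */
    static Pojo[] toArray(Iterator<Pojo> values) {
        List<Pojo> list = new ArrayList<>();
        while (values.hasNext()) {
            list.add(new Pojo(values.next())); // store a copy, not the reference
        }
        return list.toArray(new Pojo[0]);
    }

    public static void main(String[] args) {
        Iterator<Pojo> values = Arrays.asList(
                new Pojo(1, 100L, 10.5),
                new Pojo(2, 200L, 20.5),
                new Pojo(3, 300L, 30.5)).iterator();

        Pojo[] array = toArray(values);
        System.out.println(array.length); // number of values the iterator yielded
    }
}
```

In the actual reducer, the resulting array could presumably be wrapped with the question's `SubArrayWritable(Sub[] values)` constructor and passed to `output.collect(key, ...)`, provided the element type matches what that constructor expects.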