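I need to run a MapReduce job on an HBase table to sum Double values. I followed the HBase documentation.
The attributes were originally stored by converting strings to byte arrays (using HBase's Bytes.toBytes(value)), but now I need to treat them as Double in order to sum their values. For columns that contain only positive values the job gives the correct sum, but one of my columns also contains some negative values (it is called diferenca in the code below).
When I run the job it gives me a wrong answer for that column. It looks as if the reduce task were summing the modulus (absolute value) of the numbers, or something similar. Yet when I debug, the parsed Double object valor is correctly negative. I don't know what is causing this...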
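To make the two encodings in play concrete, here is a minimal standalone sketch that uses only the JDK, not HBase. It assumes the stored cells hold the decimal string as UTF-8 bytes, and that Bytes.toBytes(double) (used later in the reducer) writes the 8-byte big-endian IEEE 754 form, which is my understanding of its layout. The class name, method names and the sample value -12.5 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingSketch {

    // Text form: the decimal string as UTF-8 bytes (assumed layout of the stored cells).
    static byte[] encodeAsText(double v) {
        return Double.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    static double decodeFromText(byte[] b) {
        return Double.parseDouble(new String(b, StandardCharsets.UTF_8));
    }

    // Binary form: 8 bytes, big-endian IEEE 754 (assumed layout of the reducer's output cell).
    static byte[] encodeAsBinary(double v) {
        return ByteBuffer.allocate(Double.BYTES).putDouble(v).array();
    }

    static double decodeFromBinary(byte[] b) {
        return ByteBuffer.wrap(b).getDouble();
    }

    public static void main(String[] args) {
        double diferenca = -12.5; // hypothetical sample value

        byte[] text = encodeAsText(diferenca);
        byte[] binary = encodeAsBinary(diferenca);

        // The sign survives a round trip through either encoding,
        // as long as each is decoded with its matching decoder.
        System.out.println(decodeFromText(text));
        System.out.println(decodeFromBinary(binary));
        System.out.println(text.length + " text bytes, " + binary.length + " binary bytes");
    }
}
```

The point of the sketch is that the two forms are not interchangeable: bytes written in one form only make sense when read back with the matching decoder.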
Job setup:
job = new Job(hTable.getConfiguration(), "All");
job.setJarByClass(HBaseQuery.class);
scan.setCaching(500);
scan.setCacheBlocks(false);
TableMapReduceUtil.initTableMapperJob(
    tabela,                     // input table
    scan,
    CalculoTotaisMapper.class,  // mapper class
    Text.class,                 // mapper output key
    DoubleWritable.class,       // mapper output value
    job);
TableMapReduceUtil.initTableReducerJob(
    tabela,                     // output table
    CalculoTotaisReducer.class, // reducer class
    job);
job.setNumReduceTasks(1); // at least one, adjust as required
boolean b = job.waitForCompletion(true);
My mapper:
public class CalculoTotaisMapper extends TableMapper<Text, DoubleWritable> {
    public static final String[] ATTRS =
        new String[]{"valortotalprestador", "valortotalconvenio", "diferenca"};
    private Text text = new Text();
    private Logger logger = LoggerFactory.getLogger(CalculoTotaisMapper.class);

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        Double val = 0.0;
        for (String attr : ATTRS) {
            // Cells in family "hc" hold the number as a string, so parse it.
            byte[] valueBytes = value.getValue("hc".getBytes(), attr.getBytes());
            String valueString = Bytes.toString(valueBytes);
            val = Double.parseDouble(valueString);
            // Emit (attribute name, parsed value).
            text.set(attr);
            context.write(text, new DoubleWritable(val));
        }
    }
}
My reducer:
public class CalculoTotaisReducer
        extends TableReducer<Text, DoubleWritable, ImmutableBytesWritable> {
    public static final byte[] CF = "qu".getBytes();

    @Override
    public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum every value emitted for this attribute.
        double i = 0.0;
        for (DoubleWritable val : values) {
            double valor = val.get();
            i += valor;
        }
        // Write the total to row "all", using the attribute name as the qualifier.
        Put put = new Put(Bytes.toBytes("all"));
        put.add(CF, key.getBytes(), Bytes.toBytes(i));
        context.write(null, put);
    }
}
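For comparison, here is a rough local simulation of the same map, group and reduce logic in plain Java, with no Hadoop or HBase involved. It is only a sketch of what I expect the job to compute, not the job itself. The class name, the helper methods and the two sample rows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalSumSketch {
    static final String[] ATTRS =
        {"valortotalprestador", "valortotalconvenio", "diferenca"};

    // "Map" and "shuffle": parse each attribute's string value and group by attribute name.
    static Map<String, List<Double>> mapAndGroup(List<Map<String, String>> rows) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map<String, String> row : rows) {
            for (String attr : ATTRS) {
                double val = Double.parseDouble(row.get(attr));
                grouped.computeIfAbsent(attr, k -> new ArrayList<>()).add(val);
            }
        }
        return grouped;
    }

    // "Reduce": a plain signed sum, the same loop as in the reducer.
    static double reduce(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical rows; the second one has a negative diferenca.
        List<Map<String, String>> rows = List.of(
            Map.of("valortotalprestador", "100.0",
                   "valortotalconvenio", "90.0",
                   "diferenca", "10.0"),
            Map.of("valortotalprestador", "50.0",
                   "valortotalconvenio", "75.0",
                   "diferenca", "-25.0"));

        mapAndGroup(rows).forEach(
            (attr, values) -> System.out.println(attr + " = " + reduce(values)));
    }
}
```

With these sample rows the signed sum for diferenca is -15.0, whereas a sum of absolute values would be 35.0, which is the kind of discrepancy I am seeing in the real job.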