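Avoid appending part-r-00***** to the end of MapReduce job output files

Time: 2015-12-11 09:22:48

Tags: hadoop mapreduce multipleoutputs

I am running MR code that uses the Multioutputformat class. part-**** gets appended to the end of each output file name. How can I avoid this?

public class MR_reducer extends Reducer {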

}

1 answer:

Answer 0: (score: 0)

This code snippet works for me. It differs only slightly from yours:

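// Imports belong at the top of the driver file; the class below is nested inside the driver class.
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public static class Reduce extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> multipleOutputs;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Wrap the task context so records can be routed to named files.
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // The key is passed as the base output path, so each key gets its own file.
        for (Text value : values) {
            multipleOutputs.write(NullWritable.get(), value, key.toString());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Close the underlying record writers so buffered output is flushed.
        multipleOutputs.close();
    }
}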
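
One caveat, as far as I understand the naming: when a base output path is passed to MultipleOutputs.write, the file is still created as <base>-r-<task number> (for example apple-r-00000), so the "part" prefix goes away but the task suffix remains. If the suffix has to disappear completely, one option is to rename the files after the job finishes. The sketch below only computes the target name; OutputNames and stripTaskSuffix are hypothetical names, not part of Hadoop, and the rename itself (on HDFS or elsewhere) is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: computes the name a post-job
// rename step could give to a file written by a map or reduce task.
public class OutputNames {

    // Matches "<base>-r-00000" or "<base>-m-00000", optionally followed by
    // a single extension such as ".gz". Assumes at least five digits.
    private static final Pattern TASK_SUFFIX =
            Pattern.compile("^(.+)-[mr]-\\d{5,}(\\.[^.]+)?$");

    public static String stripTaskSuffix(String fileName) {
        Matcher m = TASK_SUFFIX.matcher(fileName);
        if (!m.matches()) {
            return fileName; // not a task output file; leave the name alone
        }
        String extension = m.group(2) == null ? "" : m.group(2);
        return m.group(1) + extension;
    }

    public static void main(String[] args) {
        System.out.println(stripTaskSuffix("apple-r-00000"));
        System.out.println(stripTaskSuffix("apple-r-00003.gz"));
        System.out.println(stripTaskSuffix("_SUCCESS"));
    }
}
```

Renaming this way is only safe when each base name is written by a single task, which holds when the base name is the reducer key as in the snippet above; otherwise two files could map to the same target name.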