In my Hadoop program, I want to give the reducer output files custom names. Here is the code snippet:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Partitioner;

public class Partitionclass extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numreducetasks) {
        // TODO Auto-generated method stub
        Job job = null;
        Configuration conf = new Configuration();
        try {
            job = Job.getInstance(conf, "word count");
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        if (numreducetasks == 2) {
            String partkey = key.toString();
            int val = Integer.parseInt(partkey);
            if (val % 2 == 0) {
                //System.out.println("Even"+val);
                job.getConfiguration().set("mapreduce.output.basename", "Even");
                return 0;
            } else {
                job.getConfiguration().set("mapreduce.output.basename", "Odd");
                return 1;
            }
        } else if (numreducetasks == 1)
            return 0;
        else
            System.out.println("Please give reduce task at least one");
        return -1;
    }
}
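I first tried setting this in the driver class, but it did not work, so I created the Job inside the partitioner class instead, and that still does not work. I want output file names such as odd-r-00000 for odd keys and Even-r-00001 for even keys. Can anyone tell me how to do this?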
Answer 0 (score: 0)
This is covered in the Hadoop documentation, here: http://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html
Use it as follows.
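// Inside the reducer class; multipleOutputs is assumed to be a MultipleOutputs<Text, IntWritable> field.
reduceMethod() {
    // ... (rest of the reduce logic, elided in the original answer)
    // The third argument of write() is the base output path for this record.
    multipleOutputs.write(key, value, generateFileName(key, value));
}

// Builds the base output name from the record's key and value.
String generateFileName(Text key, IntWritable value) {
    return key.toString() + "_" + value.toString();
}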
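One likely reason the approach in the question has no effect is that the Configuration created inside getPartition is a fresh local object, so setting mapreduce.output.basename on it never reaches the running job. A possible alternative is to let the partitioner only choose the partition and to pass the base name to multipleOutputs.write in the reducer. The following is a minimal sketch of that naming logic in plain Java, with no Hadoop dependency so it can be run on its own. OutputNames, partitionFor, baseNameFor, and fileNameFor are hypothetical names introduced for illustration, not part of Hadoop, and the sketch assumes the keys are integer strings as in the question.

```java
import java.util.Locale;

// Hypothetical helper (not part of Hadoop): keeps the naming rules in one
// place so the partitioner and the reducer agree on them.
final class OutputNames {

    private OutputNames() {}

    // Partition index for a numeric key: even -> 0, odd -> 1 when there are
    // at least two reducers; everything goes to partition 0 otherwise.
    static int partitionFor(String key, int numReduceTasks) {
        if (numReduceTasks < 2) {
            return 0;
        }
        long value = Long.parseLong(key.trim());
        return (int) Math.floorMod(value, 2L);
    }

    // Base name intended as the third argument of MultipleOutputs.write(...).
    static String baseNameFor(String key) {
        return partitionFor(key, 2) == 0 ? "Even" : "Odd";
    }

    // File name expected for a reducer-side output:
    // <baseName>-r-<partition padded to five digits>.
    static String fileNameFor(String baseName, int partition) {
        return String.format(Locale.ROOT, "%s-r-%05d", baseName, partition);
    }

    public static void main(String[] args) {
        // Prints:
        // 4 -> Even-r-00000
        // 7 -> Odd-r-00001
        for (String key : new String[] {"4", "7"}) {
            int partition = partitionFor(key, 2);
            System.out.println(key + " -> " + fileNameFor(baseNameFor(key), partition));
        }
    }
}
```

In a real job, getPartition could return OutputNames.partitionFor(key.toString(), numreducetasks), and the reducer could call multipleOutputs.write(key, value, OutputNames.baseNameFor(key.toString())), with the MultipleOutputs instance created in setup() and closed in cleanup(). With two reducers this should yield Even-r-00000 and Odd-r-00001, since the numeric suffix follows the reducer's partition number.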