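I have a chain of MapReduce jobs, as described below.
Job1 (Map1 -> Reduce1) -> Job2 (Map2, Reduce2)
Job1.waitForCompletion(true)
I need a value in Map2 (say an int) that is produced by Reduce1.
How can I do this? Please share your ideas.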
Answer 0: (score: 1)
You can use ChainMapper and ChainReducer. Here is some sample code to help you.
Configuration conf = getConf();
JobConf job = new JobConf(conf);

// Old API (org.apache.hadoop.mapred.lib): mappers are added with addMapper;
// there is no ChainMapper.setMapper.
JobConf conf1 = new JobConf(false);
ChainMapper.addMapper(job, Map1.class,
        LongWritable.class, Text.class,   // input key/value types
        Text.class, Text.class,           // output key/value types
        true, conf1);

JobConf conf2 = new JobConf(false);
ChainReducer.setReducer(job, Reduce1.class,
        Text.class, Text.class,
        Text.class, Text.class,
        true, conf2);

// Mappers that run after the reducer are added through ChainReducer.
JobConf conf3 = new JobConf(false);
ChainReducer.addMapper(job, Map2.class,
        Text.class, Text.class,
        Text.class, Text.class,
        true, conf3);

// A chained job has the shape [MAP+ / REDUCE MAP*], so it can hold only one
// reducer. Reduce2 cannot be added here with a second setReducer call; it
// has to run in a separate job that reads this job's output.
Note:
The output key/value types of each stage determine what can run next, so they must match the input types of the following stage: the output types of Map1 must be the same as the input types of Reduce1, the output types of Reduce1 must be the same as the input types of Map2, and the output types of Map2 must be the same as the input types of Reduce2.
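The type-matching rule in the note can be checked before a job is submitted. The following is only a minimal sketch in plain Java, not part of Hadoop: the ChainTypeCheck class, its Stage helper, and firstMismatch are hypothetical names introduced for illustration, and each stage is described only by its key/value classes.

```java
import java.util.List;

final class ChainTypeCheck {
    /** One stage of the chain, described only by its key/value classes (hypothetical helper). */
    static final class Stage {
        final String name;
        final Class<?> inKey, inValue, outKey, outValue;

        Stage(String name, Class<?> inKey, Class<?> inValue,
              Class<?> outKey, Class<?> outValue) {
            this.name = name;
            this.inKey = inKey;
            this.inValue = inValue;
            this.outKey = outKey;
            this.outValue = outValue;
        }
    }

    /**
     * Returns "A -> B" for the first pair of adjacent stages whose types do
     * not line up, or null if every stage's output matches the next input.
     */
    static String firstMismatch(List<Stage> stages) {
        for (int i = 0; i + 1 < stages.size(); i++) {
            Stage from = stages.get(i);
            Stage to = stages.get(i + 1);
            boolean keyOk = to.inKey.isAssignableFrom(from.outKey);
            boolean valueOk = to.inValue.isAssignableFrom(from.outValue);
            if (!keyOk || !valueOk) {
                return from.name + " -> " + to.name;
            }
        }
        return null;
    }
}
```

In a real driver the classes passed in would be the Writable types (for example LongWritable.class and Text.class), listed in the order Map1, Reduce1, Map2, Reduce2.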
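Answer 1: (score: 0)
You can use a counter in Job1: Reduce1 sets the value, the driver reads it from Job1 after completion and then passes it to Job2. Here is sample code for the flow that needs to be implemented.
1. Sample code for setting the value with a counter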
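import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class Reduce1 extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {
    public static enum COUNTER {
        INTVALUE
    }

    @Override
    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // Old API (org.apache.hadoop.mapred)
        reporter.incrCounter(COUNTER.INTVALUE, 1);
        // New API (org.apache.hadoop.mapreduce), inside reduce(key, values, context):
        // context.getCounter(COUNTER.INTVALUE).increment(1);
    }
}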
2. Read the counter from Job1, then set it in Job2's JobConf so that the mapper can read the same value.
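public static void main(String[] args) throws Exception {
    // ... configure conf1 (a JobConf) for Job1 ...
    // runJob submits the job and blocks until it completes,
    // so no separate submit call is needed
    RunningJob job1 = JobClient.runJob(conf1);
    Counters c = job1.getCounters();
    // getCounter returns a long; the cast assumes the value fits in an int
    int value = (int) c.getCounter(Reduce1.COUNTER.INTVALUE);
    // Now set the value in Job2's configuration
    JobConf conf2 = new JobConf(conf1);
    conf2.setInt("name", value);
    // ... configure and run Job2 with conf2 ...
}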
3. Map2 reads the value: Job1 counter -> Job2's JobConf
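public class Map2 extends MapReduceBase
        implements Mapper<Text, Text, LongWritable, Text> {
    // Value handed over from Job1; named so it does not clash with map()'s "value" parameter
    private int passedValue;

    @Override
    public void configure(JobConf job) {
        passedValue = job.getInt("name", 0);
    }

    @Override
    public void map(Text key, Text value,
            OutputCollector<LongWritable, Text> output, Reporter reporter)
            throws IOException {
        // use passedValue here
    }
}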
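One detail in this flow is easy to miss: a counter total is a long, and it is the sum of the increments from all reduce tasks, while Map2 reads the setting as an int. Below is a minimal sketch of a narrowing step that fails loudly instead of silently wrapping around. The CounterValues class and its toInt method are hypothetical names, not part of Hadoop.

```java
final class CounterValues {
    /**
     * Converts a counter total (a long) to an int.
     * Math.toIntExact throws ArithmeticException if the value does not
     * fit, instead of wrapping around as a plain (int) cast would.
     */
    static int toInt(long counterValue) {
        return Math.toIntExact(counterValue);
    }
}
```

In the driver this would wrap the counter read before the setInt call. If the value may exceed the int range, another option is to store it with setLong and read it back with getLong on the configuration.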
Answer 2: (score: 0)
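Save the output of Reduce1 to a flat file on HDFS, and read that file when setting up the driver (job) for the second job. Then set the variable in the job configuration.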
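// Read the reducer output from the file and store it under the "name" key
Configuration conf = getConf();
JobConf job = new JobConf(conf);
// Set the value on the JobConf itself: it copies conf, so later changes to conf are not seen
job.setInt("name", 0); // replace 0 with the value read from the file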
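The "read reducer output from file" step is left as a comment in the driver. Here is a minimal sketch of the parsing part. It assumes that Reduce1 wrote its result with the default text output format, one key<TAB>value pair per line, and that the first non-empty line holds the value. ReducerOutputParser and readIntValue are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

final class ReducerOutputParser {
    /**
     * Returns the int value from the first non-empty line of the source.
     * A line is expected to look like key<TAB>value; if there is no tab,
     * the whole line is treated as the value.
     */
    static int readIntValue(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                int tab = line.indexOf('\t');
                String valuePart = (tab >= 0) ? line.substring(tab + 1) : line;
                return Integer.parseInt(valuePart.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        throw new IllegalStateException("no value found in reducer output");
    }
}
```

In the driver, the Reader would typically be an InputStreamReader wrapped around the stream returned by FileSystem.open for Job1's output file, and the result would be passed to job.setInt("name", ...).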
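In the mapper (Map2):
public class Map2 extends MapReduceBase
        implements Mapper<Text, Text, LongWritable, Text> {
    // Value set by the driver; named so it does not clash with map()'s "value" parameter
    private int passedValue;

    @Override
    public void configure(JobConf job) {
        passedValue = job.getInt("name", 0);
    }

    @Override
    public void map(Text key, Text value,
            OutputCollector<LongWritable, Text> output, Reporter reporter)
            throws IOException {
        // use passedValue here
    }
}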