Java MapReduce - How to output the top 10 from the Reducer class

Asked: 2015-04-26 03:29:05

Tags: java hadoop mapreduce

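I am having trouble writing reducer code that outputs the top 10 (key, value) pairs.

My current output has the form ((year, market), total amount). What I want is the top 10 total amounts for each year. My current code outputs the total amount for every market in every year.

Any suggestions would be greatly appreciated!

Mapper: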

public class FundingMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private Text Year = new Text();
    private Text Market = new Text();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String line = value.toString();
        CSVReader reader = new CSVReader(new StringReader(line));

        String[] array = reader.readNext();
        reader.close();

        Year.set(array[14]);
        Market.set(array[3]);

        String amountString = array[15].replaceAll("[^0-9]", "");
        int amount = 0;

        try {
            amount = Integer.parseInt(amountString);
        } catch (NumberFormatException nfe) {
            return;
        }

        IntWritable intW = new IntWritable(amount);

        String S = new StringBuilder().append(Year + " ").append(Market + " ").toString();

        context.write(new Text(S), intW);
    }
}

Reducer:

public class FundingReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException,
            InterruptedException {

        int sum = 0;

        for (IntWritable value : values) {
            sum += value.get();
        }

        context.write(key, new IntWritable(sum));
    }
}

Sample data:

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/9a7cc724deba554585e2b79c14605866 post_ipo_equity     8/22/14 2014-08      2014-Q3    2014    4,742,648

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/04a7ec54417a0f9a6c99cf8db2eac819 venture A   10/15/14    2014-10  2014-Q4    2014    9,000,000    

/organization/contravir-pharmaceuticals ContraVir Pharmaceuticals   |Biotechnology| Biotechnology   USA NY  New York City   New York    /funding-round/328384053df3a992ca6d5da55ca0420e venture     2/14/14 2014-02  2014-Q1    2014    3,225,000    

/organization/contrib-com   contrib.com |Entrepreneur|Technology|Domains|Education|Social Media|    Social Media    USA FL  Palm Beaches    Delray Beach    /funding-round/fea112ed22657c1456820aa26af3ab17 seed        6/17/14 2014-06  2014-Q2    2014    300,000    

Sample output:

2014  Biotechnology  16967648
2014  Social Media  300000

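1 Answer:

Answer 0 (score: 0)

You need to use the year as the key in the map output. This ensures that the reducer receives all the values for a given year in a single call. After that, you can filter the output down to the 10 values you want. Take a look at the code below.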

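    // Mapper: declare the class as Mapper<LongWritable, Text, Text, Text>.
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String line = value.toString();
        CSVReader reader = new CSVReader(new StringReader(line));

        String[] array = reader.readNext();
        reader.close();

        Year.set(array[14]);
        Market.set(array[3]);

        String amountString = array[15].replaceAll("[^0-9]", "");
        int amount = 0;

        try {
            amount = Integer.parseInt(amountString);
        } catch (NumberFormatException nfe) {
            return; // skip records without a numeric amount
        }

        // Emit (year, "amount market") so that one reduce call sees a whole year.
        context.write(Year, new Text(amount + " " + Market));
    }

    // Reducer: declare the class as Reducer<Text, Text, Text, LongWritable>.
    // Needs java.util.ArrayList, HashMap, List and Map. The job's map output
    // value class and final output value class must be changed to match.
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException,
            InterruptedException {

        // Values arrive in no particular order, so total the amounts per market first.
        Map<String, Long> totals = new HashMap<>();
        for (Text value : values) {
            // Split only once: market names such as "Social Media" contain spaces.
            String[] parts = value.toString().split(" ", 2);
            totals.merge(parts[1], Long.parseLong(parts[0]), Long::sum);
        }

        // Sort the markets by total, largest first.
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(totals.entrySet());
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed());

        // Write at most 10 markets for this year.
        int count = 0;
        for (Map.Entry<String, Long> entry : sorted) {
            if (count++ == 10) {
                break;
            }
            context.write(new Text(key + " " + entry.getKey()), new LongWritable(entry.getValue()));
        }
    }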
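
The per-year selection step can be tried out without a cluster. Below is a minimal sketch in plain Java, assuming each value is an "amount market" string like the one the mapper above emits. The class name TopMarkets and the method topN are hypothetical helpers made up for illustration, not part of Hadoop. It sums the amounts per market and then uses a bounded min-heap instead of a full sort, which is one possible way to keep only the n largest totals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper (not a Hadoop class): selection logic for one year only.
final class TopMarkets {

    // Each value is "amount market", e.g. "300000 Social Media".
    // Returns up to n strings of the form "market total", largest total first.
    static List<String> topN(Iterable<String> values, int n) {
        // Step 1: total the amounts per market. long avoids int overflow on large sums.
        Map<String, Long> totals = new HashMap<>();
        for (String value : values) {
            String[] parts = value.split(" ", 2); // split once: market names may contain spaces
            totals.merge(parts[1], Long.parseLong(parts[0]), Long::sum);
        }

        // Step 2: min-heap holding at most n entries; the smallest total is evicted first.
        PriorityQueue<Map.Entry<String, Long>> heap =
                new PriorityQueue<>(Map.Entry.<String, Long>comparingByValue());
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            heap.add(entry);
            if (heap.size() > n) {
                heap.poll();
            }
        }

        // Step 3: drain the heap (ascending order) and reverse to get largest first.
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            Map.Entry<String, Long> entry = heap.poll();
            result.add(entry.getKey() + " " + entry.getValue());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        // Amounts taken from the 2014 rows in the question's sample data.
        List<String> values = new ArrayList<>();
        values.add("4742648 Biotechnology");
        values.add("9000000 Biotechnology");
        values.add("3225000 Biotechnology");
        values.add("300000 Social Media");
        // Prints [Biotechnology 16967648, Social Media 300000]
        System.out.println(topN(values, 10));
    }
}
```

Inside a real reducer the same three steps would run once per year key, with context.write replacing the result list. Markets with equal totals come out in no guaranteed order in this sketch.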