Finding the avg / min / max of a dataset using MapReduce

Time: 2016-04-12 20:02:02

Tags: mapreduce

I am trying to write a MapReduce program as a practice exercise. My dataset looks something like this:

Salaries of people in each country/city/state:

place      year  salary($)
america    2014  60,000
france     2010  40,000
india      2012  20,000
australia  2001  50,000
america    2014  65,000

I want the output to look something like this:

place    year  avg     min     max
america  2014  62,500  60,000  65,000
france   2010  40,000  40,000  40,000
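As a quick check of the arithmetic for one group: the two america 2014 rows in the sample are 60,000 and 65,000, so the average is (60,000 + 65,000) / 2 = 62,500, the minimum is 60,000 and the maximum is 65,000. The sketch below reproduces that in plain Java with no Hadoop dependency. The class and method names (`SalaryStatsCheck`, `parseSalary`, `summarize`) are made up for illustration, and it assumes salaries arrive as strings with comma thousands separators, as in the sample.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical helper class, not part of the original question.
class SalaryStatsCheck {

    // Parses a salary such as "60,000" into 60000 by dropping the commas.
    static long parseSalary(String raw) {
        return Long.parseLong(raw.replace(",", ""));
    }

    // Folds all salaries of one (place, year) group into count/sum/min/max.
    static LongSummaryStatistics summarize(List<String> salaries) {
        LongSummaryStatistics stats = new LongSummaryStatistics();
        for (String salary : salaries) {
            stats.accept(parseSalary(salary));
        }
        return stats;
    }

    public static void main(String[] args) {
        // The two "america 2014" rows from the sample dataset.
        LongSummaryStatistics s = summarize(Arrays.asList("60,000", "65,000"));
        System.out.printf("america 2014 avg=%.0f min=%d max=%d%n",
                s.getAverage(), s.getMin(), s.getMax());
    }
}
```

Each group's result depends only on that group's rows, which is why keying the data on place and year is enough for a single MapReduce pass.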

Please guide me on how to write the MapReduce program, or point me to any example program that already handles this case. Thanks in advance :)

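I tried the mapper part:

    import java.io.IOException;
    import java.util.Scanner;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public static class Map extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // This will work even if we receive more than one line.
            Scanner scanner = new Scanner(value.toString());
            while (scanner.hasNextLine()) {
                String[] tokens = scanner.nextLine().trim().split("\\s+");
                // Skip blank lines and the "place year salary($)" header row.
                if (tokens.length < 3 || tokens[0].equals("place")) {
                    continue;
                }
                String country = tokens[0];
                String year = tokens[1];
                // Drop the thousands separator: "60,000" -> "60000".
                String amount = tokens[2].replace(",", "");

                // Key on (place, year) so that all salaries of one group
                // reach the same reduce call.
                context.write(new Text(country + "\t" + year), new Text(amount));
            }
        }
    }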
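The reduce side is the part still missing. Here is a minimal sketch of it in plain Java, so it runs without a cluster. It assumes the mapper emits one pair per input row, with the key `place<TAB>year` and the value the salary as digits only (no commas). The names `ReduceSketch`, `shuffle`, `reduce` and `run` are hypothetical. In a real job the body of `reduce` would go inside the `reduce(Text key, Iterable<Text> values, Context context)` method of a class extending `org.apache.hadoop.mapreduce.Reducer`, and Hadoop itself would do the grouping that `shuffle` imitates here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the shuffle and reduce phases; not Hadoop code.
class ReduceSketch {

    // What a reducer would do for one key: a single pass over the values,
    // tracking sum, count, min and max. Returns "avg<TAB>min<TAB>max".
    // The average uses integer division, so any fractional part is truncated.
    static String reduce(Iterable<String> amounts) {
        long sum = 0;
        long count = 0;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (String amount : amounts) {
            long v = Long.parseLong(amount);
            sum += v;
            count++;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (count == 0) {
            throw new IllegalArgumentException("a key must have at least one value");
        }
        return (sum / count) + "\t" + min + "\t" + max;
    }

    // Imitates the shuffle: groups mapper output pairs {key, value} by key,
    // with keys in sorted order.
    static Map<String, List<String>> shuffle(List<String[]> pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] kv : pairs) {
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return grouped;
    }

    // Runs shuffle then reduce and returns one output line per key.
    static List<String> run(List<String[]> pairs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle(pairs).entrySet()) {
            out.add(e.getKey() + "\t" + reduce(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Mapper output for the five sample rows, under the assumptions above.
        List<String[]> pairs = Arrays.asList(
                new String[] {"america\t2014", "60000"},
                new String[] {"france\t2010", "40000"},
                new String[] {"india\t2012", "20000"},
                new String[] {"australia\t2001", "50000"},
                new String[] {"america\t2014", "65000"});
        for (String line : run(pairs)) {
            System.out.println(line);
        }
    }
}
```

Because the average is not associative, this `reduce` should not be reused directly as a combiner. A combiner would need to pass along partial (sum, count, min, max) tuples instead of finished averages.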

0 answers:

No answers yet