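Hadoop secondary sort

Time: 2017-07-06 01:49:16

Tags: java sorting hadoop mapper

I am trying to implement a secondary sort, using this URL as an example: https://www.safaribooksonline.com/library/view/data-algorithms/9781491906170/ch01.html

But my problem is different. I have a list of products, each with a year and month and a price: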

201505011000######PEN DRIVE00951
201505011000######PEN DRIVE00952
201505011000######PEN DRIVE00458
201505011000######PEN DRIVE00459
201505011000#######NOTEBOOK11470
201605011000#######NOTEBOOK21471
201705011000#######NOTEBOOK21472
201705011000###GAVETA DE HD01472
201703011000###GAVETA DE HD01473
201705011000###GAVETA DE HD01474

For example: 201505 is the year and month, the product name comes after the # symbols, and the trailing digits are the price, so 01470 means 14,70.
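To make the layout concrete, here is a minimal sketch of parsing one line in plain Java, without Hadoop. The class name RecordParser and the exact column offsets are assumptions inferred from the sample lines (a 12-digit timestamp, a 15-character product field left-padded with #, and a 5-digit price); they are not stated in the post.

```java
// Hypothetical helper, not part of the original program.
final class RecordParser {
    final String yearMonth; // e.g. "201705"
    final String product;   // e.g. "GAVETA DE HD"
    final int priceCents;   // e.g. 1472 for 14,72

    private RecordParser(String yearMonth, String product, int priceCents) {
        this.yearMonth = yearMonth;
        this.product = product;
        this.priceCents = priceCents;
    }

    // Assumed layout: columns 0-11 timestamp, 12-26 product field, 27-31 price.
    static RecordParser parse(String line) {
        String yearMonth = line.substring(0, 6);
        String field = line.substring(12, 27);
        // Drop the '#' padding in front of the product name.
        String product = field.substring(field.lastIndexOf('#') + 1);
        int priceCents = Integer.parseInt(line.substring(27, 32));
        return new RecordParser(yearMonth, product, priceCents);
    }

    // Renders the price with a decimal comma, as in the post (01470 -> 14,70).
    String formattedPrice() {
        return String.format("%d,%02d", priceCents / 100, priceCents % 100);
    }

    public static void main(String[] args) {
        RecordParser r = parse("201705011000###GAVETA DE HD01472");
        System.out.println(r.product + " | " + r.yearMonth + " | " + r.formattedPrice());
    }
}
```

For the sample line used in main, this yields the product GAVETA DE HD, the year and month 201705, and the price 14,72.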

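What I need is the lowest price for each product, together with the year and month of that price. I don't know how to do this; all I can produce is the lowest price and the product.

This is my program: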
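As a reference for the expected output, the sketch below computes the target result in plain Java, with no MapReduce involved. The class LowestPrice and its nested Best type are hypothetical names, and the column offsets are inferred from the sample lines. The point it illustrates is that the year and month have to travel together with the price; once only the price is kept, the date of the minimum cannot be recovered.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reference implementation, not part of the original program.
final class LowestPrice {
    // Keeps the year and month next to the price they belong to.
    static final class Best {
        final String yearMonth;
        final int price;
        Best(String yearMonth, int price) {
            this.yearMonth = yearMonth;
            this.price = price;
        }
    }

    // The sample records from the post.
    static final List<String> SAMPLE = Arrays.asList(
        "201505011000######PEN DRIVE00951",
        "201505011000######PEN DRIVE00952",
        "201505011000######PEN DRIVE00458",
        "201505011000######PEN DRIVE00459",
        "201505011000#######NOTEBOOK11470",
        "201605011000#######NOTEBOOK21471",
        "201705011000#######NOTEBOOK21472",
        "201705011000###GAVETA DE HD01472",
        "201703011000###GAVETA DE HD01473",
        "201705011000###GAVETA DE HD01474");

    static Map<String, Best> lowestPerProduct(List<String> lines) {
        Map<String, Best> best = new LinkedHashMap<>();
        for (String line : lines) {
            String yearMonth = line.substring(0, 6);
            String field = line.substring(12, 27);
            String product = field.substring(field.lastIndexOf('#') + 1);
            int price = Integer.parseInt(line.substring(27, 32));
            Best current = best.get(product);
            // Strict comparison keeps the first record seen when prices tie.
            if (current == null || price < current.price) {
                best.put(product, new Best(yearMonth, price));
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Best> e : lowestPerProduct(SAMPLE).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue().yearMonth + " " + e.getValue().price);
        }
    }
}
```

On the sample data this gives PEN DRIVE at 458 in 201505, NOTEBOOK at 11470 in 201505, and GAVETA DE HD at 1472 in 201705.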

MAPPER

public class GroupMR {
 public static class GroupMapper extends Mapper<LongWritable, Text, Product, IntWritable> {

    Product prdt = new Product();
    Text cntText = new Text();
    Text YearMonthText = new Text();
    IntWritable price = new IntWritable();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String line = value.toString();
        String produto = line.substring(13, 27); // product name, still padded with '#'
        produto = produto.substring(produto.lastIndexOf("#")+1);
        String ano = line.substring(0, 6);
        int valor = Integer.parseInt(line.substring(27, 32));
        cntText.set(new Text(produto));
        YearMonthText.set(ano);
        price.set(valor);
        Product prdt = new Product(cntText, YearMonthText);
        context.write(prdt, price);
    }
}

REDUCER

public static class GroupReducer extends Reducer<Product, IntWritable, Product, IntWritable> {

    public void reduce(Product key, Iterator<IntWritable> values, Context context) throws IOException,
            InterruptedException {
        int minValue = Integer.MAX_VALUE;

        while (values.hasNext()) {
            minValue = Math.min(minValue,values.next().get());
        }
        context.write(key, new IntWritable(minValue));
    }
}
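One likely reason this reducer has no effect: in the org.apache.hadoop.mapreduce API, Reducer.reduce takes an Iterable of values, not an Iterator. A method declared with Iterator is an overload rather than an override, so the framework never calls it and the default reduce runs instead, passing every value through unchanged. The sketch below reproduces the effect in plain Java. BaseReducer is a hypothetical stand-in for the framework class, not Hadoop code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java illustration of overload vs. override; no Hadoop classes involved.
final class OverrideDemo {
    // Hypothetical stand-in for a framework reducer with a pass-through default.
    static class BaseReducer {
        final List<String> out = new ArrayList<>();
        void reduce(String key, Iterable<Integer> values) {
            for (int v : values) {
                out.add(key + "=" + v);
            }
        }
    }

    // Iterator instead of Iterable: this is a new overload, never called by run().
    static class WrongReducer extends BaseReducer {
        void reduce(String key, Iterator<Integer> values) {
            int min = Integer.MAX_VALUE;
            while (values.hasNext()) {
                min = Math.min(min, values.next());
            }
            out.add(key + "=" + min);
        }
    }

    // Matching signature plus @Override, so the compiler checks the override.
    static class RightReducer extends BaseReducer {
        @Override
        void reduce(String key, Iterable<Integer> values) {
            int min = Integer.MAX_VALUE;
            for (int v : values) {
                min = Math.min(min, v);
            }
            out.add(key + "=" + min);
        }
    }

    // The "framework" only knows the base signature.
    static List<String> run(BaseReducer reducer) {
        reducer.reduce("PEN DRIVE", Arrays.asList(951, 952, 458, 459));
        return reducer.out;
    }

    public static void main(String[] args) {
        System.out.println(run(new WrongReducer())); // four pass-through entries
        System.out.println(run(new RightReducer())); // [PEN DRIVE=458]
    }
}
```

Adding @Override to the reduce method in the real job would turn this mistake into a compile error, which is the cheapest way to check whether it applies here.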

COMPARABLE

private static class Product implements WritableComparable<Product> {

    Text Product;
    Text YearMonth;

    public Product(Text Product, Text YearMonth) {
        this.Product = Product;
        this.YearMonth = YearMonth;
    }
    public Product() {
        this.Product = new Text();
        this.YearMonth = new Text();
    }

     public void write(DataOutput out) throws IOException {
         this.Product.write(out);
         this.YearMonth.write(out);
     }

     public void readFields(DataInput in) throws IOException {
         this.Product.readFields(in);
         this.YearMonth.readFields(in);
     }

     public int compareTo(Product pric) {
        if (pric == null)
            return 0;
        int intcnt = Product.compareTo(pric.Product);
            return intcnt; 
    }

    @Override
    public String toString() {
        return Product.toString() + " DATA: " + YearMonth.toString();
    }
}
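The compareTo above orders keys by product name only, so nothing is sorted within a product. The idea behind a secondary sort is to put the price into the composite key, sort by product and then by price, and group by product alone, so that the first record of each group already carries the lowest price and its year and month. The sketch below simulates that in plain Java. The Key class and firstPerProduct are hypothetical names; in a real job the sort would be done by the framework, and the grouping step would correspond to a comparator registered with Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java simulation of sort-then-group; no Hadoop classes involved.
final class SecondarySortDemo {
    // Hypothetical composite key: product is the natural key, price the secondary one.
    static final class Key implements Comparable<Key> {
        final String product;
        final String yearMonth;
        final int price;

        Key(String product, String yearMonth, int price) {
            this.product = product;
            this.yearMonth = yearMonth;
            this.price = price;
        }

        // Sort order: product first, then ascending price.
        @Override
        public int compareTo(Key other) {
            int byProduct = product.compareTo(other.product);
            return byProduct != 0 ? byProduct : Integer.compare(price, other.price);
        }
    }

    // The sample records from the post, already parsed.
    static List<Key> sample() {
        return Arrays.asList(
            new Key("PEN DRIVE", "201505", 951), new Key("PEN DRIVE", "201505", 952),
            new Key("PEN DRIVE", "201505", 458), new Key("PEN DRIVE", "201505", 459),
            new Key("NOTEBOOK", "201505", 11470), new Key("NOTEBOOK", "201605", 21471),
            new Key("NOTEBOOK", "201705", 21472), new Key("GAVETA DE HD", "201705", 1472),
            new Key("GAVETA DE HD", "201703", 1473), new Key("GAVETA DE HD", "201705", 1474));
    }

    // Sorts by the composite key, then keeps the first record of each product group.
    static List<String> firstPerProduct(List<Key> keys) {
        List<Key> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        List<String> out = new ArrayList<>();
        String previous = null;
        for (Key k : sorted) {
            if (!k.product.equals(previous)) {
                out.add(k.product + " " + k.yearMonth + " " + k.price);
                previous = k.product;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : firstPerProduct(sample())) {
            System.out.println(line);
        }
    }
}
```

With this ordering the reducer would not need to scan for a minimum at all; it could emit the first key it sees for each product.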

DRIVER

public static void main(String[] args) 
    throws IOException, ClassNotFoundException, InterruptedException {

    FileUtils.deleteDirectory(new File("/Local/data/output"));
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "GroupMR");
    job.setJarByClass(GroupMR.class);
    job.setMapperClass(GroupMapper.class);
    job.setReducerClass(GroupReducer.class);
    job.setOutputKeyClass(Product.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[1]));
    FileOutputFormat.setOutputPath(job, new Path(args[2]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
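A further point worth checking: the driver sets no partitioner, so Hadoop falls back to HashPartitioner, which derives the partition from key.hashCode(). The Product class does not override hashCode(), so with more than one reduce task, records of the same product could be sent to different reducers. The sketch below shows the formula in plain Java together with a key whose hash depends on the product name only. KeyWithHash is a hypothetical class used for illustration.

```java
// Plain-Java illustration of hash partitioning; no Hadoop classes involved.
final class PartitionDemo {
    // Same arithmetic that Hadoop's HashPartitioner applies to key.hashCode().
    static int partition(int hashCode, int numReduceTasks) {
        return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical key whose hash depends on the product name only, so every
    // record of one product lands in the same partition.
    static final class KeyWithHash {
        final String product;
        final String yearMonth;

        KeyWithHash(String product, String yearMonth) {
            this.product = product;
            this.yearMonth = yearMonth;
        }

        @Override
        public int hashCode() {
            return product.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof KeyWithHash && ((KeyWithHash) o).product.equals(product);
        }
    }

    public static void main(String[] args) {
        int a = partition(new KeyWithHash("PEN DRIVE", "201505").hashCode(), 4);
        int b = partition(new KeyWithHash("PEN DRIVE", "201605").hashCode(), 4);
        System.out.println(a == b); // true: same product, same partition
    }
}
```

In the real job this would mean either adding hashCode() to Product or registering a custom partitioner with Job.setPartitionerClass that looks at the product name only.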

RESULT

201605011000######PEN DRIVE00950
201505011000######PEN DRIVE00951
201505011000######PEN DRIVE00952
201505011000######PEN DRIVE00458
201505011000######PEN DRIVE00459
201505011000#######NOTEBOOK11470
201605011000#######NOTEBOOK21471
201705011000#######NOTEBOOK21472
201705011000###GAVETA DE HD01472
201703011000###GAVETA DE HD01473
201705011000###GAVETA DE HD01474

I think the problem is in the reduce and compareTo methods, but I don't know how to fix them. Can anyone help me?

0 answers:

No answers