TotalOrderPartitioner in Hadoop

Time: 2013-10-14 06:08:35

Tags: hadoop mapreduce mapper reducers

I am completely new to the concept of the TotalOrderPartitioner. I have applied it, but I have not been able to produce a globally sorted output. These are my input records:

676576
7489768576
689576857867857
685768578678578675
765897685789675879679587
1
5
6
7
8
9
0
2
3
5
6
9

This is my mapper:

public void map(LongWritable key, Text value,
            OutputCollector<NullWritable, Text> outputCollector, Reporter reporter) throws IOException {
        // TODO Auto-generated method stub
        outputCollector.collect(NullWritable.get(),value);

    }

This is my reducer:

public void reduce(NullWritable key, Iterator<Text> values,
            OutputCollector<NullWritable, Text> outputCollector, Reporter reporter) throws IOException {
        // TODO Auto-generated method stub
        while (values.hasNext()) {
            Text text = (Text) values.next();
            outputCollector.collect(key,text);

        }

    }

This is my job-related code:

JobConf jobConf = new JobConf();
jobConf.setMapperClass(TotalOrderMapper.class);
jobConf.setReducerClass(TotalOrderReducer.class);
jobConf.setMapOutputKeyClass(NullWritable.class);
jobConf.setMapOutputValueClass(Text.class);
jobConf.setOutputKeyClass(NullWritable.class);
jobConf.setOutputValueClass(Text.class);
jobConf.setPartitionerClass(TotalOrderPartitioner.class);
jobConf.setInputFormat(TextInputFormat.class);
jobConf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.addInputPath(jobConf, new Path("hdfs://localhost:9000/totalorderset.txt"));
FileOutputFormat.setOutputPath(jobConf, new Path("hdfs://localhost:9000//sortedRecords5.txt"));
Path pa = new Path("hdfs://localhost:9000//partitionfile","_partitions.lst");
TotalOrderPartitioner.setPartitionFile(jobConf,
        pa);
InputSampler.writePartitionFile(jobConf,
        new InputSampler.RandomSampler(1,1));
JobClient.runJob(jobConf);

But the records in my output are still not sorted. I do not know where I am going wrong. Can someone help me fix this? And can someone explain how input sampling is supposed to work? Thanks in advance.

1 Answer:

Answer 0 (score: 2)

MapReduce sorts on keys. Since you are keying on NullWritable, no sorting happens at all:

outputCollector.collect(NullWritable.get(),value);

Your map output key should be your input value!

outputCollector.collect(value, NullWritable.get());

Can you give that a try and let us know whether it works?

Second remark: use IntWritable instead of Text, otherwise the sort will be lexicographic!
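
Below is a minimal sketch of what this answer suggests, written against the old org.apache.hadoop.mapred API that the question uses. The class names (TotalOrderSort, SortMapper, SortReducer) are illustrative, not from the post, and the keys are kept as Text here, so the resulting order is lexicographic; switching to a numeric Writable, as suggested above, would give numeric ordering for values that fit that type. The JobConf would also need its map output and final output key/value classes changed to match (Text keys, NullWritable values).

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class TotalOrderSort {

    // Mapper: emit each record as the key so the shuffle sorts on it.
    public static class SortMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, NullWritable> {
        public void map(LongWritable key, Text value,
                OutputCollector<Text, NullWritable> output, Reporter reporter)
                throws IOException {
            output.collect(value, NullWritable.get());
        }
    }

    // Reducer: keys arrive already sorted; write each one back out.
    public static class SortReducer extends MapReduceBase
            implements Reducer<Text, NullWritable, Text, NullWritable> {
        public void reduce(Text key, Iterator<NullWritable> values,
                OutputCollector<Text, NullWritable> output, Reporter reporter)
                throws IOException {
            output.collect(key, NullWritable.get());
        }
    }
}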