Multiple mappers, one reducer in Hadoop

Time: 2017-04-27 18:27:39

Tags: java hadoop mapreduce mapper reducers

I need to read two files with the following data. file1.txt is:
2 3
3 2
4 1
5 1
1 1
and file2.txt is:
1 2
3 4
5 1
2 4
4 5
2 5
2 3
3 2

Now I want the output to be:
1 1:2
2 3:3,4,5
3 2:2,4
4 1:5
5 1:1

My mappers are as follows:

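// Imports needed at the top of the enclosing source file:
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Mapper for file1.txt: emits (first column, second column).
public static class OutDegreeMapper1 
    extends Mapper<Object, Text, Text, Text>
{

    private Text word = new Text();
    private Text word2 = new Text();

    public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException 
    {
        String oneLine = value.toString();
        // Split on any whitespace so both spaces and tabs are accepted.
        String[] parts = oneLine.trim().split("\\s+");
        if (parts.length < 2) {
            return; // skip blank or malformed lines
        }
        // Each line has two fields, so the valid indices are 0 and 1
        // (parts[2] would throw ArrayIndexOutOfBoundsException).
        word.set(parts[0]);
        word2.set(parts[1]);
        context.write(word, word2);
    }
}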

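// Mapper for file2.txt: emits (first column, second column).
public static class OutDegreeMapper2 
    extends Mapper<Object, Text, Text, Text>
{

    private Text word = new Text();
    private Text word2 = new Text();

    public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException 
    {
        String oneLine = value.toString();
        // Split on any whitespace so both tabs and spaces are accepted.
        String[] parts = oneLine.trim().split("\\s+");
        if (parts.length < 2) {
            return; // skip blank or malformed lines
        }
        word.set(parts[0]);
        word2.set(parts[1]);
        context.write(word, word2);
    }
}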
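
Both mappers above emit the same kind of pair, so the reducer has no way to tell which file a value came from. One possible approach is to prefix each value with a tag naming its source before emitting it. The sketch below shows only that parsing and tagging step in plain Java, with no Hadoop dependency. The class name `LineTagger`, the method `tagLine`, and the tags `A` and `B` are hypothetical names chosen for illustration, and two whitespace-separated columns per line are assumed.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper: turns one input line into a (key, tagged value) pair.
class LineTagger {

    // tag is an assumed marker, e.g. "A" for file1.txt and "B" for file2.txt.
    static Map.Entry<String, String> tagLine(String line, String tag) {
        // Accept either spaces or tabs between the two columns.
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected two columns: " + line);
        }
        // Key is the first column; value is the second column with the tag in front.
        return new SimpleEntry<>(parts[0], tag + "|" + parts[1]);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> fromFile1 = tagLine("2 3", "A");
        Map.Entry<String, String> fromFile2 = tagLine("2\t4", "B");
        System.out.println(fromFile1.getKey() + " -> " + fromFile1.getValue());
        System.out.println(fromFile2.getKey() + " -> " + fromFile2.getValue());
    }
}
```

In a real mapper the resulting pair would be written with `context.write(new Text(key), new Text(taggedValue))`, with one tag used by the file1.txt mapper and the other by the file2.txt mapper.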

My reducer is:

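public static class OutDegreeReducer 
    extends Reducer<Text,Text,Text,Text> 
{
    private Text word = new Text();

    public void reduce(Text key, Iterable<Text> values, 
                                Context context
                        ) throws IOException, InterruptedException 
    {
        // Build the result in a local buffer so nothing leaks between keys.
        StringBuilder merge = new StringBuilder();
        int i = 0;
        for (Text value : values)
        {
            if (i == 0) {
                // Caveat: Hadoop does not order the values within a key, so this
                // first value is not guaranteed to be the one from file1.txt.
                merge.append(value.toString()).append(":");
            }
            else {
                // Separate the remaining values with commas.
                if (i > 1) {
                    merge.append(",");
                }
                merge.append(value.toString());
            }
            i++;
        }
        word.set(merge.toString());
        context.write(key, word);
    }
}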
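
Since the reducer cannot rely on the order in which values arrive for a key, a more robust merge would separate the values by source and sort the file2.txt values explicitly. The following is a plain-Java sketch of that merge, with no Hadoop dependency. It assumes each value arrives tagged as `A|x` (from file1.txt) or `B|y` (from file2.txt). The tag format and the names `TaggedMerge` and `merge` are assumptions for illustration, not part of the job above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper holding the merge logic a reducer could apply per key.
class TaggedMerge {

    // Builds "<file1 value>:<sorted, comma-separated file2 values>".
    static String merge(Iterable<String> taggedValues) {
        String head = "";                       // value tagged "A|" (file1.txt)
        List<Integer> tail = new ArrayList<>(); // values tagged "B|" (file2.txt)
        for (String v : taggedValues) {
            if (v.startsWith("A|")) {
                head = v.substring(2);
            } else if (v.startsWith("B|")) {
                tail.add(Integer.parseInt(v.substring(2)));
            }
        }
        // Sort numerically so the output does not depend on arrival order.
        Collections.sort(tail);
        StringBuilder out = new StringBuilder(head).append(":");
        for (int i = 0; i < tail.size(); i++) {
            if (i > 0) {
                out.append(",");
            }
            out.append(tail.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Values for key 2 from the question, deliberately out of order.
        System.out.println(merge(Arrays.asList("B|4", "A|3", "B|5", "B|3"))); // prints 3:3,4,5
    }
}
```

Inside an actual `reduce` method, each `Text` value would need to be converted with `toString()` before being stored, because Hadoop reuses the same `Text` instance while iterating over the values.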

The problem is that I am not getting the data in the required order. Can anyone tell me how to get the desired output?

current output, not sequential, wrong one

0 answers:

No answers