What is the processing order of input files in MapReduce?

Date: 2014-12-17 14:50:52

Tags: java hadoop mapreduce

I use FileInputFormat.setInputPaths(job, new Path(input1), new Path(input2)); to add two input files to a MapReduce program, and I assumed the program would finish processing input1 before it starts processing input2.

But when I swap the order of the two paths, the program gives the same result. Why? (Processing input2 requires that input1 has already been fully processed.)
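For background: map tasks over the input splits are scheduled in parallel, so the order of the paths passed to setInputPaths implies no processing order. If input1 really must be fully processed before input2, one common approach is to chain two jobs in the driver. A sketch only, assuming the standard Hadoop 2.x Job API; "build co-occurrence matrix", intermediate, and output are placeholder names, not from the original program:

```java
// Sketch: run two jobs back to back so that input1 is fully
// processed before input2 is read. waitForCompletion(true)
// blocks until the first job finishes.
Configuration conf = new Configuration();

Job job1 = Job.getInstance(conf, "build co-occurrence matrix");
FileInputFormat.addInputPath(job1, new Path(input1));
FileOutputFormat.setOutputPath(job1, new Path(intermediate));
if (!job1.waitForCompletion(true)) {
    System.exit(1);  // stop if the first pass failed
}

Job job2 = Job.getInstance(conf, "score against matrix");
FileInputFormat.addInputPath(job2, new Path(input2));
// job2 can read job1's output here, e.g. via job2.addCacheFile(...).
FileOutputFormat.setOutputPath(job2, new Path(output));
System.exit(job2.waitForCompletion(true) ? 0 : 1);
```

This fragment needs a Hadoop cluster (or local runner) to execute; it is meant to show the sequencing pattern, not a complete driver.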

// Imports required at the top of the enclosing file:
// import java.io.IOException;
// import java.util.ArrayList;
// import java.util.HashMap;
// import java.util.List;
// import java.util.Map;
// import org.apache.hadoop.io.IntWritable;
// import org.apache.hadoop.io.Text;
// import org.apache.hadoop.mapreduce.Mapper;

public static class CalculateResultMapper extends Mapper<Object, Text, IntWritable, Text> {
    private final static IntWritable k = new IntWritable();
    private final static Text v = new Text();
    private final static Map<Integer, List<Cooccurrence>> matrix = 
            new HashMap<Integer, List<Cooccurrence>>();

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String[] tokens = Recommend.DELIMITER.split(value.toString());
        String[] v1 = tokens[0].split(":");
        String[] v2 = tokens[1].split(":");

        //processing input1
        if(v1.length > 1) {
            int itemID1 = Integer.parseInt(v1[0]);
            int itemID2 = Integer.parseInt(v1[1]);
            int num = Integer.parseInt(tokens[1]);
            CalculateResult_4 c = new CalculateResult_4();
            List<Cooccurrence> list = null;
            if(!matrix.containsKey(itemID1)) {
                list = new ArrayList<Cooccurrence>();
            } else {
                list = matrix.get(itemID1);
            }
            list.add(c.new Cooccurrence(itemID1, itemID2, num));
            matrix.put(itemID1, list);
        }

        //processing input2
        if(v2.length > 1) {
            int itemID = Integer.parseInt(tokens[0]);
            int userID = Integer.parseInt(v2[0]);
            double pref = Double.parseDouble(v2[1]);
            k.set(userID);
            for(Cooccurrence co : matrix.get(itemID)) {
                v.set(co.getItemID2() + "," + pref * co.getNum());
                context.write(k, v);
            }
        }
    }
}
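Note that the input2 branch assumes matrix.get(itemID) is non-null, which only holds if the same mapper instance already saw the matching input1 record. A minimal, Hadoop-free sketch of that hazard (record formats simplified; class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderHazard {
    // Mimics the mapper's in-memory co-occurrence matrix.
    static final Map<Integer, List<int[]>> matrix = new HashMap<>();

    // Simulates an input1-style record: itemID1:itemID2 <tab> num
    static void loadCooccurrence(int itemID1, int itemID2, int num) {
        matrix.computeIfAbsent(itemID1, k -> new ArrayList<>())
              .add(new int[]{itemID2, num});
    }

    // Simulates an input2-style record: itemID <tab> userID:pref
    static List<String> score(int itemID, int userID, double pref) {
        List<int[]> row = matrix.get(itemID);  // null if input1 row not seen yet
        if (row == null) {
            throw new NullPointerException("no co-occurrence row for item " + itemID);
        }
        List<String> out = new ArrayList<>();
        for (int[] co : row) {
            out.add(userID + "\t" + co[0] + "," + pref * co[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        // If the input1 record was loaded first, scoring works:
        loadCooccurrence(101, 102, 3);
        System.out.println(score(101, 1, 2.0));

        // If an input2 record for an unseen item arrives first, it fails,
        // exactly like matrix.get(itemID) in the posted mapper would:
        try {
            score(999, 1, 2.0);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

So regardless of path order, correctness here depends on which records each mapper happens to see first, which Hadoop does not guarantee.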

0 Answers:

There are no answers yet.