Hadoop partitioner not working

Asked: 2014-10-26 20:29:04

Tags: sorting hadoop mapreduce hadoop-partitioning

public class Partitioner_2 implements Partitioner<Text,Text>{

    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        int hashValue=0;
        for(char c: key.toString().split("\\|\\|")[0].toCharArray()){
            hashValue+=(int)c;
        }
        return Math.abs(hashValue * 127) % numPartitions;
    }
}

This is my partitioner code. The keys have the format "str1||str2", and I want every key with the same str1 value to be sent to the same reducer.

My GroupComparator and KeyComparator are as follows:

public static class GroupComparator_2 extends WritableComparator {
    protected GroupComparator_2() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        Text kw1 = (Text) w1;
        Text kw2 = (Text) w2;
        String k1=kw1.toString().split("||")[0].trim();
        String k2=kw2.toString().split("||")[0].trim();
        return k1.compareTo(k2);
    }
}


public static class KeyComparator_2 extends WritableComparator {

    protected KeyComparator_2() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        Text key1 = (Text) w1;
        Text key2 = (Text) w2;
        String kw1_key1=key1.toString().split("||")[0];
        String kw1_key2=key2.toString().split("||")[0];
        int cmp=kw1_key1.compareTo(kw1_key2);
        if(cmp==0){
            String kw2_key1=key1.toString().split("||")[1].trim();
            String kw2_key2=key2.toString().split("||")[1].trim();
            cmp=kw2_key1.compareTo(kw2_key2);
        }
        return cmp;
    }
}

The errors I am currently getting are:

KeywordKeywordCoOccurrence_2.java:92: interface expected here
     public class Partitioner_2 implements Partitioner<Text,Text>{ 
                                                      ^
KeywordKeywordCoOccurrence_2.java:94: method does not override or implement a method from a supertype
        @Override
        ^
KeywordKeywordCoOccurrence_2.java:147: setPartitionerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Partitioner>) in org.apache.hadoop.mapreduce.Job cannot be applied to (java.lang.Class<KeywordKeywordCoOccurrence_2.Partitioner_2>)
    job.setPartitionerClass(Partitioner_2.class);

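But as far as I can tell, I have overridden getPartition(), which is the only method in the Partitioner interface. Any help identifying what I am doing wrong and how to fix it would be greatly appreciated.

Thanks in advance!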

1 Answer:

Answer 0 (score: 0)

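Partitioner is an abstract class in the new mapreduce API (org.apache.hadoop.mapreduce.Partitioner), which is evidently the one you are using, since the error from job.setPartitionerClass names that type. A class cannot "implement" an abstract class, which is why the compiler reports "interface expected here". Partitioner is an interface only in the old org.apache.hadoop.mapred API.

So you should declare it as: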

public class Partitioner_2 extends Partitioner<Text, Text> {
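
Once the class compiles, it may be worth checking the key-splitting logic as well. The partitioner escapes the delimiter ("\\|\\|"), but both comparators in the question call split("||") unescaped. In a Java regex that is an empty alternation that matches between every character, so [0] is not str1. The following is only a minimal sketch that runs without Hadoop. PartitionSketch, DELIM, firstField and partitionFor are hypothetical names invented for illustration, and masking with Integer.MAX_VALUE is a swapped-in alternative to Math.abs, not the asker's code.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Hadoop: it mirrors the question's
// partitioning logic on plain Strings so it can be run without a cluster.
class PartitionSketch {

    // Literal "||" delimiter; Pattern.quote removes the regex meaning of '|'.
    static final Pattern DELIM = Pattern.compile(Pattern.quote("||"));

    // Returns the part of the key before the first "||" (str1).
    static String firstField(String key) {
        return DELIM.split(key, 2)[0];
    }

    // Same character-sum hash as the question's getPartition(), but masked
    // with Integer.MAX_VALUE so the result can never be negative.
    static int partitionFor(String key, int numPartitions) {
        int hashValue = 0;
        for (char c : firstField(key).toCharArray()) {
            hashValue += c;
        }
        return ((hashValue * 127) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Keys that share str1 map to the same partition index.
        System.out.println(partitionFor("apple||banana", 4));
        System.out.println(partitionFor("apple||cherry", 4));
        // The unescaped pattern does not return "apple" as the first element.
        System.out.println("apple||banana".split("||")[0]);
    }
}
```

If this matches what you see, the same escaped delimiter (or a shared helper like firstField) could be used inside GroupComparator_2 and KeyComparator_2 so that partitioning, sorting and grouping all agree on what str1 is.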