ArrayWritable as a key in Hadoop MapReduce

Date: 2016-08-31 11:38:09

Tags: hadoop mapreduce writable

I am trying to build a dynamic MapReduce application that picks up its dimensions from an external properties file. The main problem is that the key will be composite and can consist of any number of fields, e.g. a combination of 3 keys, a combination of 4 keys, and so on.

My mapper:

public void map(AvroKey<flumeLogs> key, NullWritable value, Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    int dimensionCount = Integer.parseInt(conf.get("dimensionCount"));
    String[] dimensions = conf.get("dimensions").split(","); // this gets the dimensions from the run method in main

    Text[] values = new Text[dimensionCount]; // this is supposed to be my composite key

    for (int i = 0; i < dimensionCount; i++) {
        switch (dimensions[i]) {

        case "region": values[i] = new Text("-");
            break;

        case "event": values[i] = new Text("-");
            break;

        case "eventCode": values[i] = new Text("-");
            break;

        case "mobile": values[i] = new Text("-");
            break;
        }
    }
    context.write(new StringArrayWritable(values), new IntWritable(1));
}

The values will get proper logic later on.
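For reference, a minimal sketch of how a driver could feed those settings into the job Configuration (the class name, job name, and concrete values here are placeholders, not from the original question):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;

public class DynamicDimensionsDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // These values would normally come from the external properties file.
        conf.set("dimensions", "region,event,eventCode,mobile");
        conf.set("dimensionCount", "4");

        Job job = Job.getInstance(conf, "dynamic-dimensions");
        job.setJarByClass(DynamicDimensionsDriver.class);
        job.setMapOutputKeyClass(StringArrayWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        // ...mapper, reducer, and input/output paths would be configured here.
        return job.waitForCompletion(true) ? 0 : 1;
    }
}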

My StringArrayWritable:

public class StringArrayWritable extends ArrayWritable {
    public StringArrayWritable() {
        super(Text.class);
    }

    public StringArrayWritable(Text[] values) {
        super(Text.class);
        // Deep-copy the incoming array so the key owns its own Text instances.
        Text[] texts = new Text[values.length];
        for (int i = 0; i < values.length; i++) {
            texts[i] = new Text(values[i]);
        }
        set(texts);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();

        for (String s : super.toStrings()) {
            sb.append(s).append("\t");
        }

        return sb.toString();
    }
}

The error I'm getting:

    Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :class StringArrayWritable
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: class StringArrayWritable
    at java.lang.Class.asSubclass(Class.java:3165)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:892)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1005)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
    ... 9 more

Any help would be greatly appreciated.

Thanks a lot.

1 answer:

Answer 0 (score: 1):

You're trying to use a Writable object as the key. In MapReduce, the key must implement the WritableComparable interface; ArrayWritable only implements the Writable interface.

The difference between the two is that WritableComparable additionally requires you to implement a compareTo method, so that MapReduce is able to sort and group the keys correctly.
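A minimal sketch of what the key class could look like with that interface implemented (the field-by-field comparison order is an assumption, not something Hadoop prescribes):

import java.util.Arrays;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

public class StringArrayWritable extends ArrayWritable
        implements WritableComparable<StringArrayWritable> {

    public StringArrayWritable() {
        super(Text.class);
    }

    public StringArrayWritable(Text[] values) {
        super(Text.class, values);
    }

    @Override
    public int compareTo(StringArrayWritable other) {
        // Compare field by field; on a tie the shorter key sorts first.
        String[] a = this.toStrings();
        String[] b = other.toStrings();
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = a[i].compareTo(b[i]);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    @Override
    public int hashCode() {
        // Keep hashCode consistent with compareTo/equals so the default
        // HashPartitioner sends equal keys to the same reducer.
        return Arrays.hashCode(toStrings());
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof StringArrayWritable
                && compareTo((StringArrayWritable) o) == 0;
    }
}

With this in place, getOutputKeyComparator can build a comparator for the key class, which is exactly the step that throws the ClassCastException in the stack trace above.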