Do all built-in Writables use a RawComparator by default?

Time: 2013-09-25 04:49:14

Tags: hadoop

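I'm confused: do all built-in Writables, such as IntWritable, FloatWritable, GenericWritable, etc., use a RawComparator for comparison by default? If not, how should we register them so that they use a RawComparator?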

1 answer:

Answer 0: (score: 2)

How the RawComparator is chosen can be seen in JobConf.getOutputKeyComparator:

  public RawComparator getOutputKeyComparator() {
    Class<? extends RawComparator> theClass = getClass("mapred.output.key.comparator.class",
            null, RawComparator.class);
    if (theClass != null)
      return ReflectionUtils.newInstance(theClass, this);
    return WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class));
  }

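Hadoop first tries to read the RawComparator class name from mapred.output.key.comparator.class. If that property is not set, Hadoop casts the map output key class to WritableComparable and uses it to obtain a WritableComparator. So if we do not set a custom RawComparator, execution goes into WritableComparator.get: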

  public static synchronized 
  WritableComparator get(Class<? extends WritableComparable> c) {
    WritableComparator comparator = comparators.get(c);
    if (comparator == null) {
      // force the static initializers to run
      forceInit(c);
      // look to see if it is defined now
      comparator = comparators.get(c);
      // if not, use the generic one
      if (comparator == null) {
        comparator = new WritableComparator(c, true);
      }
    }
    return comparator;
  }

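WritableComparator.get first looks for a WritableComparator in the comparators map. If none is found even after the key class's static initializers have run, it falls back to a generic WritableComparator.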
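The lookup-then-fallback pattern can be sketched in plain Java without Hadoop on the classpath. This is only an illustration under stated assumptions: ComparatorRegistrySketch, define, get and isRegistered are hypothetical names that mimic the shape of WritableComparator, natural ordering stands in for the generic comparator, and the forceInit step is left out.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the registry inside WritableComparator; not Hadoop code.
final class ComparatorRegistrySketch {
    // Maps a key class to the comparator registered for it.
    private static final Map<Class<?>, Comparator<?>> COMPARATORS = new ConcurrentHashMap<>();

    // Plays the role of WritableComparator.define(...).
    static <T> void define(Class<T> keyClass, Comparator<? super T> comparator) {
        COMPARATORS.put(keyClass, comparator);
    }

    // Plays the role of WritableComparator.get(...):
    // return the registered comparator if there is one, else a generic fallback.
    @SuppressWarnings("unchecked")
    static <T extends Comparable<T>> Comparator<T> get(Class<T> keyClass) {
        Comparator<T> registered = (Comparator<T>) COMPARATORS.get(keyClass);
        if (registered != null) {
            return registered;
        }
        // Assumption: natural ordering stands in for the generic comparator.
        return Comparator.naturalOrder();
    }

    // True when a comparator has been explicitly registered for the class.
    static boolean isRegistered(Class<?> keyClass) {
        return COMPARATORS.containsKey(keyClass);
    }

    public static void main(String[] args) {
        define(Integer.class, Comparator.reverseOrder());
        // Integer has a registered (reverse-order) comparator.
        System.out.println(get(Integer.class).compare(1, 2));
        // String has none, so the natural-order fallback is used.
        System.out.println(get(String.class).compare("a", "b"));
    }
}
```

The point of the design is that a key class with a specialized comparator gets it for free, while every other key class still works through the slower generic path.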

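Most built-in Writables, such as IntWritable, call define when their class is loaded, which puts their WritableComparator (for example, org.apache.hadoop.io.IntWritable.Comparator) into comparators. So if you want to register a custom RawComparator, you can use the following code (make sure it is in the body of your Writable class):

  static { // register this comparator
    WritableComparator.define(IntWritable.class, new Comparator());
  }

Next, what happens if a WritableComparable has no WritableComparator registered? Then the default behavior of WritableComparator applies: it deserializes both keys and calls WritableComparable.compareTo to compare them.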
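To make the difference between the two paths concrete, here is a minimal sketch in plain Java, assuming a key serialized as a 4-byte big-endian int (the layout DataOutput.writeInt produces). The class and method names are hypothetical, and the byte-wise comparison is one possible raw strategy for illustration, not a copy of IntWritable.Comparator.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of raw vs. deserializing comparison; not Hadoop code.
final class RawVersusObjectCompareSketch {
    // Serializes an int as 4 big-endian bytes.
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Reads a 4-byte big-endian int starting at the given offset.
    static int deserialize(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getInt();
    }

    // Default-style path: rebuild both key objects, then call compareTo.
    static int compareByDeserializing(byte[] a, int aOff, byte[] b, int bOff) {
        Integer left = deserialize(a, aOff);
        Integer right = deserialize(b, bOff);
        return left.compareTo(right);
    }

    // Raw path: compare the serialized bytes directly, with no key objects created.
    static int compareRaw(byte[] a, int aOff, byte[] b, int bOff) {
        for (int i = 0; i < 4; i++) {
            int left = a[aOff + i] & 0xFF;
            int right = b[bOff + i] & 0xFF;
            if (i == 0) {
                // Flip the sign bit so that negative values sort before positive ones.
                left ^= 0x80;
                right ^= 0x80;
            }
            if (left != right) {
                return left < right ? -1 : 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -7, -1, 0, 1, 42, Integer.MAX_VALUE};
        boolean agree = true;
        for (int x : samples) {
            for (int y : samples) {
                int raw = compareRaw(serialize(x), 0, serialize(y), 0);
                int obj = compareByDeserializing(serialize(x), 0, serialize(y), 0);
                agree &= Integer.signum(raw) == Integer.signum(obj);
            }
        }
        System.out.println("raw and deserializing comparisons agree: " + agree);
    }
}
```

Both methods produce the same ordering; the raw one simply avoids creating key objects for every comparison, which is why registering a raw comparator matters during the sort phase.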