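I am confused: do all the built-in Writables, such as IntWritable, FloatWritable, and GenericWritable, use a raw comparator for comparison by default? If not, how should we register them so that a RawComparator is used?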
Answer 0 (score: 2)
How the RawComparator is chosen is decided in JobConf.getOutputKeyComparator:
    public RawComparator getOutputKeyComparator() {
      Class<? extends RawComparator> theClass = getClass("mapred.output.key.comparator.class",
          null, RawComparator.class);
      if (theClass != null)
        return ReflectionUtils.newInstance(theClass, this);
      return WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class));
    }
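Hadoop first tries to read the RawComparator class name from mapred.output.key.comparator.class. If that property is not set, Hadoop casts the map output key class to WritableComparable and uses it to obtain a WritableComparator. So if we do not set a custom RawComparator, execution goes into WritableComparator.get.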
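The lookup order described here can be modelled without Hadoop on the classpath. The sketch below is a simplified stand-in, not Hadoop code: ComparatorSelection, choose, and ReverseOrder are hypothetical names, java.util.Properties replaces JobConf, and java.util.Comparator replaces RawComparator. It only illustrates the "configured class first, default second" decision.

```java
import java.util.Comparator;
import java.util.Properties;

// Simplified model of the decision made in getOutputKeyComparator (not Hadoop code).
class ComparatorSelection {
    static final String KEY = "mapred.output.key.comparator.class";

    @SuppressWarnings("unchecked")
    static Comparator<Integer> choose(Properties conf) {
        String className = conf.getProperty(KEY);
        if (className != null) {
            // A comparator class was configured: instantiate it reflectively.
            try {
                Class<?> cls = Class.forName(className);
                return (Comparator<Integer>) cls.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot create " + className, e);
            }
        }
        // Nothing configured: fall back to the key type's default ordering.
        return Comparator.naturalOrder();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(choose(conf).compare(1, 2) < 0);  // prints true (default ordering)
        conf.setProperty(KEY, ReverseOrder.class.getName());
        System.out.println(choose(conf).compare(1, 2) > 0);  // prints true (configured ordering)
    }
}

// Hypothetical custom comparator, standing in for a user-supplied RawComparator.
class ReverseOrder implements Comparator<Integer> {
    @Override
    public int compare(Integer a, Integer b) {
        return b.compareTo(a);
    }
}
```

In a real job the property would normally be set through the job configuration rather than by hand; the model only shows why a configured comparator always wins over the registered one.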
    public static synchronized
    WritableComparator get(Class<? extends WritableComparable> c) {
      WritableComparator comparator = comparators.get(c);
      if (comparator == null) {
        // force the static initializers to run
        forceInit(c);
        // look to see if it is defined now
        comparator = comparators.get(c);
        // if not, use the generic one
        if (comparator == null) {
          comparator = new WritableComparator(c, true);
        }
      }
      return comparator;
    }
WritableComparator.get first looks for a WritableComparator for the key class in the comparators map.
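To make the "look up, force static initialization, look up again, fall back" sequence concrete, here is a minimal JDK-only model of such a registry. It is a sketch under assumptions, not the Hadoop implementation: Registry, SelfRegisteringKey, FastComparator, and PlainKey are hypothetical names, and java.util.Comparator stands in for WritableComparator.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the comparator registry (not Hadoop code; all names are hypothetical).
class Registry {
    private static final Map<Class<?>, Comparator<?>> COMPARATORS = new ConcurrentHashMap<>();

    // Counterpart of define: register a comparator for a key class.
    static void define(Class<?> c, Comparator<?> comparator) {
        COMPARATORS.put(c, comparator);
    }

    // Counterpart of get: registry first, generic fallback second.
    @SuppressWarnings("unchecked")
    static <T extends Comparable<T>> Comparator<T> get(Class<T> c) {
        Comparator<?> comparator = COMPARATORS.get(c);
        if (comparator == null) {
            forceInit(c);                     // run the key class's static initializer
            comparator = COMPARATORS.get(c);  // the class may have registered itself just now
            if (comparator == null) {
                comparator = Comparator.<T>naturalOrder();  // generic compareTo-based fallback
            }
        }
        return (Comparator<T>) comparator;
    }

    // A class literal alone does not initialize a class, so initialization is forced here.
    private static void forceInit(Class<?> c) {
        try {
            Class.forName(c.getName(), true, c.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints FastComparator
        System.out.println(get(SelfRegisteringKey.class).getClass().getSimpleName());
        // prints true
        System.out.println(get(PlainKey.class).compare(new PlainKey(1), new PlainKey(2)) < 0);
    }
}

// Key type that registers its own comparator when the class is initialized.
class SelfRegisteringKey implements Comparable<SelfRegisteringKey> {
    final int value;
    SelfRegisteringKey(int value) { this.value = value; }
    @Override
    public int compareTo(SelfRegisteringKey o) { return Integer.compare(value, o.value); }

    static class FastComparator implements Comparator<SelfRegisteringKey> {
        @Override
        public int compare(SelfRegisteringKey a, SelfRegisteringKey b) {
            return Integer.compare(a.value, b.value);
        }
    }

    static { Registry.define(SelfRegisteringKey.class, new FastComparator()); }
}

// Key type that registers nothing, so the generic fallback is used.
class PlainKey implements Comparable<PlainKey> {
    final int value;
    PlainKey(int value) { this.value = value; }
    @Override
    public int compareTo(PlainKey o) { return Integer.compare(value, o.value); }
}
```

The forced initialization matters because the key class may never have been touched before the lookup, in which case its static block has not yet run and the registry is still empty for it.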
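Most built-in Writables, such as IntWritable, call define when they are loaded to put their WritableComparator (for example org.apache.hadoop.io.IntWritable.Comparator) into comparators. So if you want to register a custom RawComparator, you can use the following code (make sure it sits in the body of your Writable class):

    static { // register this comparator
      WritableComparator.define(IntWritable.class, new Comparator());
    }

Next, what if a WritableComparable has not registered a WritableComparator? Then the default behavior of WritableComparator applies: it deserializes the two keys and calls compareTo to compare them.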
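The practical difference between a registered raw comparator and the generic fallback is whether key objects are rebuilt before comparing. The JDK-only sketch below illustrates that contrast; it assumes keys serialized as 4-byte big-endian ints (the layout produced by DataOutput.writeInt), and RawVsObjectCompare with its methods is a hypothetical name, not a Hadoop class.

```java
import java.nio.ByteBuffer;

// Illustration only: raw byte comparison versus deserialize-then-compareTo.
class RawVsObjectCompare {
    // Raw style: decode the int in place from the serialized bytes, no key objects created.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Fallback style: rebuild key objects first, then call compareTo on them.
    static int compareByDeserializing(byte[] b1, int s1, byte[] b2, int s2) {
        Integer key1 = ByteBuffer.wrap(b1, s1, 4).getInt();
        Integer key2 = ByteBuffer.wrap(b2, s2, 4).getInt();
        return key1.compareTo(key2);
    }

    // Reads a big-endian 32-bit int starting at the given offset.
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             | (bytes[start + 3] & 0xff);
    }

    // Serializes an int as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] serialize(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(7);
        System.out.println(compareRaw(a, 0, b, 0) < 0);              // prints true
        System.out.println(compareByDeserializing(a, 0, b, 0) < 0);  // prints true
    }
}
```

Both paths give the same ordering; the raw path simply avoids allocating and populating key objects for every comparison, which is why built-in types register one.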