I would appreciate any help.
I am writing a Hadoop program. The map output key is the class org.apache.mahout.clustering.kmeans.Kluster, which does not implement WritableComparable. So I added
job.getConfiguration().setClass("mapred.output.key.comparator.class", KlusterComparable.class, RawComparator.class);
to my code, and defined KlusterComparable as follows:
public static class KlusterComparable implements RawComparator<Kluster>{
    @Override
    public int compare(Kluster k1, Kluster k2) {
        Vector v1 = k1.getCenter();
        Vector v2 = k2.getCenter();
        int res = 0;
        int vsize;
        if(v1.size() < v2.size())
            vsize = v2.size();
        else
            vsize = v1.size();
        for(int i=0; i<vsize; i++){
            if(v1.get(i) < v2.get(i)){
                res = -1;
                break;
            }else if(v1.get(i) > v2.get(i)){
                res = 1;
                break;
            }
        }
        return res;
    }
    @Override
    public int compare(byte[] k1, int s1, int l1, byte[] k2,
            int s2, int l2) {
        Kluster kl1 = null;
        Kluster kl2 = null;
        byte[] b1 = Arrays.copyOfRange(k1, s1, s1+l1-1);
        byte[] b2 = Arrays.copyOfRange(k1, s2, s2+l2-1);
        try{
            kl1 = (Kluster)(SerializationUtils.deserialize(b1));
            kl2 = (Kluster)(SerializationUtils.deserialize(b2));
        }catch(Exception ex){
            System.out.println("Exception!!!");
        }
        return compare(kl1, kl2);
    }
}
However, when I run the jar on Hadoop, the job fails with the error: FAILED
java.io.IOException: Spill failed
When the exception is caught, my code prints Exception!!!.
Answer 0 (score: 0)
The third argument of Arrays.copyOfRange is exclusive. For example, if the original array is [0, 1, 2, 3, 4], then Arrays.copyOfRange(a, 1, 3) returns [1, 2].
Your code should be:
byte[] b1 = Arrays.copyOfRange(k1, s1, s1+l1);
byte[] b2 = Arrays.copyOfRange(k2, s2, s2+l2); // the second key lives in k2, not k1
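To make the exclusive end index concrete, here is a minimal, self-contained sketch using only the JDK. The class name CopyOfRangeDemo and the helper slice are made up for illustration; they are not part of Hadoop or Mahout.

```java
import java.util.Arrays;

public class CopyOfRangeDemo {
    // Hypothetical helper: copies `length` bytes starting at `start`.
    // The end index given to Arrays.copyOfRange is exclusive, so it must be
    // start + length, not start + length - 1.
    static byte[] slice(byte[] buf, int start, int length) {
        return Arrays.copyOfRange(buf, start, start + length);
    }

    public static void main(String[] args) {
        byte[] a = {0, 1, 2, 3, 4};
        // End index 3 is excluded, so only indices 1 and 2 are copied.
        System.out.println(Arrays.toString(Arrays.copyOfRange(a, 1, 3))); // [1, 2]
        // Three bytes starting at index 1.
        System.out.println(Arrays.toString(slice(a, 1, 3)));              // [1, 2, 3]
        // Subtracting 1 from the length silently drops the last byte.
        System.out.println(Arrays.toString(slice(a, 1, 3 - 1)));          // [1, 2]
    }
}
```

With the "-1" in place, each serialized key loses its final byte before it is deserialized.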
In fact, you can learn how comparison is done in Hadoop from WritableComparator. Here is an example that borrows some ideas from it.
import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.RawComparator;
import org.apache.mahout.clustering.kmeans.Kluster;

public class KlusterComparator implements RawComparator<Kluster> {
    // Reusable key instances and buffer, so no objects are allocated per comparison
    private final Kluster key1;
    private final Kluster key2;
    private final DataInputBuffer buffer;

    public KlusterComparator() {
        key1 = new Kluster();
        key2 = new Kluster();
        buffer = new DataInputBuffer();
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            buffer.reset(b1, s1, l1); // parse key1
            key1.readFields(buffer);
            buffer.reset(b2, s2, l2); // parse key2
            key2.readFields(buffer);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return compare(key1, key2); // compare them
    }

    @Override
    public int compare(Kluster o1, Kluster o2) {
        // Placeholder: compare o1 and o2 here (for example, by their centers)
        return 0;
    }
}
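For the object-level comparison, one possible approach is a lexicographic comparison of the cluster centers. The sketch below uses plain double[] arrays as a stand-in for the Mahout Vector returned by getCenter(); the class name CenterCompareDemo and the method compareCenters are hypothetical. Note that the loop in the question runs up to the larger of the two sizes, so it would read past the end of the shorter vector, which is likely to fail. This sketch stops at the shorter length and uses the length as a tie-breaker.

```java
public class CenterCompareDemo {
    // Hypothetical stand-in for comparing two cluster centers:
    // lexicographic order over the coordinates, shorter array first on a tie.
    static int compareCenters(double[] c1, double[] c2) {
        int common = Math.min(c1.length, c2.length); // never index past the shorter array
        for (int i = 0; i < common; i++) {
            int cmp = Double.compare(c1[i], c2[i]);
            if (cmp != 0) {
                return cmp; // first differing coordinate decides
            }
        }
        return Integer.compare(c1.length, c2.length);
    }

    public static void main(String[] args) {
        System.out.println(compareCenters(new double[]{1.0, 2.0}, new double[]{1.0, 3.0}) < 0);  // true
        System.out.println(compareCenters(new double[]{1.0, 2.0}, new double[]{1.0, 2.0}) == 0); // true
        System.out.println(compareCenters(new double[]{1.0, 2.0, 0.5}, new double[]{1.0, 2.0}) > 0); // true
    }
}
```

Double.compare gives NaN and -0.0 a consistent position in the ordering, which differs slightly from the plain < and > checks in the question. Whether ordering by center is appropriate for your job is an assumption here, not something the Mahout API prescribes.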