K-Means Iteration failed processing output/clusters-2

Time: 2014-03-08 07:16:47

Tags: java hadoop mahout k-means

I started learning Hadoop only a few days ago. When I run the sample code from Mahout in Action on Hadoop, I get the following error:

  

Exception in thread "main" java.lang.InterruptedException: K-Means Iteration failed processing output/clusters-2
    at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:363)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:310)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:237)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:152)
    at mia.chapter09.KMeansExample.main(KMeansExample.java:85)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Code snippet:

Path path = new Path("testdata/clusters/part-00000");
SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
    path, Text.class, Cluster.class);

for (int i = 0; i < k; i++) {
  Vector vec = vectors.get(i);
  Cluster cluster = new Cluster(vec, i, new EuclideanDistanceMeasure());
  writer.append(new Text(cluster.getIdentifier()), cluster);
}
writer.close();

KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
  new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10,
  true, false);

SequenceFile.Reader reader = new SequenceFile.Reader(fs,
    new Path("output/" + Cluster.CLUSTERED_POINTS_DIR
             + "/part-m-00000"), conf);

IntWritable key = new IntWritable();
WeightedVectorWritable value = new WeightedVectorWritable();
while (reader.next(key, value)) {
  System.out.println(value.toString() + " belongs to cluster "
                     + key.toString());
}
reader.close();

1 Answer:

Answer 0 (score: 0)

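It would help if you specified which Mahout version you are using, along with other details such as whether you are running Hadoop 1.x or 2.x.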

If you are using Mahout 0.7 or an earlier version, I recommend switching to Mahout 0.9.
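
One way to gather the version details asked for above is to check which jars are actually on the job's classpath. The following is a minimal sketch, not part of the original answer: the class name `VersionReport` and the helper `describe` are hypothetical, and it assumes the Mahout and Hadoop jars declare an `Implementation-Version` in their manifests, which may not be the case. When the manifest has no version, the printed jar location usually still shows it in the file name.

```java
import java.net.URL;
import java.security.CodeSource;

public class VersionReport {

    /**
     * Reports the manifest version and jar location of a class,
     * or says that the class is not on the classpath.
     */
    static String describe(String className) {
        try {
            // Load without running static initializers; we only need metadata.
            Class<?> cls = Class.forName(className, false,
                    VersionReport.class.getClassLoader());

            // Implementation-Version from the jar manifest, if the jar declares one.
            Package pkg = cls.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();

            // Location of the jar (or directory) the class was loaded from.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            URL location = (src == null) ? null : src.getLocation();

            return className
                    + ": version=" + (version == null ? "unknown" : version)
                    + ", loaded from=" + (location == null ? "unknown" : location);
        } catch (ClassNotFoundException | LinkageError e) {
            return className + ": not on the classpath";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace and from Hadoop's utility package.
        System.out.println(describe("org.apache.mahout.clustering.kmeans.KMeansDriver"));
        System.out.println(describe("org.apache.hadoop.util.VersionInfo"));
    }
}
```

Running this through `hadoop jar`, the same way the failing example is launched, shows the versions the job really sees. On the command line, `hadoop version` also prints the Hadoop release.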