I'm new to Mahout, and I have this code:
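    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.mahout.clustering.classify.WeightedVectorWritable;
    import org.apache.mahout.clustering.kmeans.KMeansDriver;
    import org.apache.mahout.clustering.kmeans.Kluster;
    import org.apache.mahout.common.HadoopUtil;
    import org.apache.mahout.common.distance.EuclideanDistanceMeasure;
    import org.apache.mahout.math.RandomAccessSparseVector;
    import org.apache.mahout.math.Vector;
    // ClusterHelper is assumed to be a helper class in the same package (its import is not shown).

    public class mahout {
        public static final double[][] points = { {1, 1}, {2, 1}, {1, 2}, {2, 2}, {3, 3}, {8, 8}, {9, 8}, {8, 9}, {9, 9} };

        // Convert the raw coordinate arrays into Mahout vectors.
        public static List<Vector> getPoints(double[][] raw) {
            List<Vector> points = new ArrayList<Vector>();
            for (int i = 0; i < raw.length; i++) {
                double[] fr = raw[i];
                Vector vec = new RandomAccessSparseVector(fr.length);
                vec.assign(fr);
                points.add(vec);
            }
            return points;
        }

        public static void main(String args[]) throws Exception {
            int k = 2;
            List<Vector> vectors = getPoints(points);

            // Create the local input directories if they do not exist yet.
            File testData = new File("testdata");
            if (!testData.exists()) {
                testData.mkdir();
            }
            testData = new File("testdata/points");
            if (!testData.exists()) {
                testData.mkdir();
            }

            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            ClusterHelper.writePointsToFile(vectors, conf, new Path("testdata/points/file1"));

            // Write the first k points as the initial cluster centers.
            Path path = new Path("testdata/clusters/part-00000");
            SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
                    path, Text.class, Kluster.class);
            for (int i = 0; i < k; i++) {
                Vector vec = vectors.get(i);
                Kluster cluster = new Kluster(vec, i, new EuclideanDistanceMeasure());
                writer.append(new Text(cluster.getIdentifier()), cluster);
            }
            writer.close();

            Path output = new Path("output");
            HadoopUtil.delete(conf, output);

            // The last argument (false) runs k-means as a MapReduce job; this is the call that fails.
            KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
                    output, new EuclideanDistanceMeasure(), 0.001, 10,
                    true, 0.0, false);

            // Read back the clustered points and print each point with its cluster id.
            SequenceFile.Reader reader = new SequenceFile.Reader(fs,
                    new Path("output/" + Kluster.CLUSTERED_POINTS_DIR
                            + "/part-m-00000"), conf);
            IntWritable key = new IntWritable();
            WeightedVectorWritable value = new WeightedVectorWritable();
            while (reader.next(key, value)) {
                System.out.println(value.toString() + " belongs to cluster "
                        + key.toString());
            }
            reader.close();
        }
    }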
But when I run the code, I get these errors:
24-ott-2013 9.50.25 org.apache.hadoop.util.NativeCodeLoader <clinit>
AVVERTENZA: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24-ott-2013 9.50.25 org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting output
24-ott-2013 9.50.25 org.slf4j.impl.JCLLoggerAdapter info
INFO: Input: testdata/points Clusters In: testdata/clusters Out: output Distance: org.apache.mahout.common.distance.EuclideanDistanceMeasure
24-ott-2013 9.50.25 org.slf4j.impl.JCLLoggerAdapter info
INFO: convergence: 0.0010 max Iterations: 10
24-ott-2013 9.50.25 org.apache.hadoop.security.UserGroupInformation doAs
GRAVE: PriviledgedActionException as:hp cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-hp\mapred\staging\hp1776229724\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-hp\mapred\staging\hp1776229724\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:918)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at org.apache.mahout.clustering.iterator.ClusterIterator.iterateMR(ClusterIterator.java:182)
at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:223)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:143)
at mahout.main(mahout.java:69)
Where is the problem, and how can I fix it?
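Answer 0 (score: 0)
The problem comes from running Hadoop on Windows. As the stack trace shows, job submission tries to set the staging directory to mode 0700 through the local file system, and that call fails on Windows.
There are JIRA issues for this specific problem:
https://issues.apache.org/jira/browse/HADOOP-7682
https://issues.apache.org/jira/browse/HADOOP-8089
The only workarounds are to patch Hadoop with this patch:
https://github.com/congainc/patch-hadoop_7682-1.0.x-win
or to upgrade to Hadoop 2.2, which runs natively on Windows.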
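
To see the failure outside of Hadoop, here is a minimal sketch that uses only `java.io.File`. It is loosely modeled on what the `FileUtil.setPermission` frame in the stack trace appears to do (clear each permission for everybody, then grant it to the owner, and treat a `false` return value as an error); it is not Hadoop's actual code. The class name `PermissionProbe` and both method names are made up for illustration.

```java
import java.io.File;
import java.util.Locale;

class PermissionProbe {

    /** Returns true when the given os.name value identifies Windows. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /**
     * Tries to restrict a directory to mode 0700 using only java.io.File:
     * each permission is first cleared for everybody, then granted to the owner.
     * Returns false if any of the calls reports failure.
     */
    static boolean trySetOwnerOnly(File dir) {
        boolean ok = dir.setReadable(false, false);
        ok &= dir.setReadable(true, true);
        ok &= dir.setWritable(false, false);
        ok &= dir.setWritable(true, true);
        ok &= dir.setExecutable(false, false);
        ok &= dir.setExecutable(true, true);
        return ok;
    }

    public static void main(String[] args) {
        // Use a scratch directory so nothing important has its permissions changed.
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "permission-probe-" + System.nanoTime());
        if (!dir.mkdirs()) {
            System.err.println("Could not create " + dir);
            return;
        }
        boolean ok = trySetOwnerOnly(dir);
        System.out.println("os.name = " + System.getProperty("os.name"));
        System.out.println("set " + dir + " to 0700: " + (ok ? "succeeded" : "FAILED"));
        dir.delete();
    }
}
```

If this prints FAILED on your machine, the user account is probably not the cause, and patching or upgrading Hadoop as described above is the way to go.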
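Answer 1 (score: -1)
It looks like the problem is
Failed to set permissions of path: \tmp\hadoop-hp\mapred\staging\hp1776229724\.staging to 0700
Check whether the user running the code has sufficient permissions on the directory mentioned in the stack trace.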
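
A quick way to run that check from Java is sketched below. It reports what the current user can do with a directory, falling back to the nearest existing parent when the directory itself has not been created yet. The class name `DirAccessCheck`, its methods, and the default path (built from the `\tmp\hadoop-hp` pattern in the error message) are assumptions made for illustration; pass your own path as the first argument. Keep in mind that it only reports current access; it does not test whether permissions can be changed.

```java
import java.io.File;

class DirAccessCheck {

    /** Returns the path itself if it exists, else its closest existing ancestor, else null. */
    static File nearestExisting(File path) {
        File current = path.getAbsoluteFile();
        while (current != null && !current.exists()) {
            current = current.getParentFile();
        }
        return current;
    }

    /** Describes what the current user may do with the nearest existing directory. */
    static String describe(File path) {
        File dir = nearestExisting(path);
        if (dir == null) {
            return path + ": no existing ancestor found";
        }
        return dir + " read=" + dir.canRead()
                + " write=" + dir.canWrite()
                + " execute=" + dir.canExecute();
    }

    public static void main(String[] args) {
        // Assumed default, following the directory pattern shown in the error message.
        String fallback = "/tmp/hadoop-" + System.getProperty("user.name") + "/mapred/staging";
        File target = new File(args.length > 0 ? args[0] : fallback);
        System.out.println("user.name = " + System.getProperty("user.name"));
        System.out.println(describe(target));
    }
}
```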
Also, the line
Unable to load native-hadoop library for your platform...
really makes me worry that nothing will run properly ^^