1. seqdirectory
mahout seqdirectory --input /user/hdfs/input/new1.csv --output /user/hdfs/new1/seqdirectory --tempDir /user/hdfs/new1/seqdirectory/tempDir
2. seq2sparse
mahout seq2sparse --input /user/hdfs/new1/seqdirectory --output /user/hdfs/new1/seq2sparse -wt tfidf
3. kmeans
mahout kmeans --input /user/hdfs/new1/seq2sparse/tfidf-vectors --output /user/hdfs/new1/kmeans -c /user/hdfs/new1/clusters/kmeans -x 3 -k 3 --tempDir /user/hdfs/new1/kmeans/tempDir
Then the following error occurs:
Failing Oozie Launcher, Main class [org.apache.mahout.driver.MahoutDriver], main() threw exception, No input clusters found in /user/oozie/mahout/new1/clusters/part-randomSeed. Check your -c argument.
java.lang.IllegalStateException: No input clusters found in /user/oozie/mahout/new1/clusters/part-randomSeed. Check your -c argument.
at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:217)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:148)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:107)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:467)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher ends
Why can't the kmeans driver create the clusters when it runs through Oozie on Hadoop? The same job works on Hadoop without Oozie.
Affects Version/s: MAHOUT 0.7, 0.8-SNAPSHOT
Answer 0 (score: 0)
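The -c /user/hdfs/new1/clusters/kmeans option tells kmeans where to find the initial clusters, so you need to make sure clusters are actually available at that path.
This may help: https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering. Please check the -c option described there.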
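One way to check the -c argument is to ask HDFS whether the path exists before launching kmeans. The following is only a minimal sketch: the helper name check_hdfs_path is made up for illustration, and it assumes the hadoop command is on the PATH of the user that runs the job.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout or Oozie): report whether an
# HDFS path exists. "hadoop fs -test -e PATH" exits with status 0 when
# PATH exists and non-zero otherwise.
check_hdfs_path() {
    if hadoop fs -test -e "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

For example, running check_hdfs_path /user/hdfs/new1/seq2sparse/tfidf-vectors and check_hdfs_path /user/oozie/mahout/new1/clusters as the user that the Oozie action runs as would show whether the input vectors and the clusters directory named in the error message are visible to that job. Note that the path in the error message is under /user/oozie/mahout, not /user/hdfs, so it is worth comparing the paths in the Oozie workflow with the ones used on the command line.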