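I am trying to run the naive Bayes algorithm on EMR with 1 master (small) and 1 slave (small) node. I completed the seqdirectory, seq2sparse, and split steps successfully, but I get an error during the training phase. I use the following command to train the algorithm: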
./elastic-mapreduce --jar s3n://<bucket name>/mahout/mahout-examples-0.7-job.jar \
--main-class org.apache.mahout.driver.MahoutDriver \
--logs \
--arg trainnb \
--arg -i --arg /<folder name>/mahout/review-train-vectors/ --arg -el \
--arg -o --arg /<folder name>/mahout/model/ \
--arg -li --arg /<folder name>/mahout/labelindex/ \
--arg -ow \
-j <job-name>
Here is the log of the job step:
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_0: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_1: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_2: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
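Has anyone tried this before? Please help me solve this problem. I hit the same error when I run this algorithm on my local system in Hadoop pseudo-distributed mode. The algorithm only works with the environment variable MAHOUT_LOCAL = True.
Answer 0 (score: 1)
There is a problem with the arguments of the command. It looks like you copied and pasted the command without adapting it to your environment: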
--jar s3n://<bucket name>/mahout/mahout-examples-0.7-job.jar
What is the bucket name?
--arg -i --arg /<folder name>/mahout/review-train-vectors/
<folder name>
looks like a placeholder that you should replace with the value for your own setup
-j <job-name>
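Same mistake. You do not seem to be an experienced Linux user, so also pay attention to the \ character at the end of each line: it should be left out when the command is typed on a single line. Most likely it was added on the web page you took the command from to make it more readable. (Are you sure it is one command, and not several commands on several lines? :))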
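To illustrate both points, here is a minimal sketch of how the command could be assembled with the placeholders filled in and without any trailing \ characters. The values my-bucket, my-folder, and j-EXAMPLE are hypothetical and stand in for your own bucket, folder, and the value you pass to -j; the options themselves are copied from the question. The script only prints the resulting command so it can be checked before it is run.

```shell
#!/bin/sh
# Hypothetical values: replace them with your own bucket, folder, and -j value.
BUCKET="my-bucket"
FOLDER="my-folder"
JOB="j-EXAMPLE"

# Build the argument list one piece at a time, so no line needs a trailing backslash.
set -- ./elastic-mapreduce
set -- "$@" --jar "s3n://${BUCKET}/mahout/mahout-examples-0.7-job.jar"
set -- "$@" --main-class org.apache.mahout.driver.MahoutDriver
set -- "$@" --logs
set -- "$@" --arg trainnb
set -- "$@" --arg -i --arg "/${FOLDER}/mahout/review-train-vectors/" --arg -el
set -- "$@" --arg -o --arg "/${FOLDER}/mahout/model/"
set -- "$@" --arg -li --arg "/${FOLDER}/mahout/labelindex/"
set -- "$@" --arg -ow
set -- "$@" -j "${JOB}"

# Print the assembled command for inspection instead of executing it.
printf '%s ' "$@"
echo
```

Once the printed line looks right (no <...> placeholders left, every option separated by a space), replacing the final printf and echo with "$@" would run it.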