Hello, I am new to Apache Mahout. I get an error when running "classify-20newsgroups.sh"; this example fetches its dataset from the internet automatically.
Error trace:
hduser@raj-Lenovo-G550:/usr/local/mahout/examples$ bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-hduser
Enter your choice : 3
ok. You chose 3 and we'll use sgd
creating work directory at /tmp/mahout-work-hduser
Downloading 20news-bydate
bin/classify-20newsgroups.sh: line 68: curl: command not found
Extracting...
tar (child): ../20news-bydate.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Training on /tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/local/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=/usr/local/hadoop-1.2.1/conf
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.9-job.jar
14/08/06 14:07:53 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Any help from the community would be appreciated.
Edit: I installed curl with sudo apt-get install curl and ran the script again, but got:
hduser@raj-Lenovo-G550:/usr/local/mahout/examples$ bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-hduser
Enter your choice : 3
ok. You chose 3 and we'll use sgd
creating work directory at /tmp/mahout-work-hduser
Training on /tmp/mahout-work-hduser/20news-bydate/20news-bydate-train/
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/local/hadoop-1.2.1/bin/hadoop and HADOOP_CONF_DIR=/usr/local/hadoop-1.2.1/conf/
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.9-job.jar
14/08/06 17:06:41 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
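Answer 0 (score: 1)
The problem here is that the 20newsgroups corpus cannot be downloaded with the curl command, because curl is not found on the operating system. See this line of the error output: bin/classify-20newsgroups.sh: line 68: curl: command not found.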
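A quick way to confirm this before re-running the example is a small pre-flight check. The sketch below is not part of Mahout; the function names (have_cmd, dir_has_files, preflight) are made up for illustration, and the training path is taken from the "Training on ..." line in the log. It also checks that the training directory actually contains files: the second log in the question shows no "Downloading" line, which suggests (an assumption, not something the log confirms) that the download was skipped because the work directory was left over from the failed first run. A NullPointerException in TrainNewsGroups would be consistent with a missing or empty training directory.

```shell
#!/bin/sh
# Pre-flight checks before running bin/classify-20newsgroups.sh.
# Hypothetical helpers for illustration; none of this ships with Mahout.

# Succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the directory exists and contains at least one entry.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Prints one status line per check; returns non-zero if any check fails.
# $1 is the work directory, e.g. /tmp/mahout-work-hduser from the log.
preflight() {
    work_dir=$1
    status=0
    if have_cmd curl; then
        echo "curl: found"
    else
        echo "curl: missing (on Debian/Ubuntu: sudo apt-get install curl)"
        status=1
    fi
    if dir_has_files "$work_dir/20news-bydate/20news-bydate-train"; then
        echo "training data: present"
    else
        echo "training data: missing or empty"
        status=1
    fi
    return $status
}
```

After sourcing these functions, calling preflight /tmp/mahout-work-hduser reports both checks. If curl is found but the training data is missing or empty, choosing option 4 (clean) from the script's menu and then running the sgd task again should let the script attempt the download from scratch.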