Word count output shows mapred instead of mapreduce

Time: 2013-09-23 23:19:06

Tags: java eclipse hadoop mapreduce

I have just configured my Ubuntu 13.10 machine to run in pseudo-distributed mode for MapReduce development. I installed Hadoop 0.20.2. Everything went fine, and I can start all five daemons.

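On the same machine I downloaded Eclipse and added all the Hadoop libraries to it. I can run my MapReduce word count example directly from the Eclipse IDE. The only thing that bothers me is that when I run the word count example, it prints something like this to the console: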

13/09/23 16:11:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your  
platform... using builtin-java classes where applicable
13/09/23 16:11:05 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See    
JobConf(Class) or JobConf#setJar(String).
13/09/23 16:11:05 INFO input.FileInputFormat: Total input paths to process : 1
13/09/23 16:11:06 INFO mapred.JobClient: Running job: job_local_0001
13/09/23 16:11:06 INFO util.ProcessTree: setsid exited with exit code 0
13/09/23 16:11:06 INFO mapred.Task:  Using ResourceCalculatorPlugin : 
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@c931fc
13/09/23 16:11:06 INFO mapred.MapTask: io.sort.mb = 100
13/09/23 16:11:07 INFO mapred.JobClient:  map 0% reduce 0%
13/09/23 16:11:07 INFO mapred.MapTask: data buffer = 79691776/99614720
13/09/23 16:11:07 INFO mapred.MapTask: record buffer = 262144/327680
13/09/23 16:11:08 INFO mapred.MapTask: Starting flush of map output
13/09/23 16:11:08 INFO mapred.MapTask: Finished spill 0
13/09/23 16:11:08 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the 
process of commiting
13/09/23 16:11:09 INFO mapred.LocalJobRunner: 
13/09/23 16:11:09 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
13/09/23 16:11:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : 
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1342ba4
13/09/23 16:11:09 INFO mapred.LocalJobRunner: 
13/09/23 16:11:09 INFO mapred.Merger: Merging 1 sorted segments
13/09/23 16:11:10 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total 
size: 48 bytes
13/09/23 16:11:10 INFO mapred.LocalJobRunner: 
13/09/23 16:11:10 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the 
process of commiting
13/09/23 16:11:10 INFO mapred.LocalJobRunner: 
13/09/23 16:11:10 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
13/09/23 16:11:10 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0001_r_000000_0' to outputWords
13/09/23 16:11:10 INFO mapred.JobClient:  map 100% reduce 0%
13/09/23 16:11:12 INFO mapred.LocalJobRunner: reduce > reduce
13/09/23 16:11:12 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
13/09/23 16:11:12 WARN mapred.LocalJobRunner: job_local_0001

java.lang.NoClassDefFoundError: org/apache/commons/httpclient/HttpMethod
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:284)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.httpclient.HttpMethod
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 1 more
 Exception in thread "Thread-1" java.lang.NoClassDefFoundError:    
org/apache/commons/httpclient/HttpMethod
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:300)
 Caused by: java.lang.ClassNotFoundException: org.apache.commons.httpclient.HttpMethod
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 1 more
13/09/23 16:11:13 INFO mapred.JobClient:  map 100% reduce 100%
13/09/23 16:11:13 INFO mapred.JobClient: Job complete: job_local_0001
13/09/23 16:11:13 INFO mapred.JobClient: Counters: 20
13/09/23 16:11:13 INFO mapred.JobClient:   File Output Format Counters 
13/09/23 16:11:13 INFO mapred.JobClient:     Bytes Written=42
13/09/23 16:11:13 INFO mapred.JobClient:   FileSystemCounters
13/09/23 16:11:13 INFO mapred.JobClient:     FILE_BYTES_READ=534
13/09/23 16:11:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=63640
13/09/23 16:11:13 INFO mapred.JobClient:   File Input Format Counters 
13/09/23 16:11:13 INFO mapred.JobClient:     Bytes Read=63
13/09/23 16:11:13 INFO mapred.JobClient:   Map-Reduce Framework
13/09/23 16:11:13 INFO mapred.JobClient:     Map output materialized bytes=52
13/09/23 16:11:13 INFO mapred.JobClient:     Map input records=4
13/09/23 16:11:13 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/09/23 16:11:13 INFO mapred.JobClient:     Spilled Records=8
13/09/23 16:11:13 INFO mapred.JobClient:     Map output bytes=110
13/09/23 16:11:13 INFO mapred.JobClient:     Total committed heap usage (bytes)=231350272
13/09/23 16:11:13 INFO mapred.JobClient:     CPU time spent (ms)=0
13/09/23 16:11:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=124
13/09/23 16:11:13 INFO mapred.JobClient:     Combine input records=12
13/09/23 16:11:13 INFO mapred.JobClient:     Reduce input records=4
13/09/23 16:11:13 INFO mapred.JobClient:     Reduce input groups=4
13/09/23 16:11:13 INFO mapred.JobClient:     Combine output records=4
13/09/23 16:11:13 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/09/23 16:11:13 INFO mapred.JobClient:     Reduce output records=4
13/09/23 16:11:13 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/09/23 16:11:13 INFO mapred.JobClient:     Map output records=12

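A few things in the output above make me unsure whether everything is working correctly:

  1. It prints mapred.JobClient. mapred is Hadoop's old library, so how can I make it use mapreduce? (I have already added the latest libraries to Eclipse and still get the same mapred messages.)
  2. Why does this error occur: java.lang.NoClassDefFoundError?
  3. The output directory is nevertheless generated correctly.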

Please let me know if any other details are needed.

Hoping for an answer.

Happy hadooping!!!

1 answer:

Answer 0: (score: 1)

You are getting:

Exception in thread "Thread-1" java.lang.NoClassDefFoundError:    
org/apache/commons/httpclient/HttpMethod

because some dependent jars are not on your classpath.
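To confirm this from inside Eclipse, a small probe like the one below can help. It is only a sketch: `ClasspathCheck` and `isOnClasspath` are names made up for illustration and are not part of Hadoop. It asks the class loader for the class named in the stack trace and prints the effective classpath.

```java
public class ClasspathCheck {
    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but one of its own dependencies is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace above
        String name = args.length > 0 ? args[0] : "org.apache.commons.httpclient.HttpMethod";
        System.out.println(name + (isOnClasspath(name) ? " is" : " is NOT") + " on the classpath");
        // The entries the JVM was actually started with
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run it with the same Eclipse run configuration as the word count job, so that both see the same classpath.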

Try including the following jars in the lib/ directory and run it again:

commons-httpclient-3.1.jar
commons-cli-1.2.jar
commons-logging-1.0.4.jar
commons-logging-api-1.0.4.jar
log4j-1.2.15.jar
jackson-core-asl-1.5.2.jar
jackson-mapper-asl-1.5.2.jar

If including these does not work, include all the jars from the lib/ directory.
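If you want to check quickly which of the listed jars are absent, something along these lines should work. It is a rough sketch: `MissingJars` and `missingFrom` are hypothetical helper names, and the default directory `lib` is an assumption about your layout. It only looks at file names in a directory; in Eclipse the jars still have to be added to the project's build path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MissingJars {
    // Jar names taken from the list above
    public static final List<String> REQUIRED = Arrays.asList(
        "commons-httpclient-3.1.jar",
        "commons-cli-1.2.jar",
        "commons-logging-1.0.4.jar",
        "commons-logging-api-1.0.4.jar",
        "log4j-1.2.15.jar",
        "jackson-core-asl-1.5.2.jar",
        "jackson-mapper-asl-1.5.2.jar");

    /** Returns the required jar names that are not present in libDir. */
    public static List<String> missingFrom(Path libDir, List<String> required) throws IOException {
        Set<String> present = new HashSet<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                present.add(jar.getFileName().toString());
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!present.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // "lib" is an assumed default; pass your Hadoop lib directory as the first argument
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        List<String> missing = missingFrom(libDir, REQUIRED);
        System.out.println(missing.isEmpty() ? "All listed jars are present" : "Missing: " + missing);
    }
}
```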

Also, neither of the Hadoop APIs (mapred or mapreduce) is deprecated, and both of them refer to mapred.JobClient.
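As for why the console says mapred.JobClient at all: the prefix on each log line is the name of the logger that wrote it, which in Hadoop is the class name. Assuming your console uses the `%c{2}` conversion pattern from Hadoop's stock log4j.properties, only the last two components of that name are printed. The sketch below imitates that shortening; `LoggerName.abbreviate` is a made-up helper, not a log4j API.

```java
public class LoggerName {
    /** Keeps the last `parts` dot-separated components of a logger name, like log4j's %c{n}. */
    public static String abbreviate(String loggerName, int parts) {
        int idx = loggerName.length();
        for (int i = 0; i < parts; i++) {
            // Walk backwards one dot at a time
            idx = loggerName.lastIndexOf('.', idx - 1);
            if (idx < 0) {
                return loggerName; // fewer components than requested: keep the whole name
            }
        }
        return loggerName.substring(idx + 1);
    }

    public static void main(String[] args) {
        // prints "mapred.JobClient"
        System.out.println(abbreviate("org.apache.hadoop.mapred.JobClient", 2));
        // prints "input.FileInputFormat"
        System.out.println(abbreviate("org.apache.hadoop.mapreduce.lib.input.FileInputFormat", 2));
    }
}
```

So the `input.FileInputFormat` line in your log already comes from the mapreduce package, while the `mapred.JobClient` lines only tell you which class logged the message, not which API your job was written against.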