Running the Hadoop wordcount example

Time: 2018-03-02 00:00:20

Tags: java hadoop hdfs text-mining word-count

I used this command to run the wordcount example in Hadoop.

hadoop jar /usr/local/Cellar/hadoop/3.0.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount inputWiki/Wiki_data_100MB outputWiki0301

I got an error message like the following.

2018-03-01 18:54:14,845 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-03-01 18:54:16,107 INFO beanutils.FluentPropertyBeanIntrospector: Error when creating PropertyDescriptor for public final void org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
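Note that the two messages above are a WARN and an INFO log, not fatal errors; by themselves they do not stop a job. A quick sanity check (this sketch assumes a single-node setup where the HDFS daemons run locally) is to confirm the daemons are up and the input path actually exists:

```shell
# List running JVMs; NameNode and DataNode should appear for a working HDFS.
jps

# Relative HDFS paths (no leading "/") resolve under /user/<username>,
# so this checks /user/xujingjing/inputWiki for the local user xujingjing.
hdfs dfs -ls inputWiki/
```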

I have used that command before to run a similar file and it worked fine. Can anyone help me with this?

Update with the following result:

pal-nat186-66-224:bin xujingjing $ hadoop jar /usr/local/Cellar/hadoop/3.0.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount inputGurtenberg0302/gurtenberg.txt outputGurtenberg0302
2018-03-02 17:23:58,961 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-03-02 17:24:00,164 INFO beanutils.FluentPropertyBeanIntrospector: Error when creating PropertyDescriptor for public final void org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
2018-03-02 17:24:00,226 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-03-02 17:24:00,396 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2018-03-02 17:24:00,397 INFO impl.MetricsSystemImpl: JobTracker metrics system started
2018-03-02 17:24:00,781 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop/mapred/staging/xujingjing1314852612/.staging/job_local1314852612_0001
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:8020/user/xujingjing/inputGurtenberg0302/gurtenberg.txt
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:330)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:272)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:394)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:313)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:330)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:203)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
pal-nat186-66-224:bin xujingjing $

1 Answer:

Answer 0: (score: 0)

The error is this:

Input path does not exist: hdfs://localhost:8020/user/xujingjing/inputGurtenberg0302/gurtenberg.txt

So use:

hdfs dfs -mkdir -p /user/xujingjing/inputGurtenberg0302/
hdfs dfs -copyFromLocal \
   /path/to/gurtenberg.txt \
   /user/xujingjing/inputGurtenberg0302/
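Once the file is copied, a quick check (using the same paths as above) confirms the job's input now exists:

```shell
# List the HDFS directory; gurtenberg.txt should appear in the listing.
hdfs dfs -ls /user/xujingjing/inputGurtenberg0302/

# Equivalent relative form: paths without a leading "/" resolve
# under /user/<username>, i.e. /user/xujingjing here.
hdfs dfs -ls inputGurtenberg0302/
```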
  

  I have used that command before to run a similar file and it worked fine

The line in your initial command uses a completely different file (inputWiki/Wiki_data_100MB, not inputGurtenberg0302/gurtenberg.txt).
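Putting it together, a rerun after uploading the input might look like the following sketch; the jar path and input path are taken from the question, while the output directory name is hypothetical and must not already exist, since MapReduce refuses to overwrite an existing output directory.

```shell
# Paths from the question; OUTPUT is a fresh, non-existent directory.
JAR=/usr/local/Cellar/hadoop/3.0.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar
INPUT=/user/xujingjing/inputGurtenberg0302/gurtenberg.txt
OUTPUT=/user/xujingjing/outputGurtenberg0302

hadoop jar "$JAR" wordcount "$INPUT" "$OUTPUT"

# Inspect the word counts once the job finishes.
hdfs dfs -cat "$OUTPUT/part-r-00000" | head
```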