Running hadoop with HIPI .. -libjars

Asked: 2012-05-05 17:16:42

Tags: java hadoop cloudera

I'm new to Java and am trying to run an MR job that uses HIPI: http://hipi.cs.virginia.edu/ I used the command as described here: http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html I'm using hadoop 0.20.2.

My command is: hadoop jar grayscalefromfile_exc.jar grayscalefromfile_exc.StubDriver -libjars hipi-0.0.1.jar imgs imgsOut1

The directory layout is:

 --
   --grayscalefromfile_exc.jar
   --hipi-0.0.1.jar

The error I get:
Exception in thread "main" java.lang.NoClassDefFoundError: hipi/imagebundle/mapreduce/ImageBundleInputFormat
    at grayscalefromfile_exc.StubDriver.run(StubDriver.java:89)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at grayscalefromfile_exc.StubDriver.main(StubDriver.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.lang.ClassNotFoundException: hipi.imagebundle.mapreduce.ImageBundleInputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 9 more

Needless to say, hipi-0.0.1.jar contains the path hipi/imagebundle/mapreduce/ImageBundleInputFormat.java
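One way to double-check what the jar really contains is to look the entry up programmatically. The following is only a sketch: `JarEntryCheck` and its methods are hypothetical names (not part of Hadoop or HIPI), and it assumes the JVM needs the compiled entry `hipi/imagebundle/mapreduce/ImageBundleInputFormat.class` rather than a `.java` source file.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Hadoop or HIPI.
class JarEntryCheck {

    // Converts a binary class name such as "a.b.C" to its jar entry name "a/b/C.class".
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath holds a compiled entry for className.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults are taken from the question; pass other values as arguments.
        String jarPath = args.length > 0 ? args[0] : "hipi-0.0.1.jar";
        String className = args.length > 1
                ? args[1]
                : "hipi.imagebundle.mapreduce.ImageBundleInputFormat";
        System.out.println(className + " in " + jarPath + ": "
                + containsClass(jarPath, className));
    }
}
```

If this reports false, the jar would not help regardless of which classpath it is placed on; if it reports true, the problem lies in where the jar is made visible.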

Thanks

2 answers:

Answer 0: (score: 1)

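-libjars uploads the given jars to the cluster and then makes them available on the classpath of each mapper / reducer instance.

If you want to add additional jars to the classpath of the driver on the client machine, you need to use the HADOOP_CLASSPATH env variable: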

#> export HADOOP_CLASSPATH=hipi-0.0.1.jar
#> hadoop jar grayscalefromfile_exc.jar grayscalefromfile_exc.StubDriver -libjars hipi-0.0.1.jar imgs imgsOut1

The output when I run it (the error is because I don't have a hipi image bundle file):

cswhite@Studio-1555:~/workspace/sandbox/so-hipi/target$ export HADOOP_CLASSPATH=/home/cswhite/Downloads/hipi-0.0.1.jar
cswhite@Studio-1555:~/workspace/sandbox/so-hipi/target$ echo $HADOOP_CLASSPATH
/home/cswhite/Downloads/hipi-0.0.1.jar
cswhite@Studio-1555:~/workspace/sandbox/so-hipi/target$ hadoop jar so-hipi-0.0.1-SNAPSHOT.jar StubDriver -libjars ~/Downloads/hipi-0.0.1.jar images output
num of args: 2:images,output
****hdfs://localhost:9000/user/cswhite/images
12/05/14 14:06:34 INFO input.FileInputFormat: Total input paths to process : 1
12/05/14 14:06:34 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-hadoop/mapred/staging/cswhite/.staging/job_201205141351_0003
12/05/14 14:06:34 ERROR security.UserGroupInformation: PriviledgedActionException as:cswhite cause:java.io.IOException: not a hipi image bundle
Exception in thread "main" java.io.IOException: not a hipi image bundle
    at hipi.imagebundle.HipiImageBundle.readBundleHeader(HipiImageBundle.java:322)
    at hipi.imagebundle.HipiImageBundle.openForRead(HipiImageBundle.java:388)
    at hipi.imagebundle.AbstractImageBundle.open(AbstractImageBundle.java:82)
    at hipi.imagebundle.AbstractImageBundle.open(AbstractImageBundle.java:55)
    at hipi.imagebundle.mapreduce.ImageBundleInputFormat.getSplits(ImageBundleInputFormat.java:61)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
    at StubDriver.run(StubDriver.java:53)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at StubDriver.main(StubDriver.java:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
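
To see whether the driver-side classpath has actually picked up the jar, a small probe can be run in the client JVM. This is a minimal sketch using only the standard library; `ClasspathProbe` and `isVisible` are hypothetical names, not part of Hadoop or HIPI.

```java
// Hypothetical helper for illustration; not part of Hadoop or HIPI.
class ClasspathProbe {

    // Returns true if className can be located by the context class loader
    // (falling back to the loader that loaded this class).
    static boolean isVisible(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathProbe.class.getClassLoader();
        }
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default class name is the one from the stack trace in the question.
        String name = args.length > 0
                ? args[0]
                : "hipi.imagebundle.mapreduce.ImageBundleInputFormat";
        System.out.println(name + " visible: " + isVisible(name));
    }
}
```

Calling `ClasspathProbe.isVisible(...)` at the start of the driver's `run` method, before the job is configured, should show whether the HADOOP_CLASSPATH setting reached the JVM that hadoop jar starts.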

Answer 1: (score: 0)

I was able to solve a similar problem by using the following API in the main class:

DistributedCache.addFileToClassPath(new Path("/path/application.jar"), conf);

The jar must exist at the HDFS path /path/application.jar.