Hive MapReduce job submission fails with "Target is a directory"

Time: 2013-11-29 06:40:56

Tags: hadoop mapreduce hive hdfs yarn

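I have been playing with Hadoop and its sister projects and have run into a few problems along the way, but I have finally hit one I cannot find an answer to:

I have a Hive table stored on HDFS as a tab-delimited text file. I can run a basic select on the table, but as soon as I make the query more complex, Hive turns it into a MapReduce job, which fails with the stack trace below: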

  

13/11/29 08:31:00 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
13/11/29 08:31:00 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
    at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
    at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.IOException(Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory)'
13/11/29 08:31:00 ERROR exec.Task: Job Submission failed with exception 'java.io.IOException(Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory)'
java.io.IOException: Target /tmp/hadoop-yarn/staging/hduser/.staging/job_1385633903169_0013/libjars/lib/lib is a directory
    at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
    at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

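The folder in question does exist on DFS, at least the "/tmp/hadoop-yarn/staging" part, and no matter what permissions I set on it, Hive or Hadoop resets them at job submission. The really puzzling part is that the full path appears to be a generated folder name, so why does the software have a problem with something it generated itself? Why is it a problem that the path is a directory? What should it be instead?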
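
One possible reading of the message, offered as an assumption rather than a confirmed diagnosis: the two consecutive FileUtil.checkDest frames in the trace suggest the check ran twice, first on the destination and then on the destination with the source name appended. The Python sketch below is a simplified model of that kind of check, not Hadoop's code; check_dest and the in-memory sets standing in for the file system are hypothetical.

```python
import posixpath


def check_dest(src_name, existing_dirs, existing_files, dst, overwrite=False):
    """Return the path a copy would write to, or raise if it cannot be used.

    Hypothetical model: existing_dirs and existing_files are plain sets of
    path strings that stand in for a real file system.
    """
    if dst in existing_dirs:
        if src_name is None:
            # We already appended the source name once and the result is
            # still a directory, so there is no file path left to write to.
            raise IOError(f"Target {dst} is a directory")
        # Destination is a directory: retry with the source name appended.
        return check_dest(None, existing_dirs, existing_files,
                          posixpath.join(dst, src_name), overwrite)
    if dst in existing_files and not overwrite:
        raise IOError(f"Target {dst} already exists")
    return dst
```

Under this model, copying a source named lib into a destination where both lib and lib/lib already exist as directories raises an error of the same form as the one in the trace, while copying a single file into an existing directory succeeds.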

Edit: Here are the table I am using and the query I am trying to run.

Query: select * from hive_flow_details where node_id = 100 limit 10;

Table:

  

col_name          data_type  comment
id                bigint     None
flow_versions_id  int        None
node_id           int        None
node_name         string     None

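Keep in mind that this happens with any kind of where clause I try, because Hive turns the query into an MR job.

1 Answer:

Answer 0: (score: 0)

I eventually solved the problem. I found conflicting jars on my classpath; I cleaned them up, and I have not had any problems since.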
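
The answer does not say which jars conflicted or how they were found. As a rough sketch of one way to look for candidates, the Python below groups jar files by artifact name and reports any name that appears with more than one version. The helper names (split_jar_name, expand_classpath, find_conflicts) and the "name-version.jar" naming assumption are illustrative, not part of Hive or Hadoop.

```python
import glob
import os
import re
from collections import defaultdict

# Assumes jars are named "artifact-version.jar", where the version starts
# with a digit (for example "guava-11.0.2.jar").
_JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def split_jar_name(filename):
    """Split a jar file name into (artifact, version); version may be ''."""
    match = _JAR_RE.match(filename)
    if match:
        return match.group("name"), match.group("version")
    if filename.endswith(".jar"):
        return filename[:-4], ""
    return None


def expand_classpath(classpath):
    """List jar paths named by a classpath string, expanding 'dir/*' entries."""
    jars = []
    for entry in classpath.split(os.pathsep):
        if entry.endswith("*"):
            jars.extend(sorted(glob.glob(entry + ".jar")))
        elif entry.endswith(".jar"):
            jars.append(entry)
    return jars


def find_conflicts(jar_paths):
    """Map artifact -> {version: [paths]} for artifacts seen in 2+ versions."""
    seen = defaultdict(lambda: defaultdict(list))
    for path in jar_paths:
        parsed = split_jar_name(os.path.basename(path))
        if parsed is not None:
            name, version = parsed
            seen[name][version].append(path)
    return {name: dict(v) for name, v in seen.items() if len(v) > 1}
```

A typical use would be to pass the classpath string (for instance the output of the hadoop classpath command) through expand_classpath and then find_conflicts, and inspect any artifact that is reported with several versions. Identical versions in different directories are not flagged by this sketch.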