FileNotFoundException when querying a single column of S3 data from a Hive table

Date: 2013-03-11 09:29:11

Tags: hive

I created a table in Hive using the following statement. My input data is in S3 (s3n://test/hiveTest/01/):

CREATE external TABLE tests3(firstName STRING, lastName STRING) ROW FORMAT
DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
location 's3n://test/hiveTest/01/';
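
For reference, the table's effective storage location and input format can be checked from the Hive CLI. This is a minimal sketch using the tests3 table defined above; the exact output values depend on the metastore:

-- Inspect the table metadata; the Location field should show the s3n:// path
-- and InputFormat typically reports org.apache.hadoop.mapred.TextInputFormat
DESCRIBE FORMATTED tests3;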

When I issue the following command, I can see the resulting data:

hive> select * from tests3;
OK
first   second
third   fourth
Time taken: 1.647 seconds

But when I select a specific column from the table, I get the following error:

hive> select firstName from tests3;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.FileNotFoundException: File does not exist: /tests3/hiveTest/01/abc.txt
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:462)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:387)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:353)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /test/hiveTest/01/abc.txt)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

1 Answer:

Answer 0 (score: 4)

Please try setting the parameter below before running your query:

SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
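For example, a full session could look like this (a minimal sketch assuming the tests3 table from the question; the SET applies only to the current Hive session):

-- Disable split combining for this session, then re-run the column query
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
SELECT firstName FROM tests3;

This works around the error because the default CombineHiveInputFormat combines input splits via CombineFileInputFormat, and in the stack trace above that code path looks the file up on HDFS (DistributedFileSystem, /tests3/hiveTest/01/abc.txt) rather than at the s3n:// location; plain HiveInputFormat skips the split-combining step.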