Hive on Hadoop: error when selecting data from a table

Time: 2016-05-29 22:20:32

Tags: hadoop hive hadoop-streaming hadoop-partitioning flume-twitter

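After creating an external table in Hive, I wanted to count the tweets, so I ran the query shown below, but it failed with the error that follows. How can I solve this? Here is my mapred-site.xml configuration: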

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:8021</value>
      </property>
    </configuration>

hive> select count(*) from tweet;        
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1464556774961_0005, Tracking URL = http://ubuntu:8088/proxy/application_1464556774961_0005/
Kill Command = /usr/local/hadoop/bin/hadoop job  -Dmapred.job.tracker=localhost:8021 -kill job_1464556774961_0005
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2016-05-29 15:14:24,207 Stage-1 map = 0%,  reduce = 0%
2016-05-29 15:14:30,496 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:31,532 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:32,558 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:33,592 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:34,625 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:35,649 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:36,676 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:37,697 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:38,720 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:39,745 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.91 sec
2016-05-29 15:14:40,776 Stage-1 map = 50%,  reduce = 17%, Cumulative CPU 2.14 sec
2016-05-29 15:14:41,804 Stage-1 map = 50%,  reduce = 17%, Cumulative CPU 2.14 sec
2016-05-29 15:14:42,823 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 1.91 sec
2016-05-29 15:14:43,847 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 1.91 sec
MapReduce Total cumulative CPU time: 1 seconds 910 msec
Ended Job = job_1464556774961_0005 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1464556774961_0005_m_000000 (and more) from job job_1464556774961_0005
Exception in thread "Thread-128" java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://ubuntu:13562/tasklog?taskid=attempt_1464556774961_0005_m_000000_3&start=-8193
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1840)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
    at java.net.URL.openStream(URL.java:1045)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
    ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   Cumulative CPU: 1.91 sec   HDFS Read: 277 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 1 seconds 910 msec
hive> 

1 Answer:

Answer 0 (score: 1)

This happens because the hive-storage-api module is missing from the hive-exec jar. Update Hive to the latest version to pick up the most recent Hive patches.
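To check whether a particular hive-exec jar actually bundles the storage API classes, one option is to look inside the archive. The sketch below is only illustrative: the helper name `jar_has_class` is hypothetical, the jar path in the usage comment is a placeholder, and the class name shown there is an assumption about what the storage API provides. Substitute whichever class the real task log reports as missing.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar (a zip archive) contains the given class.

    class_name is a fully qualified Java class name, for example
    "org.example.Foo"; it is mapped to the entry "org/example/Foo.class".
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Usage (path and class name are assumptions, adjust to your install):
#   jar_has_class("/path/to/hive-exec.jar",
#                 "org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch")
```

If the function returns False for a class that the failing task needs, that is consistent with the explanation above; if it returns True, the cause is probably elsewhere.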

A temporary fix is to add the storage API jar explicitly:

add jar ./dist/hive/lib/hive-storage-api-2.0.0-SNAPSHOT.jar;
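Note that the stack trace in the question comes from Hive's JobDebugger failing to download the task log (HTTP 400), so it does not show the underlying task error. A minimal sketch for confirming the diagnosis, assuming the `yarn` CLI is on the PATH and YARN log aggregation is enabled: pull the application ID and the failed attempt IDs out of the Hive console output and build the matching `yarn logs` command. The function names here are hypothetical; the sample lines are copied from the output in the question.

```python
import re

# ID formats as they appear in the Hive console output above.
APP_RE = re.compile(r"application_\d+_\d+")
ATTEMPT_RE = re.compile(r"attempt_\d+_\d+_[mr]_\d+_\d+")


def yarn_logs_command(console_output):
    """Return the `yarn logs` argv for the first application ID found, or None."""
    match = APP_RE.search(console_output)
    if match is None:
        return None
    return ["yarn", "logs", "-applicationId", match.group(0)]


def failed_attempts(console_output):
    """Return the distinct task attempt IDs mentioned in the output, sorted."""
    return sorted(set(ATTEMPT_RE.findall(console_output)))


sample = (
    "Starting Job = job_1464556774961_0005, Tracking URL = "
    "http://ubuntu:8088/proxy/application_1464556774961_0005/\n"
    "Caused by: java.io.IOException: Server returned HTTP response code: 400 "
    "for URL: http://ubuntu:13562/tasklog?"
    "taskid=attempt_1464556774961_0005_m_000000_3&start=-8193\n"
)

print(" ".join(yarn_logs_command(sample)))
# prints: yarn logs -applicationId application_1464556774961_0005
print(failed_attempts(sample))
# prints: ['attempt_1464556774961_0005_m_000000_3']
```

Running the printed command in a shell and searching its output for the attempt ID should surface the exception thrown by the mapper, for example a ClassNotFoundException naming the missing class.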