Error when querying a Hive table with "Order by" or "Group by"

Date: 2018-01-20 00:38:08

Tags: java hive

I recently installed DbVisualizer to query Hive tables. I installed it on my Mac, then downloaded and installed the Hive JDBC jar file from this site: https://s3.amazonaws.com/public-repo-1.hortonworks.com/HDP/hive-jdbc4/1.0.42.1054/Simba_HiveJDBC41_1.0.42.1054.zip

When I connect to our database and test queries, a simple select works:

select *
from table_name
limit 10

But when I add an 'order by' or 'group by':

select *
from table_name
order by rollingtime
limit 10

I get the following error and I don't know why. Has anyone seen a similar error and knows how to fix it?

09:56:17 START Executing for: 'NewDev' [Hive], Database: Hive, Schema: sdc

09:56:17 FAILED [SELECT - 0 rows, 0.504 secs] [Code: 500051, SQL State: HY000] [Simba][HiveJDBCDriver](500051) ERROR processing query/statement. Error Code: 2, SQL state: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1516123265840_0008_8_00, diagnostics=[Task failed, taskId=task_1516123265840_0008_8_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.NoClassDefFoundError: Could not initialize class org.apache.tez.runtime.library.api.TezRuntimeConfiguration

 at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:107)

 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:186)

 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)

 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)

 at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)

 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)

 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)

 at java.security.AccessController.doPrivileged(Native Method)

 at javax.security.auth.Subject.doAs(Subject.java:422)

 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)

 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)

 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)

 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)

 at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)

 at java.util.concurrent.FutureTask.run(FutureTask.java:266)

 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

, errorMessage=Cannot recover from this error:java.lang.NoClassDefFoundError: Could not initialize class org.apache.tez.runtime.library.api.TezRuntimeConfiguration

 at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:107)

 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:186)

 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)

 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)

 at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)

 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)

 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)

 at java.security.AccessController.doPrivileged(Native Method)

 at javax.security.auth.Subject.doAs(Subject.java:422)

 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)

 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)

 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)

 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)

 at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)

 at java.util.concurrent.FutureTask.run(FutureTask.java:266)

 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:1, Vertex vertex_1516123265840_0008_8_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1516123265840_0008_8_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1516123265840_0008_8_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1, Query: select *

from nomura_qa_mblock_capacity_stage

order by rollingtime

limit 10.  

select *

from nomura_qa_mblock_capacity_stage

order by rollingtime

limit 10;

09:56:17 END Execution 1 statement(s) executed, 0 row(s) affected, exec/fetch time: 0.504/0.000 secs [0 successful, 1 errors]

1 answer:

Answer 0 (score: 0)

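Sometimes this happens because Hive cannot write to the configured or hard-coded .staging directory, usually because the path is wrong or the permissions are invalid.

The mapreduce exec engine is more verbose than the tez engine and can help you identify the culprit. Switch to it by running this query in the Hive shell: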

SET hive.execution.engine=mr

You may then see an error like the following:

  

Permission denied: user=dbuser, access=WRITE, inode="/user/dbuser/.staging":hdfs:hdfs:drwxr-xr-x

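In this case, the staging directory for "dbuser" is set to a path that does not exist; it should be /home/dbuser/.staging

At runtime, with the exec engine set to mr as shown above, and before running any query that needs an execution job (sort by, order by, group by, distribute by, etc.), you first need to point the staging path at a valid parent directory (for example the user's home directory) by running the following query: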
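To make the permission line above easier to read, here is a minimal Java sketch that parses it and checks the write bit that applies to the user. `StagingPermissionCheck` and its methods are hypothetical helpers written for this illustration, not part of Hive or Hadoop. It assumes plain owner/group/other mode bits and ignores HDFS superuser rights and ACLs, and whether the user belongs to the inode's group is passed in as a parameter because the message does not say.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it parses the error text, nothing more.
class StagingPermissionCheck {
    // Matches: user=<u>, access=<a>, inode="<path>":<owner>:<group>:<mode>
    private static final Pattern DENIED = Pattern.compile(
        "user=(\\w+), access=(\\w+), inode=\"([^\"]+)\":(\\w+):(\\w+):([d-][rwx-]{9})");

    // mode looks like "drwxr-xr-x": index 2 = owner write, 5 = group write, 8 = other write.
    static boolean canWrite(String user, String owner, boolean userInGroup, String mode) {
        int writeBit = user.equals(owner) ? 2 : (userInGroup ? 5 : 8);
        return mode.charAt(writeBit) == 'w';
    }

    // Summarizes who was denied, on which inode, and which mode bits applied.
    static String explain(String errorLine, boolean userInGroup) {
        Matcher m = DENIED.matcher(errorLine);
        if (!m.find()) {
            return "not a permission-denied line";
        }
        String user = m.group(1);
        String path = m.group(3);
        String owner = m.group(4);
        String mode = m.group(6);
        boolean ok = canWrite(user, owner, userInGroup, mode);
        return user + (ok ? " can" : " cannot") + " write to " + path
            + " (owner " + owner + ", mode " + mode + ")";
    }
}
```

For the line in this answer, the inode is owned by hdfs:hdfs with mode drwxr-xr-x, so only the owner has the write bit and dbuser is refused.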

SET yarn.app.mapreduce.am.staging-dir=/home/dbuser/.staging

Depending on your version and environment, if that setting does not work, you can try:

SET hive.exec.stagingdir=/home/dbuser/.staging

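Of course, change "dbuser" to the actual home directory (or any other directory with read/write permissions). Provided the user running the query has write access, the .staging directory will be created automatically.

More information: http://doc.mapr.com/display/MapR/Default+mapred+Parameters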
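SET commands only apply to the session that runs them, so from a JDBC client they have to go through the same connection as the query. The sketch below shows one way to do that with plain `java.sql`. The class `HiveStagingSession` and its method names are hypothetical, and it assumes the JDBC driver in use passes SET statements through to Hive unchanged, which may depend on the driver and its configuration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Hive or any JDBC driver.
class HiveStagingSession {
    // The two SET commands from the answer, with the staging directory as a parameter.
    static List<String> sessionSettings(String stagingDir) {
        return Arrays.asList(
            "SET hive.execution.engine=mr",
            "SET yarn.app.mapreduce.am.staging-dir=" + stagingDir);
    }

    // Applies the settings, then runs the query on the same connection and counts rows.
    static int runWithStaging(Connection conn, String stagingDir, String query)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String setting : sessionSettings(stagingDir)) {
                st.execute(setting);
            }
            int rows = 0;
            try (ResultSet rs = st.executeQuery(query)) {
                while (rs.next()) {
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

A caller would obtain the `Connection` from `DriverManager.getConnection` with its own Hive URL and pass, for example, `/home/dbuser/.staging` and the `order by` query from the question.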