Hive with Tez: "No input paths specified in job"

Asked: 2015-07-06 07:08:24

Tags: hive hadoop2 apache-tez

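I was using hadoop-0.20.x.x with hive-0.11.0. This question is about Hive queries: with that configuration everything worked fine. We have now upgraded to hadoop-2.6.x (hadoop2) and hive-0.14.x, and we are also using Apache Tez.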

The problem is that Hadoop itself works as before, but Hive SQL queries do not. The following query ran fine on the old versions but throws an error after the upgrade:

QUERY:
SELECT abc.property_name, xyz.date, xyz.time, xyz.value_as_number, xyz.value_units
FROM dbname.xyz
JOIN dbname.abc ON (xyz.id = abc.src_id)
WHERE xyz.person_id=138312;

Exception:

INFO  : Session is already open
INFO  : Tez session was closed. Reopening...
INFO  : Session re-established.
INFO  :

INFO  : Status: Running (Executing on YARN cluster with App id application_1435524970199_0035)

INFO  : Map 1: -/-      Map 2: -/-
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1435524970199_0035_1_00, diagnostics=[Vertex vertex_1435524970199_0035_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: concept initializer failed, vertex=vertex_1435524970199_0035_1_00 [Map 1], java.io.IOException: No input paths specified in job
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]
ERROR : Vertex failed, vertexName=Map 2, vertexId=vertex_1435524970199_0035_1_01, diagnostics=[Vertex vertex_1435524970199_0035_1_01 [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: observation initializer failed, vertex=vertex_1435524970199_0035_1_01 [Map 2], java.io.IOException: No input paths specified in job
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
]
ERROR : DAG failed due to vertex failure. failedVertices:2 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=2)

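The exception says "No input paths specified in job". I understand that error and know how to fix it in a Hadoop MapReduce program, but how do we do that for a Hive query? In any case, I don't think the fix is the same.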

Note that I tried both the hive shell and the beeline shell. The hive shell returned the expected output, but beeline returned the same exception as above.

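The curious part is that queries against a single table work fine, but as soon as I use a JOIN, the exception above is thrown. As far as I can tell, Apache Tez is what affects my query. Can anyone suggest a solution, or point me to a Tez reference so I can read up on it and rewrite the query accordingly? Thanks.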

1 Answer:

Answer 0: (score: 0)

It worked after disabling Apache Tez. It looks like Apache Tez is not stable yet.
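
The answer does not say how Tez was disabled. A minimal sketch of one way to do it for a single session, assuming Tez had been enabled through the hive.execution.engine property (an assumption, not something stated in the thread):

```sql
-- Print the execution engine currently in effect for this session.
SET hive.execution.engine;

-- Fall back to classic MapReduce for this session only.
-- Assumption: Tez was switched on via this property (e.g. in hive-site.xml).
SET hive.execution.engine=mr;

-- Re-run the failing join from the question.
SELECT abc.property_name, xyz.date, xyz.time, xyz.value_as_number, xyz.value_units
FROM dbname.xyz
JOIN dbname.abc ON (xyz.id = abc.src_id)
WHERE xyz.person_id=138312;
```

A session-level SET only affects the current hive or beeline connection. To change the default for every session, the same property would need to be set in hive-site.xml.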