Hive: loading a file from HDFS into a Hive table with Talend fails

Asked: 2014-09-04 15:12:10

Tags: hadoop hive cloudera talend

I am using Talend 5.4/5.5 to connect to CDH 5.1 on a three-node cluster:

N1: CM, Hive (all services), DataNode, ZooKeeper, etc.
N2: RM, DataNode
N3: DataNode

When I try to load data from HDFS into a Hive table through Talend, it fails, even though the same command works fine from the Hive CLI:

hive> LOAD DATA  INPATH '/user/thor/test/rev_sub.txt' INTO TABLE revenue_subs;

When I run the Talend job that uses the tHiveLoad component, I get the following exception:

[INFO ]: hive.metastore - Trying to connect to metastore with URI thrift://txwlcloud1:9083
[WARN ]: org.apache.hadoop.security.UserGroupInformation - No groups available for user thor
[INFO ]: hive.metastore - Waiting 1 seconds before next connection attempt.
[INFO ]: hive.metastore - Connected to metastore.
[ERROR]: org.apache.hadoop.hive.ql.Driver - FAILED: SemanticException Line 1:17 Invalid path ''/user/thor/test/rev_sub.txt''
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:17 Invalid path ''/user/thor/test/rev_sub.txt''
at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:148)
at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:229)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:355)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:82)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:129)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:209)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:191)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:197)
at big_data.hivejob_0_1.HIVEJob.tHiveLoad_1Process(HIVEJob.java:375)
at big_data.hivejob_0_1.HIVEJob.runJobInTOS(HIVEJob.java:645)
at big_data.hivejob_0_1.HIVEJob.main(HIVEJob.java:504)
Caused by: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: "TXWLHPW295/10.215.206.241"; destination host is: "txwlcloud2":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:763)
at org.apache.hadoop.ipc.Client.call(Client.java:1241)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)

I have been struggling with this for a while.

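Possible causes I can think of: 1) A JDBC driver problem. Do I have to put the JDBC driver jar somewhere on the cluster, or is it already there? 2) Something related to the remote metastore.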

It would be very helpful if you could point out why the load fails in the tHiveLoad component.

When I run beeline> !connect jdbc:hive2://10.215.204.xyz:10000 thor org.apache.hive.jdbc.HiveDriver, the connection is established correctly.
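To check whether the failure is specific to Talend, the same LOAD DATA statement could be sent over plain JDBC from the machine that runs the job. The following is only a minimal sketch under assumptions: the class name HiveLoadCheck and its two helper methods are hypothetical, the HiveServer2 host is passed as an argument because the address above is elided, an empty password is assumed for user thor, and the Hive JDBC driver jar is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical helper class, not part of Talend or Hive.
class HiveLoadCheck {

    // Builds a HiveServer2 JDBC URL in the same form used with beeline.
    static String jdbcUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port;
    }

    // Builds the same LOAD DATA statement that works from the Hive CLI.
    static String loadStatement(String hdfsPath, String table) {
        return "LOAD DATA INPATH '" + hdfsPath + "' INTO TABLE " + table;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java HiveLoadCheck <hiveserver2-host>");
            return;
        }
        // Driver class taken from the beeline command in the question.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumption: user "thor" with an empty password, port 10000.
        try (Connection conn = DriverManager.getConnection(jdbcUrl(args[0], 10000), "thor", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute(loadStatement("/user/thor/test/rev_sub.txt", "revenue_subs"));
            System.out.println("LOAD DATA succeeded");
        }
    }
}
```

If this runs cleanly with the cluster's own Hive JDBC jars but the Talend job still fails, that would point toward the libraries bundled with the job rather than the cluster configuration; if it fails the same way, the problem is probably not Talend-specific.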

Thanks,
Amit

0 answers:

No answers yet