NiFi PutHiveStreaming processor with Hive: Failed connecting to EndPoint

Date: 2017-08-31 14:13:02

Tags: java hive apache-nifi orc hortonworks-dataflow

Could someone help with this problem on NiFi 1.3.0 and Hive? I get the same error with both Hive 1.2 and Hive 2.1.1. The Hive table is partitioned, bucketed, and stored as ORC.

The partitions are created on HDFS, but writing the data fails. See the logs below:

[5:07 AM] papesdiop: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
[5:13 AM] papesdiop: I get in log see next, hope it might help too:
[5:13 AM] papesdiop: Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
  at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)

Full stack trace:

Reconnecting. org.apache.thrift.transport.TTransportException: null
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
    at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
    at com.sun.proxy.$Proxy126.lock(Unknown Source)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
    at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
    at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
    at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
    at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
    at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2017-09-07 06:41:31,015 DEBUG [Timer-4] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Start sending heartbeat on all writers
2017-09-07 06:41:31,890 INFO [Timer-Driven Process Thread-3] hive.metastore Trying to connect to metastore with URI thrift://localhost:9083
2017-09-07 06:41:31,893 INFO [Timer-Driven Process Thread-3] hive.metastore Connected to metastore.
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Failed to create HiveWriter for endpoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }: org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
    at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:79)
    at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
    at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
    at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.util.hive.HiveWriter$TxnBatchFailure: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
    at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:264)
    at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
    ... 24 common frames omitted
Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
    at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
    ... 25 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: null
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
    at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
    at com.sun.proxy.$Proxy126.lock(Unknown Source)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
    ... 27 common frames omitted
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Error connecting to Hive endpoint: table guys at thrift://localhost:9083
2017-09-07 06:41:31,911 DEBUG [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds
2017-09-07 06:41:31,912 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Hive Streaming connect/write error, flow file will be penalized and routed to retry. org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=

Hive table:


CREATE TABLE mydb.guys(
  firstname string,
  lastname string)
PARTITIONED BY (
  job string)
CLUSTERED BY (
  firstname)
INTO 10 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS ORC
LOCATION
  'hdfs://localhost:9000/user/papesdiop/guys'
TBLPROPERTIES ('transactional'='true')
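For the streaming API to acquire table locks, the Hive metastore must also be configured for ACID transactions; the `TransactionError: Unable to acquire lock` in the trace above is a symptom commonly seen when this is missing. A minimal hive-site.xml sketch of the usual settings (the worker-thread count is illustrative and should match your cluster):

```
<!-- hive-site.xml: settings commonly required for Hive transactions / streaming ingest -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

After changing these, restart the metastore so the lock manager is active before retrying the NiFi flow.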

Thanks in advance.

1 Answer:

Answer 0 (score: 0):

If it is failing during the write to HDFS, perhaps your user does not have permission to write to the target directory? If you have more information from the full stack trace, please add it to your question, as it helps diagnose the problem. When I ran into this issue, it was because my NiFi user needed to be created on the target OS and added to the appropriate HDFS group(s) in order for PutHiveStreaming to get permission to write the ORC files to HDFS.
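A quick way to check the permission hypothesis from the command line (the warehouse path and the `nifi:hdfs` user/group names below are assumptions; substitute the values from your own cluster):

```
# Inspect ownership and permissions on the table's warehouse directory
hdfs dfs -ls -d /user/papesdiop/guys

# If the NiFi service user lacks access, grant it (run as an HDFS superuser);
# the user and group names here are illustrative
hdfs dfs -chown -R nifi:hdfs /user/papesdiop/guys
hdfs dfs -chmod -R 770 /user/papesdiop/guys
```

If the listing already shows the NiFi user with write access, the problem is more likely the metastore transaction/lock configuration than HDFS permissions.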