Can anyone help with this issue on NiFi 1.3.0 and Hive? I get the same error with Hive 1.2 and Hive 2.1.1. The Hive table is partitioned, bucketed, and stored as ORC.
The partition gets created on HDFS, but the data fails during the write phase. Please check the logs below:
[5:07 AM] papesdiop: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
[5:13 AM] papesdiop: I get in log see next, hope it might help too:
[5:13 AM] papesdiop: Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='mydb', table='guys', partitionVals=[dev] }
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)
Full stack trace:
Reconnecting. org.apache.thrift.transport.TTransportException: null
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
	at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
	at com.sun.proxy.$Proxy126.lock(Unknown Source)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
	at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
	at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
	at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
	at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
	at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
	at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
	at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
	at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
	at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2017-09-07 06:41:31,015 DEBUG [Timer-4] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Start sending heartbeat on all writers
2017-09-07 06:41:31,890 INFO [Timer-Driven Process Thread-3] hive.metastore Trying to connect to metastore with URI thrift://localhost:9083
2017-09-07 06:41:31,893 INFO [Timer-Driven Process Thread-3] hive.metastore Connected to metastore.
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Failed to create HiveWriter for endpoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }: org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
	at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:79)
	at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46)
	at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:964)
	at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:875)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$null$40(PutHiveStreaming.java:676)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$44(PutHiveStreaming.java:673)
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2136)
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2106)
	at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:627)
	at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$36(PutHiveStreaming.java:551)
	at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
	at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
	at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:551)
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.util.hive.HiveWriter$TxnBatchFailure: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
	at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:264)
	at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:73)
	... 24 common frames omitted
Caused by: org.apache.hive.hcatalog.streaming.TransactionError: Unable to acquire lock on {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=[dev] }
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:578)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransaction(HiveEndPoint.java:547)
	at org.apache.nifi.util.hive.HiveWriter.nextTxnBatch(HiveWriter.java:261)
	... 25 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: null
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:3906)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:3893)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:1863)
	at sun.reflect.GeneratedMethodAccessor380.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
	at com.sun.proxy.$Proxy126.lock(Unknown Source)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.beginNextTransactionImpl(HiveEndPoint.java:573)
	... 27 common frames omitted
2017-09-07 06:41:31,911 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Error connecting to Hive endpoint: table guys at thrift://localhost:9083
2017-09-07 06:41:31,911 DEBUG [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds
2017-09-07 06:41:31,912 ERROR [Timer-Driven Process Thread-3] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=13ed53d2-015e-1000-c7b1-5af434c38751] Hive Streaming connect/write error, flow file will be penalized and routed to retry. org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://localhost:9083', database='default', table='guys', partitionVals=
The Hive table:
CREATE TABLE mydb.guys (
  firstname string,
  lastname string)
PARTITIONED BY (job string)
CLUSTERED BY (firstname) INTO 10 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS ORC
LOCATION 'hdfs://localhost:9000/user/papesdiop/guys'
TBLPROPERTIES ('transactional'='true')
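One thing worth verifying for a table like this: the "Unable to acquire lock" TransactionError above typically appears when the metastore is not running a transactional lock manager. Hive Streaming into a transactional table requires ACID support to be enabled on the metastore side. A minimal hive-site.xml sketch is below; the property names come from the standard Hive transaction configuration, while the specific values (e.g. a single compactor worker thread) are assumptions for a single-node setup like this one:

```xml
<!-- hive-site.xml: transaction/lock support needed by Hive Streaming -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

After changing these, the metastore service needs a restart for the DbTxnManager to take effect.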
Thanks in advance.
Answer 0 (score 0):
If it is failing during the write to HDFS, your user may not have permission to write to the target directory. If you have more information from the full stack trace, please add it to your question, as it helps diagnose the problem. When I ran into this issue, it was because my NiFi user needed to be created on the target OS and added to the appropriate HDFS group in order to get permission for PutHiveStreaming to write the ORC files in HDFS.
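As a rough sketch of that permissions check (the NiFi service user name `nifi` and the group name `hadoop` are assumptions; substitute whatever user your NiFi process actually runs as and whatever group your HDFS install uses):

```shell
# See who owns the table's target directory in HDFS
hdfs dfs -ls /user/papesdiop

# If the NiFi user cannot write there, grant ownership
# (run as the HDFS superuser):
sudo -u hdfs hdfs dfs -chown -R nifi:hadoop /user/papesdiop/guys

# Make sure a matching OS user exists on the Hive/HDFS host
# and belongs to the group HDFS recognizes:
sudo useradd nifi
sudo usermod -aG hadoop nifi
```

Then restart the PutHiveStreaming processor and watch whether the ConnectFailure recurs.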