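I have integrated Hive, HDFS, and Sqoop functionality into a Java application, and it works correctly. However, it executes everything sequentially.
So now I am trying to use multithreading to improve performance.
Here is a sample outline of my workflow.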
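1) The main Java class calls a function in a DAO class. (In this method I wrote several Hive operations, such as joins on tables, GROUP BY, and so on, and then insert the matching results into another table.)
2) That DAO class contains many steps (loading data into HDFS, filtering out records and inserting them into another table, etc.). I identified some steps that are independent of each other and put them into 4 different threads so that they execute in parallel.
3) In these 4 threads I call functions from other classes, which in turn run some different Hive queries. So basically I am performing several operations on Hive in parallel.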
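The layout described in steps 1) to 3) can be sketched roughly as follows. This is only a minimal illustration, not the actual code: the class name `ParallelStepsSketch`, the method `runAll`, and the step names are hypothetical, and each placeholder task stands in for a DAO call that runs Hive statements over JDBC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelStepsSketch {

    // Submits all independent steps to a fixed-size pool, waits for them to
    // finish, and returns their results in submission order.
    public static List<String> runAll(List<Callable<String>> steps, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every task has completed
            List<Future<String>> futures = pool.invokeAll(steps);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                // get() rethrows a task failure as ExecutionException
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> steps = new ArrayList<>();
        // Hypothetical step names; the real steps are the independent DAO operations
        for (String name : new String[] {"step-1", "step-2", "step-3", "step-4"}) {
            // Placeholder body: the real task would execute its Hive queries here
            steps.add(() -> name + " done");
        }
        for (String result : runAll(steps, 4)) {
            System.out.println(result);
        }
    }
}
```

How each task obtains its JDBC connection is deliberately left out of the sketch, since that depends on the setup.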
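So here are my questions:
1) Can Hive queries be executed in a multithreaded way?
2) If so, which parameters need to be set in the hive-site.xml file?
3) Can Hive handle concurrent JDBC statements?
Here are the exceptions I get: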
java.sql.SQLException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:262)
at org.ecl.dao.HadoopDBOperationsDAO.insertNonUTIValueMatchTrades_intoTPCP(HadoopDBOperationsDAO.java:133)
at org.ecl.service.DTCCService_MultiThreading_Runnable.callHadoopServiceforLoadData(DTCCService_MultiThreading_Runnable.java:133)
at org.ecl.DTCCHiveClient.main(DTCCHiveClient.java:26)
Caused by: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:426)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_ExecuteStatement(TCLIService.java:225)
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:212)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:253)
... 3 more
java.sql.SQLException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:262)
at org.ecl.dao.DTCCPairingLogicDAO.fieldBreak(DTCCPairingLogicDAO.java:60)
at org.ecl.thread.FieldBreakThread.run(FieldBreakThread.java:41)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:426)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_ExecuteStatement(TCLIService.java:225)
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:212)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:253)
... 3 more
java.sql.SQLException: org.apache.thrift.TApplicationException: ExecuteStatement failed: out of sequence response
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:262)
at org.ecl.dao.HadoopDBOperationsDAO.recordswithMatchedUTIValue(HadoopDBOperationsDAO.java:149)
at org.ecl.thread.UTIValueMatchedThread.run(UTIValueMatchedThread.java:40)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift.TApplicationException: ExecuteStatement failed: out of sequence response
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_ExecuteStatement(TCLIService.java:225)
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:212)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:253)
... 3 more
0 Trade records
Breaks identified in trade records matching on UTI
java.sql.SQLException: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 3072 bytes, but only got 175 bytes. (This is often indicative of an internal error on the server side. Please check your server logs.)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:262)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
at org.ecl.dao.DTCCOperationDAO.countRows(DTCCOperationDAO.java:45)
at org.ecl.thread.FieldBreakThread.run(FieldBreakThread.java:42)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift.transport.TTransportException: Cannot read. Remote side has closed. Tried to read 3072 bytes, but only got 175 bytes. (This is often indicative of an internal error on the server side. Please check your server logs.)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:354)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_ExecuteStatement(TCLIService.java:225)
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:212)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:253)
... 4 more
Please help me resolve this issue.