I am using the Hive Streaming Data Ingest API, described at https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest. So that my version numbers are visible, my Maven declarations are as follows:
<dependency>
    <groupId>org.apache.hive.hcatalog</groupId>
    <artifactId>hive-hcatalog-streaming</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive.hcatalog</groupId>
    <artifactId>hive-hcatalog-core</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.0</version>
</dependency>
I have HDP 2.5.0.0-1245 installed. All services are running on a single node for testing.
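For reference, here is a sketch of how the dependencies could instead be pinned to the HDP stack's own Hive build rather than the Apache 2.1.0 release. The version string below is my guess at the HDP 2.5.0.0-1245 coordinates and would need to be verified against the hive jars actually installed on the cluster (e.g. under /usr/hdp/current/hive-client/lib):

<!-- Hypothetical: versions matched to the HDP stack instead of Apache 2.1.0.
     The build version string is an assumption; verify it against the
     cluster's hive jars before using it. -->
<repositories>
    <repository>
        <id>hortonworks-releases</id>
        <url>https://repo.hortonworks.com/content/repositories/releases/</url>
    </repository>
</repositories>

<dependency>
    <groupId>org.apache.hive.hcatalog</groupId>
    <artifactId>hive-hcatalog-streaming</artifactId>
    <version>1.2.1000.2.5.0.0-1245</version>
</dependency>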
I declare my connection as follows:
public HiveProcess(String dbname, String tablename, List<String> partitionVals)
{
    hiveEP = new HiveEndPoint("thrift://" + HadoopProperties.HIVE_HOST + ":" + HadoopProperties.HIVE_THRIFT_PORT, dbname, tablename, partitionVals);
    try {
        // generate a timestamped agent string so this client is identifiable in the logs
        String agent = "tablename-";
        Date curdate = Calendar.getInstance().getTime();
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
        agent = agent + format.format(curdate);
        hiveStream = hiveEP.newConnection(true, agent);
    } catch (ConnectionError | InvalidPartition | InvalidTable | PartitionCreationFailed | ImpersonationFailed | InterruptedException | UndeclaredThrowableException e) {
        log.error("HiveProcess constructor: " + e.getMessage());
    }
}
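For completeness, this is roughly how the class is driven. The database, table, and partition values below are illustrative placeholders, not my real ones; the partition values must line up, in order, with the table's partition columns:

import java.util.Arrays;
import java.util.List;

public class HiveProcessDemo {
    public static void main(String[] args) {
        // "mydb" / "mytable" and the partition values are placeholders;
        // a table partitioned by (year string, month string) is assumed here.
        List<String> partitionVals = Arrays.asList("2016", "10");
        HiveProcess hp = new HiveProcess("mydb", "mytable", partitionVals);
    }
}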
It creates the connection successfully, and then runs:
public boolean processBatch(List<byte[]> toInsert) {
    try {
        StrictJsonWriter write = new StrictJsonWriter(hiveEP);
        TransactionBatch txnBatch = hiveStream.fetchTransactionBatch(toInsert.size(), write);
        // open the TransactionBatch
        txnBatch.beginNextTransaction();
        // loop through records for insertion
        for (byte[] ins : toInsert) {
            txnBatch.write(ins);
        }
        // commit batch
        txnBatch.commit();
        txnBatch.close();
        // clean up connection, we are done until next time
        hiveStream.close();
        // if we made it to here, this is a success
        return true;
    } catch (StreamingException | InterruptedException | UndeclaredThrowableException e) {
        log.error("HiveProcess processBatch " + e.getMessage());
        // clean up connection, we are done until next time
        hiveStream.close();
        return false;
    }
}
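The byte arrays passed to processBatch are UTF-8 encoded JSON records; as I understand StrictJsonWriter, each record must be a single JSON object whose field names match the target table's column names. A sketch of a caller (the column names here are invented for illustration):

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchDemo {
    public static void main(String[] args) {
        // One JSON object per row; "id" and "msg" are hypothetical columns
        // that would have to exist in the target table.
        List<byte[]> batch = new ArrayList<>();
        batch.add("{\"id\":1,\"msg\":\"hello\"}".getBytes(StandardCharsets.UTF_8));
        batch.add("{\"id\":2,\"msg\":\"world\"}".getBytes(StandardCharsets.UTF_8));

        HiveProcess hp = new HiveProcess("mydb", "mytable", Arrays.asList("2016", "10"));
        boolean ok = hp.processBatch(batch);
        System.out.println("batch committed: " + ok);
    }
}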
The error is raised at this line of the HiveEndPoint source:
localSession = SessionState.start(new CliSessionState(conf));
The log output looks like this:
15:17:22.634 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:9083
15:17:22.657 [main] INFO hive.metastore - Opened a connection to metastore, current connections: 1
15:17:22.672 [main] INFO hive.metastore - Connected to metastore.
15:17:41.306 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:9083
15:17:41.311 [main] INFO hive.metastore - Opened a connection to metastore, current connections: 2
15:17:41.325 [main] INFO hive.metastore - Connected to metastore.
15:17:41.604 [main] WARN hive.ql.metadata.Hive - Failed to register all functions.
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.TApplicationException: Invalid method name: 'get_all_functions'
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3593) ~[hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236) ~[hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:545) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:513) [hive-exec-2.1.0.jar:2.1.0]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.createPartitionIfNotExists(HiveEndPoint.java:445) [hive-hcatalog-streaming-2.1.0.jar:2.1.0]
Can anyone help me figure out what is going wrong here, and what I need to do to correct it?
Thanks!