We have a large number of Hive unit tests that run against a Hadoop minicluster. The problem is that they run sequentially, so each build takes about an hour to complete. We would like to parallelize the Hive unit tests by running multiple HiveServer2 instances load-balanced through ZooKeeper.
Connecting directly to a HiveServer2 instance with the connection string "jdbc:hive2://localhost:20103/default" works as expected. However, connecting through ZooKeeper with the connection string "jdbc:hive2://localhost:22010/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" fails with the error shown further below.
Is the ZooKeeper instance in the Hadoop minicluster capable of load balancing?
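For reference, here is a minimal sketch of the two connection attempts, assuming the hive-jdbc driver is on the classpath; the class name, user, and password are illustrative and not taken from the actual test code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class HiveJdbcSmokeTest {

    public static void main(String[] args) throws SQLException {
        // Direct connection to a single HiveServer2 instance -- works as expected.
        String directUrl = "jdbc:hive2://localhost:20103/default";

        // Connection through ZooKeeper service discovery -- fails with
        // "Unable to read HiveServer2 configs from ZooKeeper".
        String zkUrl = "jdbc:hive2://localhost:22010/default;"
                + "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";

        try (Connection direct = DriverManager.getConnection(directUrl, "user", "")) {
            System.out.println("Direct connection OK");
        }
        try (Connection viaZk = DriverManager.getConnection(zkUrl, "user", "")) {
            System.out.println("ZooKeeper discovery connection OK");
        }
    }
}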
INFO: Connecting to : jdbc:hive2://localhost:22010/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
java.sql.SQLException: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:135)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:80)
at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:505)
at org.apache.hive.jdbc.Utils.parseURL(Utils.java:425)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:133)
... 29 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hiveserver2
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:214)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:203)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:199)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:191)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:38)
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:63)
... 32 more
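The root cause, NoNodeException for /hiveserver2, indicates that the driver reached ZooKeeper on 127.0.0.1:22010 but found no /hiveserver2 znode, i.e. no HiveServer2 instance had registered itself for discovery. One way to confirm this is to inspect the znode directly, for example with Curator; this is a sketch assuming the embedded ZooKeeper from the configuration below is running:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkNamespaceCheck {

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:22010", new ExponentialBackoffRetry(1000, 3));
        client.start();
        try {
            if (client.checkExists().forPath("/hiveserver2") == null) {
                System.out.println("/hiveserver2 does not exist -- no HS2 has registered");
            } else {
                // Each registered HiveServer2 instance appears as one child znode.
                System.out.println(client.getChildren().forPath("/hiveserver2"));
            }
        } finally {
            client.close();
        }
    }
}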
Versions used
<hive.version>1.2.1000.2.4.0.0-169</hive.version>
<hadoop.version>2.7.1.2.4.0.0-169</hadoop.version>
<minicluster.version>0.1.14</minicluster.version>
Server configuration
public HiveServerRunner() {
    // Embedded ZooKeeper used for HiveServer2 service discovery.
    zookeeperLocalCluster = new ZookeeperLocalCluster.Builder()
            .setPort(22010)
            .setTempDir("embedded_zk")
            .setZookeeperConnectionString("127.0.0.1:22010")
            .setDeleteDataDirectoryOnClose(true)
            .build();

    // Local Hive metastore backed by an embedded Derby database.
    hiveLocalMetaStore = new HiveLocalMetaStore.Builder()
            .setHiveMetastoreHostname("localhost")
            .setHiveMetastorePort(20102)
            .setHiveMetastoreDerbyDbDir("metastore_db")
            .setHiveScratchDir("hive_scratch_dir")
            .setHiveWarehouseDir("warehouse_dir")
            .setHiveConf(buildHiveConf())
            .build();

    // HiveServer2 instance pointed at the metastore above and at the
    // embedded ZooKeeper quorum.
    hiveLocalServer2 = new HiveLocalServer2.Builder()
            .setHiveServer2Hostname("localhost")
            .setHiveServer2Port(20103)
            .setHiveMetastoreHostname("localhost")
            .setHiveMetastorePort(20102)
            .setHiveMetastoreDerbyDbDir("metastore_db")
            .setHiveScratchDir("hive_scratch_dir")
            .setHiveWarehouseDir("warehouse_dir")
            .setHiveConf(buildHiveConf())
            .setZookeeperConnectionString("127.0.0.1:22010")
            .build();
}
public static HiveConf buildHiveConf() {
    HiveConf hiveConf = new HiveConf();
    hiveConf.set("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
    hiveConf.set("hive.compactor.initiator.on", "true");
    hiveConf.set("hive.compactor.worker.threads", "5");
    hiveConf.set("hive.root.logger", "DEBUG,console");
    hiveConf.set("hadoop.bin.path", System.getenv("HADOOP_HOME") + "/bin/hadoop");
    hiveConf.set("hive.exec.submit.local.task.via.child", "false");
    // Enable HiveServer2 registration in ZooKeeper for dynamic service discovery.
    hiveConf.set("hive.server2.support.dynamic.service.discovery", "true");
    hiveConf.set("hive.zookeeper.quorum", "127.0.0.1:22010");
    hiveConf.setInt("hive.metastore.connect.retries", 3);
    System.setProperty("HADOOP_HOME", WindowsLibsUtils.getHadoopHome());
    return hiveConf;
}
Answer (score 0)
It looks like ZooKeeper does take care of the load balancing here by directing the client's request to a random available HS2 instance.
See the link below for more details.
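For illustration, the selection the Hive JDBC driver makes with serviceDiscoveryMode=zooKeeper amounts to picking one random entry from the registered HS2 instances (the real logic lives in org.apache.hive.jdbc.ZooKeeperHiveClientHelper); a sketch with illustrative names:

import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class RandomHs2Picker {

    // Given one entry per registered HS2 instance (the children of the
    // /hiveserver2 znode), pick one at random -- the driver does not track
    // the actual load on each server.
    static String pick(List<String> registeredServers) {
        if (registeredServers.isEmpty()) {
            throw new IllegalStateException("No HiveServer2 instances registered");
        }
        return registeredServers.get(new Random().nextInt(registeredServers.size()));
    }

    public static void main(String[] args) {
        // Placeholder entries; the real znode names encode each server's URI.
        System.out.println(pick(Arrays.asList("hs2-instance-0", "hs2-instance-1")));
    }
}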