Asynchbase throws lots of zookeeper timeouts when doing many inserts, and eventually runs out of memory. Why?

Posted: 2014-06-10 12:15:08

Tags: hbase

I'm trying to test Asynchbase's performance by running a simple "insert some random data as fast as possible" test. The test code looks roughly like this:

// zookeeperServer, numberOfRows (10m) and batchSize (10k) are defined elsewhere
org.hbase.async.HBaseClient client = new org.hbase.async.HBaseClient(zookeeperServer);
client.setFlushInterval((short) 250);
Random rand = new Random();
long stoptime, elapsedTime;
double elapsedSeconds;
int counter = 0;
long absolutestart = System.nanoTime();
long starttime = absolutestart;

for (int xx = 1; xx <= numberOfRows; xx++) {
    byte[] rowKey = new byte[20];
    rand.nextBytes(rowKey);
    byte[] data = new byte[50];
    rand.nextBytes(data);
    counter += 1;
    PutRequest put = new PutRequest(Runner.TABLENAME.getBytes(), rowKey, "data".getBytes(), "col".getBytes(), data);
    client.put(put);

    // Flush the puts if we've hit the batch size
    if (xx % batchSize == 0) {
        client.flush();
        stoptime = System.nanoTime();
        elapsedTime = stoptime - starttime;
        elapsedSeconds = (double) elapsedTime / 1000000000.0;
        System.out.println(String.format("%d completed in %.6f s, %d ns, %.6f /s", counter, elapsedSeconds, elapsedTime, batchSize / elapsedSeconds));
        counter = 0;
        starttime = System.nanoTime();
    }
}
client.shutdown();
stoptime = System.nanoTime();
System.out.println(String.format("%d inserts in %.6f s, %d ns, %.6f msg/s", numberOfRows, ((double) (stoptime - absolutestart)) / 1000000000.0, stoptime - absolutestart, numberOfRows / (((double) (stoptime - absolutestart)) / 1000000000.0)));
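As an aside on the API: put(), flush() and shutdown() all return Deferreds, which the code above ignores. flush() is asynchronous, so the fast per-batch timings below mostly measure local buffering rather than completed writes, and any failed puts disappear silently. A minimal self-contained sketch that joins on each flush and logs put failures might look like this (assuming asynchbase's stock Deferred API; the table and column names here are placeholders):

import java.util.Random;
import org.hbase.async.HBaseClient;
import org.hbase.async.PutRequest;
import com.stumbleupon.async.Callback;

public class AsyncPutSketch {
    public static void main(final String[] args) throws Exception {
        final HBaseClient client = new HBaseClient(args[0]);  // zookeeper quorum spec
        client.setFlushInterval((short) 250);
        final Random rand = new Random();
        final int numberOfRows = 10000000;  // 10m rows, as in the test above
        final int batchSize = 10000;        // 10k puts per batch
        // Shared errback: without one, put failures are silently dropped.
        final Callback<Object, Exception> logError = new Callback<Object, Exception>() {
            public Object call(final Exception e) {
                System.err.println("put failed: " + e);
                return null;
            }
        };
        for (int xx = 1; xx <= numberOfRows; xx++) {
            final byte[] rowKey = new byte[20];
            rand.nextBytes(rowKey);
            final byte[] data = new byte[50];
            rand.nextBytes(data);
            // "testtable"/"data"/"col" are placeholder names
            client.put(new PutRequest("testtable".getBytes(), rowKey,
                    "data".getBytes(), "col".getBytes(), data)).addErrback(logError);
            if (xx % batchSize == 0) {
                // Block until this batch has actually reached the region servers,
                // keeping the number of outstanding RPCs bounded.
                client.flush().joinUninterruptibly();
            }
        }
        client.shutdown().joinUninterruptibly();  // also flushes anything still buffered
    }
}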

I have a similar codebase for testing the stock HBase and Thrift APIs. Those tests work fine, but the Asynchbase code runs into zookeeper session timeouts; eventually the inserts stall and an OutOfMemoryError is thrown. I set the number of rows to insert to 10m and the batch size to 10k. The output looks like this:

10000 completed in 0.006476 s, 6476111 ns, 1544136.596794 /s
14/06/10 12:02:09 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 3729ms for sessionid 0x34656836aec21e8, closing socket connection and attempting reconnect
10000 completed in 2.850325 s, 2850325467 ns, 3508.371278 /s
10000 completed in 0.006344 s, 6343748 ns, 1576355.176782 /s
10000 completed in 0.006112 s, 6111646 ns, 1636220.422452 /s
10000 completed in 0.006011 s, 6010528 ns, 1663747.344659 /s
10000 completed in 0.006280 s, 6280169 ns, 1592313.837414 /s
14/06/10 12:02:10 INFO zookeeper.ZooKeeper: Session: 0x34656836aec21e8 closed
14/06/10 12:02:10 WARN async.HBaseClient: No longer connected to ZooKeeper, event=WatchedEvent state:Disconnected type:None path:null
14/06/10 12:02:10 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=trth-hadoop01,trth-hadoop02,trth-hadoop03 sessionTimeout=5000 watcher=org.hbase.async.HBaseClient$ZKClient@220b1039
14/06/10 12:02:10 INFO zookeeper.ClientCnxn: Opening socket connection to server zookeeperhost3/10.52.136.91:2181. Will not attempt to authenticate using SASL (unknown error)
14/06/10 12:02:10 INFO zookeeper.ClientCnxn: EventThread shut down
14/06/10 12:02:11 INFO zookeeper.ClientCnxn: Socket connection established to zookeeperhost3/10.52.136.91:2181, initiating session
10000 completed in 0.972888 s, 972887644 ns, 10278.679210 /s
14/06/10 12:02:11 INFO zookeeper.ClientCnxn: Session establishment complete on server zookeeperhost3/10.52.136.91:2181, sessionid = 0x34656836aec21e9, negotiated timeout = 5000
14/06/10 12:02:11 INFO async.HBaseClient: Connecting to .META. region @ 10.52.136.101:60020
14/06/10 12:02:11 ERROR zookeeper.ClientCnxn: Caught unexpected throwable
java.lang.NoClassDefFoundError: Could not initialize class org.hbase.async.RegionClient
        at org.hbase.async.HBaseClient$RegionClientPipeline.init(HBaseClient.java:2630)
        at org.hbase.async.HBaseClient.newClient(HBaseClient.java:2579)
        at org.hbase.async.HBaseClient.access$2700(HBaseClient.java:179)
        at org.hbase.async.HBaseClient$ZKClient$ZKCallback.handleMetaZnode(HBaseClient.java:3276)
        at org.hbase.async.HBaseClient$ZKClient$ZKCallback.processResult(HBaseClient.java:3132)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:561)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
10000 completed in 0.007104 s, 7103562 ns, 1407744.452713 /s
10000 completed in 0.006117 s, 6117418 ns, 1634676.590679 /s
[... more inserts ...]
10000 completed in 2.284842 s, 2284841896 ns, 4376.670446 /s
10000 completed in 2.300403 s, 2300402665 ns, 4347.065039 /s
14/06/10 12:02:31 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 4931ms for sessionid 0x34656836aec21e9, closing socket connection and attempting reconnect
14/06/10 12:02:39 INFO zookeeper.ClientCnxn: Opening socket connection to server zookeeperhost2/10.52.136.90:2181. Will not attempt to authenticate using SASL (unknown error)
14/06/10 12:02:55 INFO zookeeper.ClientCnxn: Socket connection established to zookeeperhost2/10.52.136.90:2181, initiating session
14/06/10 12:03:13 INFO zookeeper.ZooKeeper: Session: 0x34656836aec21e9 closed
14/06/10 12:03:16 WARN async.HBaseClient: No longer connected to ZooKeeper, event=WatchedEvent state:Disconnected type:None path:null
14/06/10 12:03:18 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=zookeeperhost1,zookeeperhost2,zookeeperhost3 sessionTimeout=5000 watcher=org.hbase.async.HBaseClient$ZKClient@220b1039
14/06/10 12:03:29 INFO zookeeper.ClientCnxn: EventThread shut down
14/06/10 12:04:00 WARN util.ShutdownHookManager: ShutdownHook '' failed, java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.io.UnixFileSystem.resolve(UnixFileSystem.java:108)
        at java.io.File.<init>(File.java:236)
        at java.io.File.listFiles(File.java:1138)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:218)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:237)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:237)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:237)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:237)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:237)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:140)
        at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:111)
        at org.apache.hadoop.util.RunJar$1.run(RunJar.java:183)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

What is going on here? Have I made a bad assumption somewhere in my codebase, i.e. is asynchbase not suited to this kind of usage pattern? Or is there a bug/error somewhere?

1 Answer:

Answer 0 (score: 0)

I found the answer myself about 10 minutes after posting a bounty.

The problem was in how I was invoking the application. I was running a whole batch of tests, some mapreduce, some pure hbase, and I was launching the jar like this:

HADOOP_CLASSPATH=`hbase classpath` HADOOP_MAPRED_HOME=/usr/lib/hadoop/ yarn jar myjar.jar

Obviously, for the pure hbase tests you don't need MAPRED_HOME, and you don't need to run the jar through yarn at all. If I just run it with

java -cp `hbase classpath` -jar myjar.jar

I don't run into any problems.
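One footnote on that last command: java ignores -cp when -jar is given, so the hbase classpath there is never actually consulted and everything has to come from myjar.jar itself. If the external classpath matters, a variant that puts the jar on the classpath and names the main class explicitly (Runner below stands in for whatever the real main class is) would be:

java -cp "myjar.jar:`hbase classpath`" Runner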