HBase Java client batch inserts slow after upgrading to CDH 4.6

Time: 2014-07-08 12:19:07

Tags: java hadoop hbase cloudera-cdh

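I am using HBase, managed by CDH4 (currently 4.5), to store application logs. After upgrading to CDH 4.6 (the same happens with 4.7), inserts became very slow. I found that the client connects to the regionserver and then immediately closes the connection (I did not have this problem with CDH 4.5).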

RegionServer log:

13:46:08,428 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ZK03:2181,ZK02:2181,ZK01:2181 sessionTimeout=60000 watcher=hconnection
13:46:08,429 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 19573@NODE01
13:46:08,429 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ZK03/10.1.243.170:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
13:46:08,429 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ZK03/10.1.243.170:2181, initiating session
13:46:08,431 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ZK03/10.1.243.170:2181, sessionid = 0x146a9fec35171f0, negotiated timeout = 60000
13:46:08,538 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x146a9fec35171f0
13:46:08,540 INFO org.apache.zookeeper.ZooKeeper: Session: 0x146a9fec35171f0 closed
13:46:08,540 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down

Client connection class:

private void initConnection(Configuration hConf) throws RuntimeException {
    try {
        //HConnectionManager.create(hConf);
        hConnection = HConnectionManager.createConnection(hConf);
    } catch (ZooKeeperConnectionException e) {
        logAndThrow("Failed to init connection " + e.getMessage());
    }
}

public Connection(Configuration hConf) {
    initConnection(hConf);
}

public void closeConnection() throws IOException {
    hConnection.close();
}

public HTableInterface getHTableInterface(String tableName) throws IOException {
    HTableInterface htable = hConnection.getTable(tableName);
    htable.setAutoFlush(false, true);
    htable.setWriteBufferSize(1024*1024*12);
    return htable;
}

Import code:

Put put = new Put(rowKey.get(), tsWhole);
mainTableBuffer.add(put);
if(cfg_.maxBatchBufferSize <= mainTableBuffer.size()) {
    mainTableInterface_.batch(mainTableBuffer);
    mainTableBuffer.clear();
}
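The import loop above buffers Put objects and calls batch() once the buffer reaches cfg_.maxBatchBufferSize. As a minimal sketch of that pattern without any HBase dependency, the class below collects items and passes them to a sink in fixed-size batches. The names BatchBuffer, sink and pending are hypothetical and do not come from HBase or from the original code. The sketch also has an explicit flush() for the rows left over at the end of an import, a step the excerpt does not show (the original code may well handle it elsewhere).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not an HBase class): collects items and hands them
// to a sink in fixed-size batches, like the import loop above.
class BatchBuffer<T> {
    private final int maxBatchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    BatchBuffer(int maxBatchSize, Consumer<List<T>> sink) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        this.maxBatchSize = maxBatchSize;
        this.sink = sink;
    }

    // Buffer one item; send the whole buffer once it reaches the limit.
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Send whatever is buffered. Call once after the last add().
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // copy: the sink may keep the list
        buffer.clear();
    }

    // Number of items buffered but not yet sent.
    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        // The sink stands in for a call such as mainTableInterface_.batch(...).
        BatchBuffer<String> rows = new BatchBuffer<>(3,
                batch -> System.out.println("sending " + batch.size() + " rows"));
        for (int i = 0; i < 7; i++) {
            rows.add("row-" + i);
        }
        rows.flush(); // sends the last partial batch
    }
}
```

Keeping the size check and the clear() in one place makes it harder to forget the final partial batch when the import ends.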

1 answer:

Answer 0: (score: 0)

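I seem to have found the problem. It was in the coprocessor that creates the secondary index. This is the actual code that inserts into the secondaryIndexTable:
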
public void postBatchMutate(ObserverContext<RegionCoprocessorEnvironment> c, MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp) throws IOException {
    HTableInterface searchTableInterface = c.getEnvironment().getTable(tableName);
    try {
        searchTableInterface.batch(mutationsBuffer);
    } catch (InterruptedException e) {
        logger.error("Caught exception while executing batch on table " + currSearchTName, e);
    } finally {
        searchTableInterface.close();
    }
}

The problem seems to be using the environment's connection to do the inserts. I now create a connection at startup:

hConnection = HConnectionManager.createConnection(hConf);

and use it in postBatchMutate to get the table:

HTableInterface htable = hConnection.getTable(tableName);

It works now, but I still don't know why using the environment connection is wrong, or why the connection was always being closed.
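The structural difference between the two approaches in this answer can be sketched without HBase. The example below is only an illustration under assumptions: FakeConnection is a hypothetical stand-in for an expensive connection, and the open/close counters are invented for the demonstration. It does not reproduce HBase internals and does not establish why the environment's getTable() behaved this way on CDH 4.6; it only shows that opening and closing a connection per batch costs one setup and one teardown per call, while a connection created at startup is opened once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: compares "open a connection per batch" with
// "reuse one connection created at startup". No HBase classes are used.
class ConnectionReuseSketch {

    // Stand-in for an expensive connection; counts opens and closes.
    static final class FakeConnection implements AutoCloseable {
        static final AtomicInteger OPENED = new AtomicInteger();
        static final AtomicInteger CLOSED = new AtomicInteger();

        FakeConnection() {
            OPENED.incrementAndGet();
        }

        void write(String batch) {
            // A real client would send the batch here.
        }

        @Override
        public void close() {
            CLOSED.incrementAndGet();
        }

        static void reset() {
            OPENED.set(0);
            CLOSED.set(0);
        }
    }

    // Per-call pattern: every batch pays for a new connection.
    static void writePerCall(int batches) {
        for (int i = 0; i < batches; i++) {
            try (FakeConnection c = new FakeConnection()) {
                c.write("batch-" + i);
            }
        }
    }

    // Shared pattern: the caller owns one long-lived connection.
    static void writeShared(FakeConnection shared, int batches) {
        for (int i = 0; i < batches; i++) {
            shared.write("batch-" + i);
        }
    }

    public static void main(String[] args) {
        FakeConnection.reset();
        writePerCall(5);
        System.out.println("per call: opened=" + FakeConnection.OPENED.get());

        FakeConnection.reset();
        try (FakeConnection shared = new FakeConnection()) {
            writeShared(shared, 5);
        }
        System.out.println("shared:   opened=" + FakeConnection.OPENED.get());
    }
}
```

With a shared connection, the owner is also responsible for closing it exactly once, for example when the coprocessor stops.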