HBase coprocessor: checkAndPut causes "Timed out on getting lock for row"

Date: 2015-07-03 00:12:48

Tags: hbase

HBase version: 0.94.15-cdh4.7.0

My setup is quite simple:

  • a table ttt that holds the data
  • a table counters with a counter (an increment field)
  • a prePut coprocessor on the ttt table

When a row is inserted or updated in ttt, the coprocessor checks whether a value already exists in column d:k of that row. If there is no value, the coprocessor increments a counter in the counters table and assigns the result to the d:k column via the checkAndPut method.

The code is as follows:

@Override
public void prePut(final ObserverContext<RegionCoprocessorEnvironment> observerContext,
                   final Put put, final WALEdit edit, final boolean writeToWAL) throws IOException  {
    HTable tableCounters = null;
    HTable tableTarget = null;
    try {
        Get existingEdwGet = new Get(put.getRow());
        existingEdwGet.addColumn("d".getBytes(), "k".getBytes());
        tableTarget = new HTable(
                this.configuration,
                observerContext.getEnvironment().getRegion().getTableDesc().getName());

        if (!tableTarget.exists(existingEdwGet)) {
            // increment the counter
            tableCounters = new HTable(this.configuration, "counters");
            long newEdwKey = tableCounters.incrementColumnValue("static_row".getBytes(), "counters".getBytes(), "k".getBytes(), 1);

            Put keySetter = new Put(put.getRow());
            keySetter.add("d".getBytes(), "k".getBytes(), Bytes.toBytes(newEdwKey));
            tableTarget.checkAndPut(put.getRow(), "d".getBytes(), "k".getBytes(), null, keySetter);
        }
    } finally {
        releaseCloseable(tableTarget);
        releaseCloseable(tableCounters);
    }
}

Utility functions/variables:

  • releaseCloseable - a simple .close() wrapped in try/catch
  • this.configuration - the Hadoop configuration obtained during coprocessor startup
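The post does not show releaseCloseable itself; a minimal, null-safe sketch consistent with the description above (the helper name is taken from the code, the body is assumed) might be:

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseUtil {

    // Null-safe "close quietly" helper: swallows IOException so that
    // cleanup in a finally block cannot mask the original exception
    // thrown from the try body.
    public static void releaseCloseable(Closeable closeable) {
        if (closeable == null) {
            return;
        }
        try {
            closeable.close();
        } catch (IOException e) {
            // intentionally ignored; a real implementation would log at WARN
        }
    }
}
```

Swallowing the exception here is a deliberate choice: both HTable handles are closed in the same finally block, and a throwing close() would otherwise hide the failure that triggered the cleanup.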

When executing simple PUTs from the hbase shell:
for i in 0..10 do
    put 'ttt', "hrow-#{i}" , 'd:column', 'value'
end    

the region server reports a deadlock:

2015-07-02 23:58:30,297 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer (IPC Server handler 43 on 60020): 
java.io.IOException: Timed out on getting lock for row=hrow-1
    at org.apache.hadoop.hbase.regionserver.HRegion.internalObtainRowLock(HRegion.java:3588)
    at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:3678)
    at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:3662)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkAndMutate(HRegion.java:2723)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.checkAndMutate(HRegionServer.java:2307)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.checkAndPut(HRegionServer.java:2345)
    at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:354)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1434)

Questions:

  • Is checkAndPut allowed to be executed from within a prePut coprocessor?
  • What else can be done to guarantee that, in a concurrent environment where multiple workers write to the same ttt row, the d:k value is assigned only once?

1 answer:

Answer 0 (score: 0)

The actual problem was an infinite loop: the .put / .checkAndPut calls issued from inside the prePut coprocessor trigger the prePut coprocessor again.

To break the loop, I implemented the following approach:

  1. A marker is added to the put that the coprocessor creates.
  2. At the top of the coprocessor, check whether the marker is present.
  3. If it is, remove the marker and skip the coprocessor logic, since the put was initiated by this coprocessor itself. If it is not, this is a new request, not one previously started by this coprocessor; therefore, continue with the flow.
The updated coprocessor code:

    public static final byte[] DIM_FAMILY = "d".getBytes();
    public static final byte[] COLUMN_KEY = "k".getBytes();
    public static final byte[] COLUMN_MARKER = "marker".getBytes();
    public static final byte[] VALUE_MARKER = "+".getBytes();
    
    public static final TableName TABLE_COUNTERS = TableName.valueOf("counters");
    public static final byte[] COUNTER_FAMILY = "c".getBytes();
    public static final byte[] COUNTER_ROWKEY = "rowkey_counter".getBytes();
    public static final byte[] COUNTER_KEY = "key_counter".getBytes();
    
    
    public void prePut(final ObserverContext<RegionCoprocessorEnvironment> observerContext,
                       final Put put, final WALEdit edit, final Durability durability) throws IOException {
        if (put.has(DIM_FAMILY, COLUMN_MARKER)) {
            removeColumnMutations(put, COLUMN_MARKER);
            return;  // return from the coprocessor; otherwise an infinite loop will occur
        }
    
        HRegion region = observerContext.getEnvironment().getRegion();
        Table tableCounters = null;
        Connection connectionCounters = null;
        try {
            // check whether the key column for the row is empty
            Get existingEdwGet = new Get(put.getRow());
            existingEdwGet.addColumn(DIM_FAMILY, COLUMN_KEY);
            List<Cell> existingEdwCells = region.get(existingEdwGet, false);
    
            // check if key value is empty.
            // if so - assign one immediately
            if (existingEdwCells.isEmpty()) {
                // increment the key_counter
                connectionCounters = ConnectionFactory.createConnection(configuration);
                tableCounters = connectionCounters.getTable(TABLE_COUNTERS);
                long newEdwKey = tableCounters.incrementColumnValue(COUNTER_ROWKEY, COUNTER_FAMILY, COUNTER_KEY, 1);
    
                // form PUT with the new key value and a marker, showing that this insert should not be discarded
                Put keySetter = new Put(put.getRow());
                keySetter.addColumn(DIM_FAMILY, COLUMN_KEY, Bytes.toBytes(newEdwKey));
                keySetter.addColumn(DIM_FAMILY, COLUMN_MARKER, VALUE_MARKER);
    
                // consider checkAndPut return value, and increment Sequence Hole Number if needed
                boolean isNew = region.checkAndMutate(keySetter.getRow(), DIM_FAMILY, COLUMN_KEY,
                        CompareFilter.CompareOp.EQUAL, new BinaryComparator(null), keySetter, true);
            }
        } finally {
            releaseCloseable(tableCounters);
            releaseCloseable(connectionCounters);
        }
    }
    

    Notes:

    • The coprocessor above targets the HBase 1.0 SDK.
    • Instead of opening a connection to the underlying table, it uses the HBase Region instance available from the RegionCoprocessorEnvironment context.
    • The utility method removeColumnMutations could be omitted; its only purpose is to remove the marker from the PUT.
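removeColumnMutations is also not shown in the answer. A possible sketch against the HBase 1.0 client API (the method shape is assumed from the description; it needs hbase-client on the classpath, so it is offered only as a sketch, not as the author's implementation):

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Put;

public final class PutUtils {

    // Removes every cell with the given qualifier (in any column family)
    // from the Put, by editing the Put's backing family -> cells map in place.
    public static void removeColumnMutations(Put put, byte[] qualifier) {
        for (Map.Entry<byte[], List<Cell>> family : put.getFamilyCellMap().entrySet()) {
            Iterator<Cell> it = family.getValue().iterator();
            while (it.hasNext()) {
                if (CellUtil.matchingQualifier(it.next(), qualifier)) {
                    it.remove();
                }
            }
        }
    }
}
```

In the coprocessor above it would be called as removeColumnMutations(put, COLUMN_MARKER), so the marker column never reaches the table itself.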