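I have an HBase (v0.94.7) table with a single column family, and columns are added to it over time. The columns are named after the timestamp at which they were created, so unless I query the row I do not know all the columns it has.
Now, given a row, I want to atomically delete all existing columns of this column family and add a new set of columns and values.
So I thought of using HBase's RowMutations, like this: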
RowMutations mutations = new RowMutations(row);
//delete the column family
Delete delete = new Delete(row);
delete.deleteFamily(cf);
//add new columns
Put put = new Put(row);
put.add(cf, col1, v1);
put.add(cf, col2, v2);
//delete column family and add new columns to same family
mutations.add(delete);
mutations.add(put);
table.mutateRow(mutations);
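But all this code ends up doing is deleting the column family; it does not add the new columns. Is this behavior expected?
If so, how can I achieve my goal of atomically replacing all columns of a column family with a new set of columns?
Here is a test case for the same: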
import junit.framework.Assert;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableExistsException;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import java.util.NavigableMap;
public class TestHBaseRowMutations {
    static String tableName = "nnn";
    static byte[] cf1 = Bytes.toBytes("cf1");
    static byte[] row = Bytes.toBytes("r1");
    static HTablePool hTablePool;

    @BeforeClass
    public static void beforeClass() throws Exception {
        Configuration config = HBaseConfiguration.create();
        hTablePool = new HTablePool(config, Integer.MAX_VALUE);
        HBaseAdmin admin = new HBaseAdmin(config);
        HTableDescriptor tableDescriptor = new HTableDescriptor(tableName);
        tableDescriptor.addFamily(new HColumnDescriptor(cf1));
        try {
            admin.createTable(tableDescriptor);
        } catch (TableExistsException ignored) {}
    }

    @Before
    public void before() throws Exception {
        HTableInterface table = hTablePool.getTable(tableName);
        try {
            Delete delete = new Delete(row);
            table.delete(delete);
            System.out.println("deleted old row");
            Put put = new Put(row);
            put.add(cf1, Bytes.toBytes("c1"), Bytes.toBytes("v1"));
            put.add(cf1, Bytes.toBytes("c11"), Bytes.toBytes("v11"));
            table.put(put);
            System.out.println("Created row with seed data");
        } finally {
            table.close();
        }
    }

    @Test
    public void testColumnFamilyDeleteRM() throws Exception {
        HTableInterface table = hTablePool.getTable(tableName);
        try {
            RowMutations rm = new RowMutations(row);
            //delete column family cf1
            Delete delete = new Delete(row);
            delete.deleteFamily(cf1);
            rm.add(delete);
            System.out.println("Added delete of cf1 column family to row mutation");
            //add new columns to same column family cf1
            Put put = new Put(row);
            put.add(cf1, Bytes.toBytes("c1"), Bytes.toBytes("new_v1"));
            put.add(cf1, Bytes.toBytes("c11"), Bytes.toBytes("new_v11"));
            rm.add(put);
            System.out.println("Added puts of cf1 column family to row mutation");
            //atomic mutate the row
            table.mutateRow(rm);
            System.out.println("Mutated row");
            //now read the column family cf1 back
            Result result = table.get(new Get(row));
            NavigableMap<byte[], byte[]> familyMap = result.getFamilyMap(cf1);
            //column family cf1 should have 2 columns because of the Put above
            //------Following assert fails as cf1 does not exist anymore, why does cf1 not exist anymore?-------
            Assert.assertNotNull(familyMap);
            Assert.assertEquals(2, familyMap.size());
        } finally {
            table.close();
        }
    }
}
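Answer 0 (score: 5)
I posted the same question on the HBase user forum, and it turns out this is a bug in HBase.
The expected behavior is that if a RowMutations has a Delete for some column family/column/row followed by a Put to the same column family/column/row, the Put should also be honored (but currently it is not).
HBase user group discussion: http://apache-hbase.679495.n3.nabble.com/Using-RowMutations-to-replace-all-columns-of-a-row-td4045247.html
HBase JIRA for the same: https://issues.apache.org/jira/browse/HBASE-8626 (a patch is also provided there).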
Answer 1 (score: 2)
The closest you can get is to set the timestamp on the Put higher than the one on the Delete:
long now = System.currentTimeMillis();
Delete delete = new Delete(row);
delete.deleteFamily(cf1, now);
Put put = new Put(row);
put.add(cf1, col1, now + 1, v1); // timestamp goes before the value; v1 is the new value, as in the question's snippet
RowMutations mutations = new RowMutations(row);
mutations.add(delete);
mutations.add(put);
table.mutateRow(mutations);
Sadly, this does mean that a get at timestamp `now` will find nothing in that column family.
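To make the timestamp reasoning in this answer concrete, here is a toy model in plain Java. It is not HBase code: the class `FamilyTombstoneModel` and its methods are hypothetical and only mimic the documented rule that a family delete at timestamp T covers cells with timestamp less than or equal to T. Whether the Put and the Delete in the question's `RowMutations` really ended up with the same timestamp is an assumption here, not something this thread confirms.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not HBase code) of one column family in one row. */
public class FamilyTombstoneModel {
    // qualifier -> timestamp of the newest cell written for it (older versions are ignored)
    private final Map<String, Long> cellTimestamps = new TreeMap<>();
    // timestamp of the newest family delete marker; Long.MIN_VALUE means "no marker"
    private long familyDeleteTs = Long.MIN_VALUE;

    public void put(String qualifier, long ts) {
        cellTimestamps.merge(qualifier, ts, Math::max);
    }

    public void deleteFamily(long ts) {
        familyDeleteTs = Math.max(familyDeleteTs, ts);
    }

    /** Counts columns whose newest cell is newer than the family delete marker. */
    public int visibleColumns() {
        int visible = 0;
        for (long ts : cellTimestamps.values()) {
            if (ts > familyDeleteTs) {
                visible++;
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long now = 1000L;

        // Put at the same timestamp as the delete marker: the new cell is masked.
        FamilyTombstoneModel same = new FamilyTombstoneModel();
        same.put("c1", now - 10);
        same.deleteFamily(now);
        same.put("c1", now);
        System.out.println(same.visibleColumns()); // prints 0

        // Put at now + 1, as in the workaround: the new cell survives.
        FamilyTombstoneModel bumped = new FamilyTombstoneModel();
        bumped.put("c1", now - 10);
        bumped.deleteFamily(now);
        bumped.put("c1", now + 1);
        System.out.println(bumped.visibleColumns()); // prints 1
    }
}
```

In this model the only cell newer than the marker sits at `now + 1`, which is also why a read limited to timestamps up to `now` comes back empty.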
Answer 2 (score: 0)
One scenario to share: we tried to execute a list of RowMutations as an HBase batch operation, where each RowMutations could contain a valid Put for ROW1:CF1:Q1:V1 together with a Delete for ROW1:CF2:Q1:V1, and got the following error:
java.lang.RuntimeException: java.lang.UnsupportedOperationException: No RowMutations in multi calls; use mutateRow instead
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:218)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:748)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
To work around this, we chose to execute each RowMutations separately. Any suggestions are welcome.
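For the per-row workaround, a minimal sketch of the loop might look like the following. It assumes a hypothetical `RowApplier` interface standing in for a call such as `table.mutateRow(rm)`; neither `RowApplier` nor `SequentialRowMutations` is an HBase type, and the strings in `main` are placeholders for real RowMutations objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SequentialRowMutations {
    /** Hypothetical stand-in for a call such as table.mutateRow(rm). */
    public interface RowApplier<M> {
        void apply(M mutation) throws Exception;
    }

    /**
     * Applies each mutation on its own instead of as one batch.
     * Returns the mutations whose apply call threw, in input order.
     */
    public static <M> List<M> applyEach(List<M> mutations, RowApplier<M> applier) {
        List<M> failed = new ArrayList<>();
        for (M mutation : mutations) {
            try {
                applier.apply(mutation);
            } catch (Exception e) {
                // Keep going so one bad row does not block the rest;
                // a real caller would likely log e as well.
                failed.add(mutation);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Placeholder strings instead of real RowMutations objects.
        List<String> rows = Arrays.asList("ROW1", "ROW2", "ROW3");
        List<String> failed = applyEach(rows, row -> {
            if (row.equals("ROW2")) {
                throw new IllegalStateException("simulated failure");
            }
        });
        System.out.println("failed: " + failed); // prints failed: [ROW2]
    }
}
```

Continuing after a failure and returning the failed items is just one possible design; stopping at the first error would be equally reasonable. Either way, each call is atomic only within its own row, so the list as a whole is not applied atomically.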