I have a table "aks:myprofiles" in HBase.
It has two column families, i and s.
Column family i has 5 columns {ic1,ic2,ic3,ic4,ic5}
Column family s has 5 columns {sc1,sc2,sc3,sc4,sc5}
Describe "aks:myprofiles"
{NAME => 'i', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1', MIN_VERSIONS => '0', TTL => 'FOREVER',
KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 's', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1', MIN_VERSIONS => '0', TTL => 'FOREVER',
KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
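I want to copy all versions of columns ic1, ic2 and sc1, sc2 from this table into a new table.
I do not want every column; I want all versions of those specific columns.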
Answer 0 (score: 2)
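Use the CopyTable approach. If you need to customize which versions of specific columns are copied, the stock CopyTable MapReduce job cannot do that; you can instead write a custom MapReduce program by extending CopyTable. If you dig into the code, you will find several options.
See the printUsage method of CopyTable, which includes this example: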
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable
/*
 * @param errorMsg Error message. Can be null.
 */
private static void printUsage(final String errorMsg) {
  if (errorMsg != null && errorMsg.length() > 0) {
    System.err.println("ERROR: " + errorMsg);
  }
  System.err.println("Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] " +
      "[--new.name=NEW] [--peer.adr=ADR] <tablename>");
  System.err.println();
  System.err.println("Options:");
  System.err.println(" rs.class hbase.regionserver.class of the peer cluster");
  System.err.println(" specify if different from current cluster");
  System.err.println(" rs.impl hbase.regionserver.impl of the peer cluster");
  System.err.println(" startrow the start row");
  System.err.println(" stoprow the stop row");
  System.err.println(" starttime beginning of the time range (unixtime in millis)");
  System.err.println(" without endtime means from starttime to forever");
  System.err.println(" endtime end of the time range. Ignored if no starttime specified.");
  System.err.println(" versions number of cell versions to copy");
  System.err.println(" new.name new table's name");
  System.err.println(" peer.adr Address of the peer cluster given in the format");
  System.err.println(" hbase.zookeeper.quorum:hbase.zookeeper.client"
      + ".port:zookeeper.znode.parent");
  System.err.println(" families comma-separated list of families to copy");
  System.err.println(" To copy from cf1 to cf2, give sourceCfName:destCfName. ");
  System.err.println(" To keep the same name, just give \"cfName\"");
  System.err.println(" all.cells also copy delete markers and deleted cells");
  System.err.println(" bulkload Write input into HFiles and bulk load to the destination "
      + "table");
  System.err.println();
  System.err.println("Args:");
  System.err.println(" tablename Name of the table to copy");
  System.err.println();
  System.err.println("Examples:");
  System.err.println(" To copy 'TestTable' to a cluster that uses replication for a 1 hour window:");
  System.err.println(" $ hbase " +
      "org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 " +
      "--peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable ");
  System.err.println("For performance consider the following general option:\n"
      + " It is recommended that you set the following to >=100. A higher value uses more memory but\n"
      + " decreases the round trip time to the server and may increase performance.\n"
      + " -Dhbase.client.scanner.caching=100\n"
      + " The following should always be set to false, to prevent writing data twice, which may produce \n"
      + " inaccurate results.\n"
      + " -Dmapreduce.map.speculative=false");
}
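As a rough illustration of the selection logic a custom job like the one suggested above would need, here is a minimal sketch in plain Java. It deliberately does not use the HBase API: CellModel, selectColumns, and the sample rows are hypothetical stand-ins that only model "keep every version of the wanted family:qualifier pairs". In a real job the same restriction would more likely be expressed on the Scan itself (for example via Scan.addColumn plus a request for all versions), but how you wire that into a CopyTable subclass is an assumption on my part, not something shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model only. CellModel is a hypothetical stand-in,
// not org.apache.hadoop.hbase.Cell.
final class ColumnVersionFilter {

    static final class CellModel {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp; // one timestamp per stored version
        final String value;

        CellModel(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Keeps every version of each cell whose "family:qualifier" is in wanted,
    // and drops all cells from other columns. Input order is preserved.
    static List<CellModel> selectColumns(List<CellModel> cells, Set<String> wanted) {
        List<CellModel> kept = new ArrayList<>();
        for (CellModel c : cells) {
            if (wanted.contains(c.family + ":" + c.qualifier)) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The four columns named in the question.
        Set<String> wanted = new HashSet<>(Arrays.asList("i:ic1", "i:ic2", "s:sc1", "s:sc2"));
        // Made-up sample data: two versions of i:ic1, one unwanted column, one s:sc2 cell.
        List<CellModel> cells = Arrays.asList(
            new CellModel("row1", "i", "ic1", 2L, "new"),
            new CellModel("row1", "i", "ic1", 1L, "old"),
            new CellModel("row1", "i", "ic3", 1L, "skipped"),
            new CellModel("row1", "s", "sc2", 1L, "kept"));
        for (CellModel c : selectColumns(cells, wanted)) {
            System.out.println(c.row + " " + c.family + ":" + c.qualifier
                + " @" + c.timestamp + " = " + c.value);
        }
    }
}
```

The filter works on family:qualifier pairs rather than families alone, which is the part the families option in the usage text above does not cover.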
Answer 1 (score: 1)
We can copy all versions of specific columns (for example ic1, ic2, ic3) from table a to table b as follows:
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --versions=vers --families=ic1,ic2 --new.name=b a
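where vers is the maximum number of versions you need to copy. Note that, per the CopyTable usage text, --families takes a comma-separated list of column families, so for the table in the question the value would name the families i and s rather than qualifiers such as ic1.
For all other columns, the --versions option can be omitted, that is, you can run the following command: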
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=b a
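To make the option handling in this answer concrete, here is a small sketch that only assembles the argument list for the two commands above; it does not run CopyTable or touch a cluster. CopyTableArgs and build are hypothetical helper names, the flags are the ones listed in the printUsage text (versions, families, new.name), and the target table name and the version count 5 are placeholders I picked for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds CopyTable command-line arguments as strings only.
final class CopyTableArgs {

    // versions == null omits --versions; a null or empty families list omits --families.
    static List<String> build(String source, String target, List<String> families, Integer versions) {
        List<String> args = new ArrayList<>();
        if (versions != null) {
            args.add("--versions=" + versions);
        }
        if (families != null && !families.isEmpty()) {
            args.add("--families=" + String.join(",", families));
        }
        args.add("--new.name=" + target);
        args.add(source); // the source table name is the final positional argument
        return args;
    }

    public static void main(String[] args) {
        String prefix = "hbase org.apache.hadoop.hbase.mapreduce.CopyTable ";
        // With a version limit and an explicit family list (placeholder values).
        System.out.println(prefix + String.join(" ",
            build("aks:myprofiles", "aks:myprofiles_copy", Arrays.asList("i", "s"), 5)));
        // Without --versions and --families, as in the second command above.
        System.out.println(prefix + String.join(" ",
            build("aks:myprofiles", "aks:myprofiles_copy", null, null)));
    }
}
```

Keeping the flags in one place makes it easier to see that the only difference between the two commands is whether --versions and --families are present.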
Answer 2 (score: 0)
You can use CopyTable:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=new_table_name myprofiles