How to create an external table for HBase

Time: 2012-04-05 13:37:08

Tags: hbase hive

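I am able to create an external table in Hive over HBase. Now I need to create an external table with variable columns. That is, the columns in HBase are not fixed for a given table: neither the number of columns nor their names are known in advance, and they can be created dynamically when data is inserted. What is the right way to handle this case?

Summary: how do I create an external table in Hive when the HBase table does not have a fixed number of columns?

Thanks in advance.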

2 answers:

Answer 0: (score: 5)

  1. Create the table in the HBase shell

    create 'hbase_2_hive_names', 'id', 'name', 'age'

  2. Load the data into HBase (the input file must be on HDFS)

    export HADOOP_CLASSPATH=$(/usr/local/hbase/bin/hbase classpath);$HADOOP_HOME/bin/hadoop jar /usr/local/hbase/hbase-0.94.1.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,id:id,name:fn,name:ln,age:age hbase_2_hive_names /var/data/samples/names.tsv

  3. Create the external table in the Hive shell

    CREATE EXTERNAL TABLE hbase_hive_names(hbid INT, id INT, fn STRING, ln STRING, age INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:fn,name:ln,age:age") TBLPROPERTIES("hbase.table.name" = "hbase_2_hive_names");
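
To check that the mapping works, a simple query against the new table can help. This is only a minimal sketch: it assumes the table and column names defined in step 3, and the LIMIT value is arbitrary.

```sql
-- Read a few rows through the Hive-to-HBase mapping defined in step 3.
-- hbid comes from the HBase row key (:key); fn and ln come from name:fn and name:ln.
SELECT hbid, fn, ln, age
FROM hbase_hive_names
LIMIT 5;
```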

Answer 1: (score: 0)

Step 1: Log in to the HBase shell

hbase shell

Step 2: Create the HBase table

hbase(main):001:0> create 'hbase_emp_table', [{NAME => 'per', COMPRESSION => 'SNAPPY'}, {NAME => 'prof', COMPRESSION => 'SNAPPY'} ]
Created table hbase_emp_table
Took 1.5417 seconds
=> Hbase::Table - hbase_emp_table

Step 3: Describe the HBase table

hbase(main):002:0> describe 'hbase_emp_table'
Table hbase_emp_table is ENABLED
hbase_emp_table
COLUMN FAMILIES DESCRIPTION
{NAME => 'per', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'prof', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.1846 seconds

Step 4: Insert data into the HBase table

put 'hbase_emp_table','1','per:name','Ranga Reddy'
put 'hbase_emp_table','1','per:age','32'
put 'hbase_emp_table','1','prof:des','Senior Software Engineer'
put 'hbase_emp_table','1','prof:sal','50000'

put 'hbase_emp_table','2','per:name','Nishanth Reddy'
put 'hbase_emp_table','2','per:age','3'
put 'hbase_emp_table','2','prof:des','Software Engineer'
put 'hbase_emp_table','2','prof:sal','80000'

Step 5: Check the HBase table data

hbase(main):012:0> scan 'hbase_emp_table'
ROW                                             COLUMN+CELL
 1                                              column=per:age, timestamp=1606304606241, value=32
 1                                              column=per:name, timestamp=1606304606204, value=Ranga Reddy
 1                                              column=prof:des, timestamp=1606304606269, value=Senior Software Engineer
 1                                              column=prof:sal, timestamp=1606304606301, value=50000
 2                                              column=per:age, timestamp=1606304606362, value=3
 2                                              column=per:name, timestamp=1606304606338, value=Nishanth Reddy
 2                                              column=prof:des, timestamp=1606304606387, value=Software Engineer
 2                                              column=prof:sal, timestamp=1606304608374, value=80000
2 row(s)
Took 0.0513 seconds

Step 6: Log in to the Hive shell using hive or beeline

hive

Step 7: Create the Hive table

CREATE EXTERNAL TABLE IF NOT EXISTS hive_emp_table(id INT, name STRING, age SMALLINT, designation STRING, salary BIGINT) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,per:name,per:age,prof:des,prof:sal") 
TBLPROPERTIES("hbase.table.name" = "hbase_emp_table");

Step 8: Select data from the Hive table

hive> select * from hive_emp_table;
INFO  : OK
+--------------------+----------------------+---------------------+-----------------------------+------------------------+
| hive_emp_table.id  | hive_emp_table.name  | hive_emp_table.age  | hive_emp_table.designation  | hive_emp_table.salary  |
+--------------------+----------------------+---------------------+-----------------------------+------------------------+
| 1                  | Ranga Reddy          | 32                  | Senior Software Engineer    | 50000                  |
| 2                  | Nishanth Reddy       | 3                   | Software Engineer           | 80000                  |
+--------------------+----------------------+---------------------+-----------------------------+------------------------+
2 rows selected (17.401 seconds)
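
The mapping in step 7 lists every column qualifier explicitly, so it does not cover qualifiers that appear only at insert time. One possible approach for that case is to map a whole HBase column family to a Hive MAP column by leaving the qualifier empty in `hbase.columns.mapping` (for example `per:`). The following is a sketch under assumptions, not a tested setup: it reuses the `hbase_emp_table` from the steps above, and the Hive table name `hive_emp_table_dynamic` is hypothetical. The map key must be of type STRING because it holds the HBase column qualifier.

```sql
-- Hypothetical table name; assumes the HBase table hbase_emp_table created in step 2.
-- Each column family becomes one MAP column: key = qualifier, value = cell value.
CREATE EXTERNAL TABLE IF NOT EXISTS hive_emp_table_dynamic (
  id   INT,
  per  MAP<STRING, STRING>,
  prof MAP<STRING, STRING>
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,per:,prof:")
TBLPROPERTIES ("hbase.table.name" = "hbase_emp_table");

-- Look up individual qualifiers by name; a qualifier missing from a row yields NULL.
SELECT id, per['name'] AS name, prof['sal'] AS salary
FROM hive_emp_table_dynamic;
```

With this layout, a qualifier added later to `per` or `prof` should show up as a new key in the corresponding map without changing the Hive table definition. The values arrive as strings, so numeric fields such as `prof['sal']` would need an explicit CAST in queries.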