How do I map an HBase column with no qualifier in Hive?

Time: 2014-10-10 13:53:04

Tags: hive hbase

I want to map my HBase table to Hive. This is what I have so far:

CREATE EXTERNAL TABLE kutschke.bda01.twitter (
rowkey BIGINT,
userId BIGINT,
text STRING,
creationTime STRING,
isRetweet BOOLEAN,
retweetId BIGINT
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key, user:id, text:, time:createdAt, retweet:isRetweet, retweet:retweetId')
TBLPROPERTIES('hbase.table.name' = 'kutschke.bda01.twitter');

However, the 'text:' column is not mapped correctly because it has no qualifier. Instead, I get this exception:

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: 
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: 
hbase column family 'text' should be mapped to Map<? extends LazyPrimitive<?, ?>,?>, 
that is the Key for the map should be of primitive type, but is mapped to string)

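I think I understand the logic behind mapping an entire column family to a MAP, but is there a way to correctly map a column with an empty qualifier? If not, how do I map the column family to a MAP, and how do I then retrieve the one column I actually want?

1 answer:

Answer 0 (score: 0)

This can be done by declaring the Hive column as a native Hive map type, like so: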

CREATE TABLE hbase_table_1(value map<string,int>, row_key int) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = "cf:,:key"
);

When queried, a field mapped to an entire column family is displayed as a JSON-style string of qualifier/value pairs.
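Applied to the table in the question, this might look roughly like the sketch below. The Hive table name `twitter_hbase` is made up for illustration, and the column families and HBase table name are copied from the question. It assumes the cell written with an empty qualifier shows up in the map under the empty-string key, which is worth checking against your own data and Hive version.

```sql
-- Sketch only: the whole 'text' column family is mapped to a Hive MAP,
-- which is what the SerDe error in the question asks for.
-- twitter_hbase is a hypothetical Hive-side name.
CREATE EXTERNAL TABLE twitter_hbase (
  rowkey       BIGINT,
  userId       BIGINT,
  text         MAP<STRING,STRING>,  -- map key = HBase qualifier, map value = cell value
  creationTime STRING,
  isRetweet    BOOLEAN,
  retweetId    BIGINT
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- No spaces between entries: some Hive versions treat whitespace
  -- as part of the column name.
  'hbase.columns.mapping' = ':key,user:id,text:,time:createdAt,retweet:isRetweet,retweet:retweetId'
)
TBLPROPERTIES ('hbase.table.name' = 'kutschke.bda01.twitter');

-- Assumption: the value stored under the empty qualifier is reachable
-- through the empty-string map key.
SELECT rowkey, text[''] AS tweet_text
FROM twitter_hbase
LIMIT 10;
```

If `text['']` comes back NULL, running `SELECT text FROM twitter_hbase LIMIT 1;` shows the keys the map actually contains.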

More information here: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-HiveMAPtoHBaseColumnFamily