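I want to integrate Hive and HBase, and then query the data from Impala through the Hive metadata.
The HBase table has only one column family. Some columns in that family hold string values, some hold int values, and some hold complex types such as map and array. For the complex types, I build the maps and arrays in my business code, serialize them into byte arrays, and write those byte arrays to HBase.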
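To make the serialization step concrete, here is a minimal sketch of what "serialize into a byte array" could look like. It assumes Java's built-in object serialization, which is only one possibility; the post does not say which format the business code actually uses. The class name `ComplexValueBytes` and the method `toBytes` are hypothetical, and the HBase write itself is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of Hive or HBase.
final class ComplexValueBytes {

    // Turns a map or list (for example HashMap<String, Integer> or
    // ArrayList<String>) into the byte array stored in the HBase cell.
    // Assumption: Java's built-in serialization is the format in use.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) at this point.
        return buffer.toByteArray();
    }
}
```

The resulting bytes are in a Java-specific binary format, so whether Hive can read such a cell as `map<string,int>` or `array<string>` depends on the format its HBase SerDe expects for that column.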
I create an external Hive table that maps to the HBase table, as follows:
CREATE EXTERNAL TABLE external_facttable_testmap (
key string,
user_id string,
event_time string,
device_preference_value map<string,int>,
device_preference array<string>,
active_day_value map<string,int>,
active_day array<string>,
active_time_value map<string,int>,
active_time array<string>,
daily_order_num bigint,
daily_mall_order_num bigint,
daily_promoted_order_num bigint,
daily_gmv bigint,
daily_promoted_value bigint,
rating_record_value string,
average_rating_record array<string>,
payment_method_preference_value map<string,int>,
payment_method_preference array<string>,
logistics_method_preference_value map<string,int>,
logistics_method_preference array<string>,
category_preference_value map<string,int>,
category_preference array<string>,
subcategory_preference_value map<string,int>,
subcategory_preference array<string>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" =
":key,default:user_id,default:event_time,default:device_preference_value,default:device_preference,default:active_day_value,default:active_day,
default:active_time_value,default:active_time,default:daily_order_num,default:daily_mall_order_num,default:daily_promoted_order_num,default:daily_gmv,
default:daily_promoted_value,default:rating_record_value,default:average_rating_record,default:payment_method_preference_value,
default:payment_method_preference,default:logistics_method_preference_value,default:logistics_method_preference,default:category_preference_value,
default:category_preference,default:subcategory_preference_value,default:subcategory_preference"
)
TBLPROPERTIES("hbase.table.name" = "facttable");
I don't know much about complex types. Does the Hive and HBase integration support complex types such as map and array?
Thanks.