Question

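My understanding was that if the rowkey is passed in the WHERE clause, a Hive table created on top of an HBase table would skip MapReduce and rely entirely on the HBase index. However, my query runs as a MapReduce job and is therefore very slow.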

I created an HBase table as follows:

create '/app/SubscriptionBillingPlatform/Shishir','cf'

I created two external Hive tables over this HBase table, one for columns A, B, C and another for columns D, E, F:

[Screenshot of the Hive CREATE EXTERNAL TABLE statements]
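Since the DDL itself was only posted as an image, here is a rough sketch of what such table definitions typically look like with Hive's HBase storage handler. The table name `shishir1` comes from the query further down; the second table name `shishir2`, the `string` column types, and the exact column mapping are assumptions for illustration, not the statements that were actually run.

```sql
-- Sketch only: column types and the mapping are assumed, not taken from the original DDL.
-- ":key" maps the first Hive column to the HBase rowkey.
CREATE EXTERNAL TABLE shishir1 (key string, a string, b string, c string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a,cf:b,cf:c")
TBLPROPERTIES ("hbase.table.name" = "/app/SubscriptionBillingPlatform/Shishir");

-- Hypothetical second table (name assumed) over the same HBase table for columns D, E, F.
CREATE EXTERNAL TABLE shishir2 (key string, d string, e string, f string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:d,cf:e,cf:f")
TBLPROPERTIES ("hbase.table.name" = "/app/SubscriptionBillingPlatform/Shishir");
```

Both Hive tables share the same rowkey column, so a filter on `key` in either table targets the same HBase row.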

From the hbase shell, I put 3 values into the table, using 'abc' as the rowkey:

put '/app/SubscriptionBillingPlatform/Shishir','abc','cf:a','a'
put '/app/SubscriptionBillingPlatform/Shishir','abc','cf:b','b'
put '/app/SubscriptionBillingPlatform/Shishir','abc','cf:c','c'

Back in the hive shell, I then ran my query:

select * from shishir1 where key='abc';

[Screenshot of the query output]

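I expected this to be almost as fast as querying the data directly in HBase. However, the Hive-HBase integration uses MapReduce instead of the HBase index. Is there a way to make the Hive-HBase integration skip MapReduce and fully use the HBase index, or have I misunderstood what is possible?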
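One way to investigate this is to inspect the query plan and then try Hive's fetch-task conversion. The sketch below is a starting point, not a confirmed fix. `EXPLAIN` and the `hive.fetch.task.conversion` property are standard Hive features, but whether the setting takes effect for an HBase-backed table depends on the Hive version and distribution, so treat that part as an assumption to verify.

```sql
-- Show the plan for the rowkey lookup; a Map Reduce stage in the output
-- means Hive intends to launch a job rather than a simple fetch.
EXPLAIN
SELECT * FROM shishir1 WHERE key = 'abc';

-- Assumption: with "more", simple SELECT ... WHERE queries without
-- aggregation may run as a local fetch task instead of a MapReduce job.
SET hive.fetch.task.conversion=more;
SELECT * FROM shishir1 WHERE key = 'abc';
```

If the plan still shows a MapReduce stage after changing the setting, the next thing to check would be whether the rowkey predicate is being pushed down to the storage handler at all.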

Hive-HBase integration using MapReduce instead of the HBase index?

0 answers: