Apache Phoenix query takes a long time

Posted: 2019-10-28 18:13:19

Tags: sql phoenix

Hi, I am using Apache Phoenix to run SQL queries on HBase.

Table schema:

CREATE TABLE TABLE_1 (
      SF_ID VARCHAR NOT NULL,
      ENTITY_ID VARCHAR NOT NULL,
      PRODUCT_SKU VARCHAR,
      CITY_NAME VARCHAR,
      SCREEN_NAME VARCHAR, 
      PRODUCT_LIST_VIEWS BIGINT,
      PRODUCT_LIST_CLICKS BIGINT,
      PRODUCT_LIST_CTR FLOAT,
      TIMESTAMP BIGINT NOT NULL,
      POS INTEGER NOT NULL,
      CONSTRAINT pk PRIMARY KEY (SF_ID, ENTITY_ID, TIMESTAMP, POS));

I created a secondary index as follows:

CREATE INDEX GA_2 ON TABLE_1 (ENTITY_ID) INCLUDE (PRODUCT_LIST_VIEWS, PRODUCT_LIST_CLICKS, PRODUCT_LIST_CTR);

However, the following query takes about 1.5 s to 2 s when run over 500,000 rows.

select ENTITY_ID as "entityId", sum(PRODUCT_LIST_VIEWS) as "productViewSum", sum(PRODUCT_LIST_CLICKS) as "productClickSum", sum(PRODUCT_LIST_CTR) as "productCTRSum" from "TABLE_1" group by ENTITY_ID;

The explain plan is as follows:

CLIENT 1-CHUNK 0 ROWS 0 BYTES PARALLEL 1-WAY FULL SCAN OVER GA_2 
SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["ENTITY_ID"]

Is there any way to improve the response time of this query?


Update: Based on the explain plan, I created a salted table with 12 buckets.
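The update does not show the DDL for the salted table. As a rough sketch only, assuming the new table is the GA_TABLE_2 used in the later queries, that GA_3 from the plan is its index, and that both mirror the TABLE_1 and GA_2 definitions above, it could look like this. SALT_BUCKETS is the Phoenix table option that sets the number of salt buckets.

```sql
-- Assumed reconstruction: same columns and primary key as TABLE_1,
-- with 12 salt buckets as stated in the update.
CREATE TABLE GA_TABLE_2 (
      SF_ID VARCHAR NOT NULL,
      ENTITY_ID VARCHAR NOT NULL,
      PRODUCT_SKU VARCHAR,
      CITY_NAME VARCHAR,
      SCREEN_NAME VARCHAR,
      PRODUCT_LIST_VIEWS BIGINT,
      PRODUCT_LIST_CLICKS BIGINT,
      PRODUCT_LIST_CTR FLOAT,
      TIMESTAMP BIGINT NOT NULL,
      POS INTEGER NOT NULL,
      CONSTRAINT pk PRIMARY KEY (SF_ID, ENTITY_ID, TIMESTAMP, POS))
      SALT_BUCKETS = 12;

-- Assumed: the index named GA_3 in the plan, defined like GA_2 above.
CREATE INDEX GA_3 ON GA_TABLE_2 (ENTITY_ID) INCLUDE (PRODUCT_LIST_VIEWS, PRODUCT_LIST_CLICKS, PRODUCT_LIST_CTR);
```

The "12-CHUNK PARALLEL 12-WAY" scan over GA_3 in the plan is consistent with a 12-bucket layout, but the actual definitions may differ from this sketch.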

The explain plan is now:

+-------------------------------------------------------------------+----------+
|                               PLAN                                | EST_BYTE |
+-------------------------------------------------------------------+----------+
| CLIENT 12-CHUNK PARALLEL 12-WAY FULL SCAN OVER GA_3               | null     |
|     SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["ENTITY_ID"]  | null     |
| CLIENT MERGE SORT                                                 | null     |
+-------------------------------------------------------------------+----------+

However, the response time is still the same.

One more observation:

If I do not use SUM in the query, the response is much faster.

For example:

select ENTITY_ID, SUM(PRODUCT_LIST_VIEWS) from GA_TABLE_2 where SF_ID = '1' group by ENTITY_ID;

This query takes 631 ms,

but

select ENTITY_ID from GA_TABLE_2 where SF_ID = '1' group by ENTITY_ID;

takes only 30 ms.
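One way to investigate the gap between these two statements would be to compare their plans with EXPLAIN. This is a diagnostic sketch, not a measured result. It assumes GA_TABLE_2 has the same columns and primary key as TABLE_1, in which case the second statement touches only primary key columns (SF_ID, ENTITY_ID) while the first must also read PRODUCT_LIST_VIEWS.

```sql
-- Plan for the aggregate version (reported above at 631 ms).
EXPLAIN
SELECT ENTITY_ID, SUM(PRODUCT_LIST_VIEWS)
FROM GA_TABLE_2
WHERE SF_ID = '1'
GROUP BY ENTITY_ID;

-- Plan for the version without SUM (reported above at 30 ms).
EXPLAIN
SELECT ENTITY_ID
FROM GA_TABLE_2
WHERE SF_ID = '1'
GROUP BY ENTITY_ID;
```

Points worth comparing in the two outputs are whether each one is a RANGE SCAN or a FULL SCAN, and whether it reads the data table or the index.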

0 answers:

No answers yet.