Vertica: Improving the performance of a query that joins to an external table

Date: 2018-01-12 22:07:36

Tags: sql database performance parquet vertica

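I am running a query against a Vertica cluster that takes all night to finish, and I would like to know what I can do to improve its performance. The query joins a small table (21k rows) holding a list of IDs to a large table (> 14 billion rows) on the matching field (pat_id). I want to return every row of the large table whose ID appears in my small reference table. The large table is stored as external Parquet files. Any suggestions would be greatly appreciated.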

The EXPLAIN output is as follows:

explain select ims.* from juv_arthritis_pts juv join dwdev1_data.IMS_claims ims on juv.pat_id = ims.pat_id

Access Path:
+-JOIN HASH [Cost: 2K, Rows: 21K (NO STATISTICS)] (PATH ID: 1) Outer (RESEGMENT)(LOCAL ROUND ROBIN)
|  Join Cond: (juv.pat_id = ims.pat_id)
|  Execute on: v_dwp1_node0001, v_dwp1_node0002, v_dwp1_node0003, v_dwp1_node0005
| +-- Outer -> LOAD  EXTERNAL TABLE [Cost: 0, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| |      Table: IMS_claims
| |      copy from '/mapr/mapr.XXX.local/Environments/svc.dwdev1/data/ims_claims.final/*/*' PARQUET
| |      Execute on: Query Initiator
| +-- Inner -> STORAGE ACCESS for juv [Cost: 33, Rows: 21K] (PATH ID: 3)
| |      Projection: X.juv_arthritis_pts_b0
| |      Materialize: X.pat_id
| |      Execute on: v_dwp1_node0001, v_dwp1_node0002, v_dwp1_node0003
| +---> STORAGE ACCESS for juv (REPLACEMENT FOR DOWN NODE) [Cost: 49, Rows: 21K]
| |      Projection: X.juv_arthritis_pts_b1
| |      Materialize: juv.pat_id
| |      Execute on: v_dwp1_node0001, v_dwp1_node0005

0 answers:

No answers yet