Snowflake join performance improvement

Time: 2021-05-11 09:03:27

Tags: join snowflake-cloud-data-platform sql-view

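I need to create a view on top of a table that has more than 1,300 columns. New data is loaded into the table every quarter (millions of rows). When creating the view, I need to join other tables to the base table. I also need to add a latest-row indicator to the view.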

CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,base_tbl.col1334,1 as Is_Latest_Quarter FROM base_tbl full outer JOIN lkp_tbl
on base_tbl.CUST_ID = lkp_tbl.CUST_ID 
where snapshot_dt=(select max(snapshot_dt) from base_tbl)


union all


SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,base_tbl.col1334,0 as Is_Latest_Quarter FROM base_tbl full outer JOIN lkp_tbl
on base_tbl.CUST_ID = lkp_tbl.CUST_ID 
where snapshot_dt!=(select max(snapshot_dt) from base_tbl);

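After creating this view, queries against it are far too slow, even when we select only 100 rows. Is there a more efficient way to create the view? If not, how can I improve performance?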

1 answer:

Answer 0: (score: 1)

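Use a single SELECT statement and compute Is_Latest_Quarter with a CASE expression, so that base_tbl and lkp_tbl are joined once instead of twice.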

Update: the (almost) actual SQL

CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT {list of columns you want to include}
,CASE WHEN snapshot_dt=(select max(snapshot_dt) from base_tbl) THEN 1 
 ELSE 0 END as Is_Latest_Quarter
FROM base_tbl 
full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID 

Or, if Snowflake doesn't like that inline subquery, you can use a CTE, something like:

CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
    -- Single-row CTE holding the most recent snapshot date
    WITH MAX_DATE AS (SELECT MAX(snapshot_dt) AS max_snapshot_dt FROM base_tbl)
    SELECT {list of columns you want to include}
    -- The LEFT JOIN below matches only rows from the latest snapshot
    ,CASE WHEN max_date.max_snapshot_dt is not null  THEN 1 
     ELSE 0 END as Is_Latest_Quarter
    FROM base_tbl 
    full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID
    LEFT OUTER JOIN MAX_DATE ON base_tbl.snapshot_dt = max_date.max_snapshot_dt
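
A third variant, offered only as a sketch and not as part of the original answer: the same flag can be computed with a window aggregate, which avoids both the inline subquery and the extra join. It assumes snapshot_dt is a column of base_tbl and reuses the placeholder column list from above. Whether it is actually faster on your data is something to confirm in the Snowflake query profile.

```sql
-- Sketch under the assumptions stated above; table and column names
-- are the ones used in the question, the column list is a placeholder.
CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT {list of columns you want to include}
     -- MAX(...) OVER () computes the latest snapshot date across all joined rows
     , CASE WHEN base_tbl.snapshot_dt = MAX(base_tbl.snapshot_dt) OVER () THEN 1
            ELSE 0 END AS Is_Latest_Quarter
FROM base_tbl
FULL OUTER JOIN lkp_tbl ON base_tbl.CUST_ID = lkp_tbl.CUST_ID;
```

As with the CASE versions above, rows that exist only in lkp_tbl get Is_Latest_Quarter = 0 here. The original UNION ALL view dropped those rows, because a NULL snapshot_dt fails both the = and the != filter. If that filtering was intentional, add WHERE base_tbl.snapshot_dt IS NOT NULL.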