Forcing Redshift to evaluate a specific predicate first

Time: 2018-05-01 04:23:17

Tags: sql amazon-redshift

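I have a small application that executes a series of SQL scripts against a Redshift database on a daily schedule and populates tables with aggregated data for the customer to extract. The scripts are kept in text files so they can be updated easily; the SQL is read from the file and '|businessday|' is replaced with the required date, e.g. '20180501'. There is no robust logic that derives the date from the current calendar date.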

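The customer's requirements have changed, and there are now two scripts that should populate their tables only on the last day of the month. I can update the scripts so that the predicate reads: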

WHERE (SELECT businessday FROM bd) = LAST_DAY((SELECT businessday FROM bd))

where bd is a CTE, so that I can cast the date string to a DATE.

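While this correctly returns no records, it takes only slightly less time to execute than it would if I ran it for the whole month: it takes over a minute to return 0 rows. I would like it to recognise quickly that this predicate fails and return no rows almost immediately.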

Is there a way to restructure the SQL so that this predicate is evaluated first?

My understanding is that you cannot use procedural IF statements in Redshift, so I am limited to adding predicates to the SQL string.

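I tried adding a second CTE that returns no rows when it is not the last day of the month, and using it as a predicate on the businessday column of the key table: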

WITH bd as (SELECT CAST('20180425' as date) as businessday WHERE 
    (SELECT CAST('20180425' as date)) = LAST_DAY(( CAST('20180425' as date))))
...
WHERE ts.businessday in (select businessday from bd)

(This would need modifying to give me what I need, but the principle does not seem to work.)

Simplified SQL string (several tables and columns removed):

with cte as (select storeid from ttl_store_processed where
        businessday = '20180425'),
    bd as (SELECT CAST('20180425' as date) as businessday 
        WHERE (SELECT CAST('20180425' as date)) = LAST_DAY(( CAST('20180425' as date))))
SELECT store.storenumber AS COST_CENTER,
     TO_CHAR(DATE(tii.BusinessDay), 'YYYYMM') AS YEAR_MONTH,
     ii.ItemCode AS MATERIAL_NUMBER,
     SUM(tii.Quantity) AS UNITS
FROM cte s
    inner join transactionsale ts 
        on s.storeid = ts.storeid
    inner join Store store 
        on ts.storeid = store.storeid
    inner join transactionsaleitem tsi 
        on ts.transactionsaleid = tsi.transactionsaleid
    inner join transactioninventoryitem tii 
        on tsi.transactionsaleitemid = tii.transactionsaleitemid
    inner join inventoryitem ii 
        on tii.inventoryitemid = ii.inventoryitemid
WHERE (SELECT businessday FROM bd) = LAST_DAY((SELECT businessday FROM bd))
    AND ts.storeid IN (SELECT storeid FROM cte)
    AND ts.businessday BETWEEN DATE_TRUNC('MONTH', (SELECT businessday FROM bd)) 
        AND LAST_DAY((SELECT businessday FROM bd))
GROUP BY 
     store.storenumber,
     TO_CHAR(DATE(tii.BusinessDay), 'YYYYMM'),
     ii.ItemCode;

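The cte currently returns ~20 stores, but this will increase to possibly 180+. I have tried applying the logic there so that this table is empty: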

with cte as (select storeid from mcdonaldshk.ttl_store_processed 
    where businessday = '20180425' and (SELECT CAST('20180425' as date)) 
        = LAST_DAY(( CAST('20180425' as date))))

This does not seem to work either.

1 Answer:

Answer 0: (score: 1)

So, you are basically saying that when (SELECT businessday FROM bd) = LAST_DAY((SELECT businessday FROM bd)) is false, you want the query to run really quickly by having that condition evaluated first?

You could try joining the query to a subquery:

JOIN (SELECT 'end of month'
      FROM bd
      WHERE businessday = LAST_DAY(businessday)
      ) lastday ON (true)

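This way, if it is not the last day of the month, the subquery returns zero rows, so there is nothing to join to. If the planner evaluates this join first, the rest of the query should not need to execute, because there are no rows to join.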
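For illustration, here is a sketch of how the simplified query from the question might look with that subquery joined in. The alias flag and the placement of the join directly after cte are my own choices, not something Redshift requires, and whether the planner really evaluates this join first is not guaranteed; compare the EXPLAIN output before and after to check.

```sql
-- Sketch only: same tables and columns as the simplified query in the question.
-- The lastday subquery yields one row on the last day of the month, else none.
WITH cte AS (
    SELECT storeid
    FROM ttl_store_processed
    WHERE businessday = '20180425'
),
bd AS (
    SELECT '20180425'::date AS businessday
)
SELECT store.storenumber AS COST_CENTER,
       TO_CHAR(DATE(tii.BusinessDay), 'YYYYMM') AS YEAR_MONTH,
       ii.ItemCode AS MATERIAL_NUMBER,
       SUM(tii.Quantity) AS UNITS
FROM cte s
    -- Gating join: no row here means the inner joins below produce nothing.
    INNER JOIN (SELECT 'end of month' AS flag   -- "flag" is an illustrative alias
                FROM bd
                WHERE businessday = LAST_DAY(businessday)
               ) lastday ON (true)
    INNER JOIN transactionsale ts
        ON s.storeid = ts.storeid
    INNER JOIN Store store
        ON ts.storeid = store.storeid
    INNER JOIN transactionsaleitem tsi
        ON ts.transactionsaleid = tsi.transactionsaleid
    INNER JOIN transactioninventoryitem tii
        ON tsi.transactionsaleitemid = tii.transactionsaleitemid
    INNER JOIN inventoryitem ii
        ON tii.inventoryitemid = ii.inventoryitemid
WHERE ts.businessday BETWEEN DATE_TRUNC('MONTH', (SELECT businessday FROM bd))
        AND LAST_DAY((SELECT businessday FROM bd))
GROUP BY
    store.storenumber,
    TO_CHAR(DATE(tii.BusinessDay), 'YYYYMM'),
    ii.ItemCode;
```

The original WHERE clause on the month-end condition is dropped here because the join now carries it; the redundant ts.storeid IN (SELECT storeid FROM cte) filter is also omitted since the inner join on cte already restricts the stores.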

By the way, you can also simplify some of the code:

WHERE (SELECT CAST('20180425' as date)) = LAST_DAY(( CAST('20180425' as date)))

can simply be written as:

WHERE ('20180425'::date) = LAST_DAY('20180425'::date)
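As a quick sanity check of the simplified form, the comparison can be run on its own. This is only a sketch; the column alias is_month_end is made up for the example. If the calling application can branch on the result, a tiny probe like this could also let it skip the heavy script entirely, though that is an assumption about what the application is able to do.

```sql
-- Standalone probe: 1 when the substituted date is the last day of its month.
-- "is_month_end" is an illustrative alias, not part of the original scripts.
SELECT CASE
           WHEN '20180425'::date = LAST_DAY('20180425'::date) THEN 1
           ELSE 0
       END AS is_month_end;
-- For '20180425' this returns 0; for '20180430' it would return 1.
```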

Also, if you add bd to the JOIN, you can simplify

ts.businessday BETWEEN 
    DATE_TRUNC('MONTH', (SELECT businessday FROM bd)) 
    AND LAST_DAY((SELECT businessday FROM bd))

into:

ts.businessday BETWEEN 
    DATE_TRUNC('MONTH', bd.businessday) 
    AND LAST_DAY(bd.businessday)
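To make the shape of that join concrete, here is a reduced sketch against transactionsale only; the other tables from the question are left out for brevity, and the sales_in_month alias is invented for the example. Because bd has a single row, joining it with ON (true) just makes its businessday column available without scalar subqueries. Qualifying the column as bd.businessday avoids ambiguity with ts.businessday.

```sql
-- Sketch: bd joined in as a one-row table instead of repeated scalar subqueries.
WITH bd AS (
    SELECT '20180425'::date AS businessday
)
SELECT COUNT(*) AS sales_in_month   -- illustrative aggregate, not from the question
FROM transactionsale ts
    INNER JOIN bd ON (true)
WHERE ts.businessday BETWEEN DATE_TRUNC('MONTH', bd.businessday)
        AND LAST_DAY(bd.businessday);
```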