Teradata SQL: compute a running total only if a condition is met

Time: 2019-12-17 00:06:10

Tags: sql teradata

I have a dataset with the following columns and data:

Customer | Week_number | Amount
cust1    |  0          | 100
cust1    |  1          | 200
cust1    |  3          | 300
cust2    |  0          | 1000
cust2    |  1          | 2000

I need to compute a two-week (fortnightly) total for each customer.

With a window function I can do this:

SELECT 
 CUSTOMER, WEEK_NUMBER
, SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS 1 PRECEDING) AS FORTNIGHT_AMOUNT
FROM AMOUNT

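However, this adds the previous row even when there is no amount for the previous week. In the example above, for cust1, row 3, it adds up week 3 and week 1. The previous amount should be added only when its week_number is exactly 1 less than the current row's week. Is this possible? Thanks for your help.

What I get: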

Customer | Week_number | Fortnight_Amount
cust1    |  0          | 100
cust1    |  1          | 300
cust1    |  3          | **500**
cust2    |  0          | 1000
cust2    |  1          | 3000

Expected result:

Customer | Week_number | Fortnight_Amount
cust1    |  0          | 100
cust1    |  1          | 300
cust1    |  3          | **300**
cust2    |  0          | 1000
cust2    |  1          | 3000
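
To pin down the rule independently of any SQL dialect, here is a minimal Python sketch of the expected behaviour. The function name `fortnight_amounts` is hypothetical, and the sketch assumes at most one row per customer and week.

```python
# Sample rows from the question: (customer, week_number, amount).
rows = [
    ("cust1", 0, 100),
    ("cust1", 1, 200),
    ("cust1", 3, 300),
    ("cust2", 0, 1000),
    ("cust2", 1, 2000),
]


def fortnight_amounts(rows):
    """Add the previous week's amount only when that exact week exists."""
    # Look up amounts by (customer, week_number); assumes one row per pair.
    by_key = {(cust, week): amount for cust, week, amount in rows}
    result = []
    for cust, week, amount in sorted(rows):
        previous = by_key.get((cust, week - 1), 0)  # 0 when the week is missing
        result.append((cust, week, amount + previous))
    return result


for row in fortnight_amounts(rows):
    print(row)
```

For cust1, week 3 has no week 2 before it, so its total stays at 300 rather than 500.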

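4 answers:

Answer 0 (score: 1)

If it is always just two weeks/rows, your query can be simplified further to a single STATS step in Explain (because both OLAP functions use the same PARTITION / ORDER):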

SELECT T.*
, CASE 
    WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1 = WEEK_NUMBER
    THEN SUM(AMOUNT)      OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
   ELSE AMOUNT
  END AS TWO_WEEK_SUM_AMOUNT
FROM MY_TABLE T
ORDER BY CUSTOMER, WEEK_NUMBER

Of course, this assumes that weeks start at 0 and that there is no week 52/53 from the previous year.
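
As a rough check of the query above on the sample data, the sketch below runs it in an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for Teradata here; the sketch assumes a SQLite build with window-function support (3.25 or later) and says nothing about Teradata's Explain plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (customer TEXT, week_number INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100),
        ("cust1", 1, 200),
        ("cust1", 3, 300),
        ("cust2", 0, 1000),
        ("cust2", 1, 2000),
    ],
)

# MAX over a frame holding only the previous row acts as a LAG substitute;
# on the first row of a partition the frame is empty and MAX yields NULL,
# so the CASE falls through to ELSE.
query = """
SELECT customer, week_number,
       CASE
         WHEN MAX(week_number) OVER (PARTITION BY customer ORDER BY week_number
                                     ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1 = week_number
         THEN SUM(amount) OVER (PARTITION BY customer ORDER BY week_number
                                ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
         ELSE amount
       END AS two_week_sum_amount
FROM my_table
ORDER BY customer, week_number
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```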

Answer 1 (score: 0)

If you just want to ignore weeks that are not immediately consecutive, you can first use lag() and then a window sum():

select
    customer,
    week_number,
    sum(
        case when lag_week_number is null or week_number = lag_week_number + 1 
            then amount
            else 0 
        end
    ) over(partition by customer order by week_number) fortnight_amount
from (
    select 
        t.*, 
        lag(week_number) over(partition by customer order by week_number) lag_week_number
    from mytable t
) t

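In practice, you may want to reset the sum whenever there is a gap in week_number. That is a gaps-and-islands problem and is handled differently: the idea is to use a cumulative sum that starts a new group whenever two adjacent rows are not consecutive weeks, and then sum within each group: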

select 
    customer,
    week_number,
    sum(amount) over(partition by customer, grp order by week_number) fortnight_amount
from (
    select 
        t.*,
        sum(
            case 
                when lag_week_number is null or week_number = lag_week_number + 1 
                then 0
                else 1
            end
        ) over(partition by customer order by week_number) grp
    from (
        select 
            t.*, 
            lag(week_number) over(partition by customer order by week_number) lag_week_number
        from mytable t
    ) t
) t
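
The gaps-and-islands idea can be tried on the sample data with the sketch below, which runs a version of the query in SQLite via Python's `sqlite3`. It assumes SQLite 3.25 or later for `LAG` and window sums; SQLite stands in for whichever engine supports `lag()`. The `cust3` rows are hypothetical additions, included only to show that the sum keeps growing inside an island of three consecutive weeks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (customer TEXT, week_number INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100),
        ("cust1", 1, 200),
        ("cust1", 3, 300),
        ("cust2", 0, 1000),
        ("cust2", 1, 2000),
        # Hypothetical rows, not in the question: three consecutive weeks.
        ("cust3", 0, 1),
        ("cust3", 1, 2),
        ("cust3", 2, 3),
    ],
)

query = """
SELECT customer, week_number,
       -- running total that restarts in every island (grp)
       SUM(amount) OVER (PARTITION BY customer, grp ORDER BY week_number) AS island_amount
FROM (
    SELECT t.*,
           -- grp increases by 1 each time the previous row is not the previous week
           SUM(CASE
                 WHEN lag_week_number IS NULL OR week_number = lag_week_number + 1
                 THEN 0 ELSE 1
               END) OVER (PARTITION BY customer ORDER BY week_number) AS grp
    FROM (
        SELECT t.*,
               LAG(week_number) OVER (PARTITION BY customer ORDER BY week_number) AS lag_week_number
        FROM mytable t
    ) t
) t
ORDER BY customer, week_number
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the question's data this matches the expected result, but within an island it is a running total, not a two-week sum: the hypothetical cust3 reaches 6 in week 2 where a two-week sum would give 5.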

Answer 2 (score: 0)

You want a RANGE window frame, not a ROWS frame:

SELECT CUSTOMER, WEEK_NUMBER,
       SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
                         ORDER BY WEEK_NUMBER 
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
                        ) AS FORTNIGHT_AMOUNT
FROM AMOUNT;
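
To see what the RANGE frame does on the sample data, the sketch below runs the same query in SQLite through Python's `sqlite3`. This assumes SQLite 3.28 or later, which accepts `RANGE` frames with a numeric offset; it does not show whether a given Teradata release accepts the syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table is named AMOUNT, as in the question, and has a column of the same name.
conn.execute(
    "CREATE TABLE amount (customer TEXT, week_number INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO amount VALUES (?, ?, ?)",
    [
        ("cust1", 0, 100),
        ("cust1", 1, 200),
        ("cust1", 3, 300),
        ("cust2", 0, 1000),
        ("cust2", 1, 2000),
    ],
)

# RANGE compares week_number values, so the frame covers rows whose
# week_number lies in [current - 1, current], regardless of row position.
query = """
SELECT customer, week_number,
       SUM(amount) OVER (PARTITION BY customer
                         ORDER BY week_number
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS fortnight_amount
FROM amount
ORDER BY customer, week_number
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```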

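Answer 3 (score: 0)

Thanks to @Gordon and @GMB for the answers. Unfortunately, I could not use either the LAG function or the RANGE window frame in Teradata SQL. However, I was able to use the concepts you both described to arrive at the following answer.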

SELECT 
CUSTOMER
, WEEK_NUMBER
, LAG_WEEK_NUMBER
, AMOUNT
, CASE 
  WHEN WEEK_NUMBER = LAG_WEEK_NUMBER + 1 
  THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
  ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM (
  SELECT 
  T.*
  , MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
  FROM MY_TABLE T
  ) T
ORDER BY CUSTOMER, WEEK_NUMBER

I took the LAG function implementation for Teradata from @dnoeth's answers in the links below:

MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER

rows between 1 preceding and preceding 1

Teradata partitioned query ... following rows dynamically

If you find any problems with this answer or see any way to improve it, please let me know.