How to run the opposite of a SUM function in Redshift

Asked: 2017-01-19 01:55:22

Tags: sql amazon-redshift

Suppose I have a data table:

date        |   user_id  | user_last_name | order_id  | is_new_session
------------+------------+----------------+-----------+---------------
 2014-09-01 | A          | B              | 1         | t
 2014-09-01 | A          | B              | 5         | f
 2014-09-02 | A          | B              | 8         | t
 2014-09-01 | B          | B              | 2         | t
 2014-09-02 | B          | test           | 3         | t
 2014-09-03 | B          | test           | 4         | t
 2014-09-04 | B          | test           | 6         | t
 2014-09-04 | B          | test           | 7         | f
 2014-09-05 | B          | test           | 9         | t
 2014-09-05 | B          | test           | 10        | f

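I want to get another column in Redshift that assigns a session number to each user session. It starts at 1 on each user's first record and, as you move down the rows, it increments whenever it encounters a true value in the "is_new_session" column. If it encounters a false value, it stays the same. When it reaches a new user, the value resets to 1. The ideal output for this table would be: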

1
1
2
1
2
3
4
4
5
5
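
To make the rule above concrete, here is a minimal Python sketch of it, applied outside the database to the sample rows. The helper name `number_sessions` and the tuple layout are made up for illustration, and the sketch assumes the rows are already in the order shown in the table.

```python
# Sample rows from the table: (date, user_id, order_id, is_new_session).
# Assumption: the rows are already in the order shown in the question.
ROWS = [
    ("2014-09-01", "A", 1, True),
    ("2014-09-01", "A", 5, False),
    ("2014-09-02", "A", 8, True),
    ("2014-09-01", "B", 2, True),
    ("2014-09-02", "B", 3, True),
    ("2014-09-03", "B", 4, True),
    ("2014-09-04", "B", 6, True),
    ("2014-09-04", "B", 7, False),
    ("2014-09-05", "B", 9, True),
    ("2014-09-05", "B", 10, False),
]


def number_sessions(rows):
    """Hypothetical helper: walk the rows top to bottom and number the sessions."""
    result = []
    current_user = None
    counter = 0
    for _date, user_id, _order_id, is_new_session in rows:
        if user_id != current_user:
            # A new user starts: reset the counter.
            current_user = user_id
            counter = 0
        if is_new_session:
            # A true flag opens a new session for this user.
            counter += 1
        result.append(counter)
    return result


print(number_sessions(ROWS))
```

For the sample rows this prints the ten values listed above, in the same order.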

In my mind, it is the opposite of SUM(1) over (Partition BY user_id, is_new_session ORDER BY user_id, date ASC)

Any ideas?

Thanks!

1 answer:

Answer 0: (score: 2)

I think you want a cumulative sum:

select t.*,
       sum(case when is_new_session then 1 else 0 end) over (partition by user_id order by date) as session_number
from t;

In Redshift, you might need the explicit window frame clause:

select t.*,
       sum(case when is_new_session then 1 else 0 end) over
           (partition by user_id
            order by date
            rows between unbounded preceding and current row
           ) as session_number
from t;
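
One thing to watch for: the sample data has several rows with the same user and date (for example, orders 1 and 5), so ordering by date alone leaves the order of those rows undefined. With a ROWS frame, the result can then vary. The Python sketch below mimics the partitioned running sum outside the database and adds order_id as a tie-breaker. That tie-breaker is an assumption (it presumes order_id increases over time within a user), and the helper name `session_numbers` is made up for illustration.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Sample rows, deliberately shuffled: (date, user_id, order_id, is_new_session).
ROWS = [
    ("2014-09-05", "B", 10, False),
    ("2014-09-01", "A", 5, False),
    ("2014-09-04", "B", 6, True),
    ("2014-09-01", "A", 1, True),
    ("2014-09-02", "B", 3, True),
    ("2014-09-02", "A", 8, True),
    ("2014-09-01", "B", 2, True),
    ("2014-09-05", "B", 9, True),
    ("2014-09-03", "B", 4, True),
    ("2014-09-04", "B", 7, False),
]


def session_numbers(rows):
    """Hypothetical helper: running sum of the flag per user, like the window query."""
    # Equivalent of "partition by user_id order by date, order_id".
    # order_id is an assumed tie-breaker; it is not part of the original query.
    ordered = sorted(rows, key=itemgetter(1, 0, 2))
    out = []
    for _user, group in groupby(ordered, key=itemgetter(1)):
        group = list(group)
        # Equivalent of "rows between unbounded preceding and current row".
        running = accumulate(1 if row[3] else 0 for row in group)
        out.extend((row[1], row[2], n) for row, n in zip(group, running))
    return out


for user_id, order_id, session_number in session_numbers(ROWS):
    print(user_id, order_id, session_number)
```

In SQL terms, the corresponding change would be `order by date, order_id` inside the window definition, if that assumption about order_id holds for your data.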