Combining lag/lead across rows?

Time: 2016-04-25 17:56:14

Tags: sql postgresql

Hey guys, suppose I have a data frame:

    Year   Month   1_month_sub   3_month_sub   12_month_sub
    2014     1         3             1              1
    2014     2         1             0              0
    2014     3         1             0              0
    2014     4         1             0              0
    2014     5         4             0              0
    2014     6         1             0              0
    2014     7         5             0              0
    2014     8         1             0              0
    2014     9         1             0              0
    2014     10        6             0              0
    2014     11        1             0              0
    2014     12        3             0              0

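Here 1_month_sub is the number of 1-month subscriptions purchased that month, 3_month_sub the number of 3-month subscriptions, and so on. I need to add a column that gives me the number of active subscribers in any given month. The result should look like this: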

    Year   Month   1_month_sub   3_month_sub   12_month_sub  subs
    2014     1         3             1              1         5
    2014     2         1             0              0         3
    2014     3         1             0              0         3
    2014     4         1             0              0         2
    2014     5         4             0              0         5
    2014     6         1             0              0         2
    2014     7         5             0              0         6
    2014     8         1             0              0         2
    2014     9         1             0              0         2
    2014     10        6             0              0         7
    2014     11        1             0              0         2
    2014     12        3             0              0         4
    2015      1        1             0              0         1

I have tried the COALESCE, LAG, and LEAD functions without any real success. Any ideas on how to approach this?

3 answers:

Answer 0: (score: 2)

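I assume the data is in a table, and that a 1-month subscription is active for one month only, a 3-month subscription for three months, and a 12-month subscription for twelve months.

I will also assume that there is a row for every month.

In Postgres you can do this with sums over a window frame clause: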

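    -- Column names that start with a digit must be double-quoted in Postgres.
    -- The window is ordered by year and month so that the frame covers
    -- the current month plus the previous 2 (or 11) months.
    select t.*,
           ("1_month_sub" +
            sum("3_month_sub") over (order by year, month rows between 2 preceding and current row) +
            sum("12_month_sub") over (order by year, month rows between 11 preceding and current row)
           ) as total_subs
    from t;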
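
To check the window-frame approach against the numbers in the question, here is a minimal self-contained sketch for PostgreSQL. The temporary table name `subs_demo` is made up for illustration, and the 2015 row is taken from the expected-output table in the question, since the input table stops at December 2014.

```sql
-- Hypothetical temp table mirroring the question's data.
-- Names starting with a digit are double-quoted.
create temp table subs_demo (
    year           int,
    month          int,
    "1_month_sub"  int,
    "3_month_sub"  int,
    "12_month_sub" int
);

insert into subs_demo values
    (2014,  1, 3, 1, 1), (2014,  2, 1, 0, 0), (2014,  3, 1, 0, 0),
    (2014,  4, 1, 0, 0), (2014,  5, 4, 0, 0), (2014,  6, 1, 0, 0),
    (2014,  7, 5, 0, 0), (2014,  8, 1, 0, 0), (2014,  9, 1, 0, 0),
    (2014, 10, 6, 0, 0), (2014, 11, 1, 0, 0), (2014, 12, 3, 0, 0),
    (2015,  1, 1, 0, 0);

-- 1-month subs count only in their own month; 3- and 12-month subs
-- count in the month of purchase plus the following 2 or 11 months.
select year, month,
       "1_month_sub"
       + sum("3_month_sub")  over (order by year, month rows between 2 preceding and current row)
       + sum("12_month_sub") over (order by year, month rows between 11 preceding and current row)
       as subs
from subs_demo
order by year, month;
```

With this data the `subs` column comes out as 5, 3, 3, 2, 5, 2, 6, 2, 2, 7, 2, 4, 1, which matches the expected result in the question. Note that `rows between` counts physical rows, so this only holds under the stated assumption of one row per month with no gaps.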

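Answer 1: (score: 0)

Have you tried something like this?

Edit: @Gordon Linoff's answer is better. It is the same idea, but it wraps the "previous 3" and "previous 12" values in a single expression!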

    -- "as subs" is used instead of "subs = ..." so the alias also works in Postgres;
    -- column names that start with a digit are double-quoted.
    select "1_month_sub" +
           lag("3_month_sub", 2, 0) over (order by Year, Month) +
           lag("3_month_sub", 1, 0) over (order by Year, Month) +
           "3_month_sub" +
           lag("12_month_sub", 11, 0) over (order by Year, Month) +
           lag("12_month_sub", 10, 0) over (order by Year, Month) +
           lag("12_month_sub", 9, 0) over (order by Year, Month) +
           lag("12_month_sub", 8, 0) over (order by Year, Month) +
           lag("12_month_sub", 7, 0) over (order by Year, Month) +
           lag("12_month_sub", 6, 0) over (order by Year, Month) +
           lag("12_month_sub", 5, 0) over (order by Year, Month) +
           lag("12_month_sub", 4, 0) over (order by Year, Month) +
           lag("12_month_sub", 3, 0) over (order by Year, Month) +
           lag("12_month_sub", 2, 0) over (order by Year, Month) +
           lag("12_month_sub", 1, 0) over (order by Year, Month) +
           "12_month_sub" as subs
    from MyTable
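
If you stay with the `lag` approach, a named window can cut the repetition. The following is only a sketch for PostgreSQL, limited to the 3-month column and using a small inline sample; the CTE name `t` and the output column `active_3_month` are made up for illustration.

```sql
-- Inline sample: one 3-month subscription bought in January 2014.
with t(year, month, "3_month_sub") as (
    values (2014, 1, 1),
           (2014, 2, 0),
           (2014, 3, 0),
           (2014, 4, 0)
)
select year, month,
       "3_month_sub"
       + lag("3_month_sub", 1, 0) over w   -- bought last month
       + lag("3_month_sub", 2, 0) over w   -- bought two months ago
       as active_3_month
from t
window w as (order by year, month);        -- defined once, reused by every lag
```

For this sample `active_3_month` is 1 for January through March and 0 for April, i.e. the subscription is counted for exactly three months.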

Answer 2: (score: 0)

This can be done without any "power" functions. The query below has been tested on PostgreSQL and MS SQL.

See it working on SQL Fiddle: http://sqlfiddle.com/#!15/74862/4/0

Simple SQL join

    select
        t1.Year,
        t1.Month,
        sum(case when ((t2.Year-2014)*12+t2.Month) <= ((t1.Year-2014)*12+t1.Month) and ((t2.Year-2014)*12+t2.Month) - ((t1.Year-2014)*12+t1.Month) > -1 then 1 else 0 end * t2.one_ms) +
        sum(case when ((t2.Year-2014)*12+t2.Month) <= ((t1.Year-2014)*12+t1.Month) and ((t2.Year-2014)*12+t2.Month) - ((t1.Year-2014)*12+t1.Month) > -3 then 1 else 0 end * t2.three_ms) +
        sum(case when ((t2.Year-2014)*12+t2.Month) <= ((t1.Year-2014)*12+t1.Month) and ((t2.Year-2014)*12+t2.Month) - ((t1.Year-2014)*12+t1.Month) > -12 then 1 else 0 end * t2.twelve_ms) as subs
    from Test t1
        join Test t2
            on 1=1  -- always true: every row is paired with every row (Cartesian product)
    group by t1.Year, t1.Month, ((t1.Year-2014)*12+t1.Month)
    order by ((t1.Year-2014)*12+t1.Month)

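together with the following characteristic function over the Cartesian product (rows are the current month, columns are the month of purchase, O marks a pair that is counted; shown for a 3-month window):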

  1 2 3 4 5
 +---------+
1|O . . . .|
2|O O . . .|
3|O O O . .|
4|. O O O .|
5|. . O O O|
 +---------+
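
The diagram above can be reproduced with a short PostgreSQL query, which may help to see what the `case` conditions select. This is an illustrative sketch only: `i` stands for the current month, `j` for the month of purchase, and the condition is the 3-month one from the join query.

```sql
-- One output row per current month i; each character shows whether a
-- purchase made in month j is still counted in month i.
select i as current_month,
       string_agg(case when j <= i and j - i > -3 then 'O' else '.' end,
                  ' ' order by j) as mask
from generate_series(1, 5) as i
cross join generate_series(1, 5) as j
group by i
order by i;
```

The `mask` column for months 1 to 5 reads `O . . . .`, `O O . . .`, `O O O . .`, `. O O O .`, `. . O O O`, the same band as in the diagram. Replacing `-3` with `-1` or `-12` gives the bands for the 1-month and 12-month columns.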

Output of the join query:

    year    month   subs
    2014    1       5
    2014    2       3
    2014    3       3
    2014    4       2
    2014    5       5
    2014    6       2
    2014    7       6
    2014    8       2
    2014    9       2
    2014    10      7
    2014    11      2
    2014    12      4
    2015    1       1

To make it easier to follow, you may want to give (t1.Year-2014)*12+t1.Month a name, for example num:

    -- Postgres syntax; on MS SQL the equivalent is: alter table Test add num int NULL
    alter table Test add column num int NULL;

    -- num is a running month index: 1 for January 2014, 13 for January 2015
    update Test
    set num = (Year-2014)*12+Month;

    select
        t1.Year,
        t1.Month,
        sum(case when t2.num <= t1.num and t2.num - t1.num > -1 then 1 else 0 end * t2.one_ms) +
        sum(case when t2.num <= t1.num and t2.num - t1.num > -3 then 1 else 0 end * t2.three_ms) +
        sum(case when t2.num <= t1.num and t2.num - t1.num > -12 then 1 else 0 end * t2.twelve_ms) as subs
    from Test t1
        join Test t2
            on 1=1
    group by t1.Year, t1.Month, t1.num
    order by t1.num;
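
If you would rather not alter the table, the same month index can be computed in a CTE. The sketch below is a PostgreSQL variation, not the query from the fiddle: it uses an inline sample with the first five months from the question (under the column names `one_ms`, `three_ms`, `twelve_ms` used above), and it rewrites the two-part `case` conditions as the equivalent `between` ranges.

```sql
-- Inline sample standing in for the Test table (assumption: same columns).
with test(year, month, one_ms, three_ms, twelve_ms) as (
    values (2014, 1, 3, 1, 1),
           (2014, 2, 1, 0, 0),
           (2014, 3, 1, 0, 0),
           (2014, 4, 1, 0, 0),
           (2014, 5, 4, 0, 0)
),
numbered as (
    -- num is computed on the fly instead of being stored in the table
    select test.*, (year - 2014) * 12 + month as num
    from test
)
select t1.year, t1.month,
       sum(case when t2.num = t1.num then t2.one_ms else 0 end)
     + sum(case when t2.num between t1.num - 2  and t1.num then t2.three_ms  else 0 end)
     + sum(case when t2.num between t1.num - 11 and t1.num then t2.twelve_ms else 0 end)
       as subs
from numbered t1
cross join numbered t2      -- same Cartesian product as "join ... on 1=1"
group by t1.year, t1.month, t1.num
order by t1.num;
```

For these five rows `subs` is 5, 3, 3, 2, 5, in line with the first five rows of the output above.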