Question

I'm trying to write a query against the dataset below that adds a new column containing a kind of "period_id_group".

contiguous  new_period  row_nr  new_period_starting_id
0           0           1       0
1           1           2       2
1           0           3       0
1           0           4       0
1           1           5       5
1           0           6       0

What I want is:

contiguous  new_period  row_nr  new_period_starting_id    period_id_group
0           0           1       0                         0
1           1           2       2                         2
1           0           3       0                         2
1           0           4       0                         2
1           1           5       5                         5
1           0           6       0                         5

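The logic is that every row with a 0 in new_period_starting_id must take the > 0 value from the rows above it.

So for row_nr = 1, since there are no preceding rows, period_id_group is 0.

For row_nr = 2, since this is a new period (marked by new_period = 1), period_id_group is 2 (this row's ID).

For row_nr = 3, since it is part of a contiguous range (contiguous = 1) but is not the start of that range (new_period = 0), its period_id_group should inherit the value from the previous row, which is the start of the contiguous range; in this case that is also period_id_group = 2.

I have tried several versions but could not come up with a good solution for SQL Server 2008 R2, because I cannot use LAG() there.

What I have so far is embarrassing: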

select *
from #temp2 t1
left join (select distinct new_period_starting_id from #temp2) t2 
    on t1.new_period_starting_id >= t2.new_period_starting_id
where 1 = case 
            when contiguous = 0 
                then 1
            when contiguous = 1 and t2.new_period_starting_id > 0
                then 1
            else 1
        end
order by t1.rn

Sample data script:

declare @tmp2 table (contiguous int
                   , new_period int
                   , row_nr int
                   , new_period_starting_id int);

insert into @tmp2 values (0, 0, 1, 0)
                        , (1, 1, 2, 2)
                        , (1, 0, 3, 0)
                        , (1, 0, 4, 0)
                        , (1, 1, 5, 5)
                        , (1, 0, 6, 0);

Any help is appreciated.

Answer 1

So, if I understand you correctly, you just need one additional column.

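-- Returns NULL for a row with no period start at or above it (row_nr = 1 in the sample).
SELECT t1.contiguous, t1.new_period, t1.row_nr, t1.new_period_starting_id,
    (SELECT TOP 1 t2.new_period_starting_id
     FROM YourTable t2
     WHERE t2.row_nr <= t1.row_nr
         AND t2.new_period_starting_id > 0 /* skip rows that do not start a period */
     ORDER BY t2.row_nr DESC /* pick the nearest period start at or above this row */) AS period_id_group
FROM YourTable t1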
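
A subquery of this shape returns NULL, not the 0 the question asks for, when a row has no period start at or above it (row_nr = 1 in the sample). Below is a minimal sketch of one way to handle that, run against the question's @tmp2 sample data. The ISNULL wrapper is an addition here, not part of the answer, and the filter uses new_period_starting_id because that is the column present in the sample table.

```sql
declare @tmp2 table (contiguous int
                   , new_period int
                   , row_nr int
                   , new_period_starting_id int);

insert into @tmp2 values (0, 0, 1, 0)
                        , (1, 1, 2, 2)
                        , (1, 0, 3, 0)
                        , (1, 0, 4, 0)
                        , (1, 1, 5, 5)
                        , (1, 0, 6, 0);

select t1.contiguous
    , t1.new_period
    , t1.row_nr
    , t1.new_period_starting_id
    -- nearest period start at or above this row; 0 when there is none
    , isnull((select top 1 t2.new_period_starting_id
              from @tmp2 t2
              where t2.row_nr <= t1.row_nr
                and t2.new_period_starting_id > 0
              order by t2.row_nr desc), 0) as period_id_group
from @tmp2 t1
order by t1.row_nr;
```

On the sample data this should give period_id_group values of 0, 2, 2, 2, 5, 5 for row_nr 1 through 6, matching the desired output.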

Answer 2

Here is another option.

select t1.contiguous
    , t1.new_period
    , t1.row_nr
    , t1.new_period_starting_id
    , x.new_period_starting_id as period_id_group
from @tmp2 t1
outer apply
(
    select top 1 *
    from @tmp2 t2
    where (t2.row_nr = 1
        or t2.new_period_starting_id > 0)
        and t1.row_nr >= t2.row_nr
    order by t2.row_nr desc
) x
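
The t2.row_nr = 1 condition is what gives the first row its 0. If rows with contiguous = 0 can also appear later in the table, that special case would not cover them, and they would pick up the previous period's ID. The sketch below assumes that any non-contiguous row should get 0, which the question does not state explicitly. Row 7 is a hypothetical extra row, added only to show the difference.

```sql
declare @tmp2 table (contiguous int
                   , new_period int
                   , row_nr int
                   , new_period_starting_id int);

insert into @tmp2 values (0, 0, 1, 0)
                        , (1, 1, 2, 2)
                        , (1, 0, 3, 0)
                        , (1, 0, 4, 0)
                        , (1, 1, 5, 5)
                        , (1, 0, 6, 0)
                        , (0, 0, 7, 0); -- hypothetical row, not in the question's data

select t1.contiguous
    , t1.new_period
    , t1.row_nr
    , t1.new_period_starting_id
    , case
        when t1.contiguous = 0
            then 0                                  -- assumption: non-contiguous rows get 0
        else isnull(x.new_period_starting_id, 0)    -- nearest period start, or 0 if none
      end as period_id_group
from @tmp2 t1
outer apply
(
    select top 1 t2.new_period_starting_id
    from @tmp2 t2
    where t2.new_period_starting_id > 0
        and t2.row_nr <= t1.row_nr
    order by t2.row_nr desc
) x
order by t1.row_nr;
```

With the CASE in place, the hypothetical row 7 gets 0. Without it, the apply would hand row 7 the ID of the preceding period (5).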

Answer 3

Found a solution:

select *
    , case
        when contiguous = 0
            then f1
        when contiguous = 1 and new_periods = 1
            then f1
        when contiguous = 1 and new_periods = 0
            then v
        else NULL
    end [period_group]
from (
    select *
        , (select max(f1) from #temp2 where new_period_starting_id > 0 and rn < t1.rn) [v]
    from #temp2 t1
    ) rs
order by rn
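
The query above uses column names (f1, rn, new_periods) from a table that is not shown in the question. Below is a sketch of the same idea against the sample @tmp2 table. It assumes rn corresponds to row_nr and substitutes new_period_starting_id for f1 so that the first row yields 0; this mapping is a guess, not something the source confirms. Because it takes max() rather than the nearest earlier row, it also assumes that starting IDs increase with row_nr, as they do in the sample.

```sql
declare @tmp2 table (contiguous int
                   , new_period int
                   , row_nr int
                   , new_period_starting_id int);

insert into @tmp2 values (0, 0, 1, 0)
                        , (1, 1, 2, 2)
                        , (1, 0, 3, 0)
                        , (1, 0, 4, 0)
                        , (1, 1, 5, 5)
                        , (1, 0, 6, 0);

select rs.contiguous
    , rs.new_period
    , rs.row_nr
    , rs.new_period_starting_id
    , case
        when rs.contiguous = 1 and rs.new_period = 0
            then rs.v                        -- inherit from an earlier period start
        else rs.new_period_starting_id       -- period start or non-contiguous row
      end as period_id_group
from (
    select t1.*
        -- largest starting ID among strictly earlier rows
        , (select max(t2.new_period_starting_id)
           from @tmp2 t2
           where t2.new_period_starting_id > 0
             and t2.row_nr < t1.row_nr) as v
    from @tmp2 t1
    ) rs
order by rs.row_nr;
```

On the sample data this should give 0, 2, 2, 2, 5, 5 for row_nr 1 through 6.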

Query to identify contiguous ranges
