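Numbering rows based on a condition

Time: 2019-12-11 18:53:17

Tags: sql sql-server tsql gaps-and-islands

I am trying to generate a number based on a condition. Within each client partition, ordered by Start_Date, the dense rank must start over whenever the Stop column is 'Yes'. I have tried several things, but none of them gives me what I want. The table below shows the numbering I currently get and the numbering I expect.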

+-----------+------------+------+------------+-------------+
| Client_No | Start_Date | Stop | Current_No | Expected_No |
+-----------+------------+------+------------+-------------+
|     1     |  1-1-2018  |  No  |      1     |      1      |
+-----------+------------+------+------------+-------------+
|     1     |  1-2-2018  |  No  |      2     |      2      |
+-----------+------------+------+------------+-------------+
|     1     |  1-3-2018  |  No  |      3     |      3      |
+-----------+------------+------+------------+-------------+
|     1     |  1-4-2018  |  Yes |      1     |      1      |
+-----------+------------+------+------------+-------------+
|     1     |  1-5-2018  |  No  |      4     |      2      |
+-----------+------------+------+------------+-------------+
|     1     |  1-6-2018  |  No  |      5     |      3      |
+-----------+------------+------+------------+-------------+
|     2     |  1-2-2018  |  No  |      1     |      1      |
+-----------+------------+------+------------+-------------+
|     2     |  1-3-2018  |  No  |      2     |      2      |
+-----------+------------+------+------------+-------------+
|     2     |  1-4-2018  |  Yes |      1     |      1      |
+-----------+------------+------+------------+-------------+
|     2     |  1-5-2018  |  No  |      3     |      2      |
+-----------+------------+------+------------+-------------+
|     2     |  1-6-2018  |  Yes |      2     |      1      |
+-----------+------------+------+------------+-------------+

The query I have used so far:

DENSE_RANK() OVER(PARTITION BY Client_No, Stop ORDER BY Start_Date ASC)

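This does not seem to be the solution, because it simply keeps counting onward over the 'No' rows, but I have no idea how to handle this any other way.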
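
To make the problem concrete, here is a minimal sketch that runs the attempted expression against client 1 only. The inline `values` list is a hypothetical stand-in for the real table, and the dates assume a day-month-year reading of the table above. Because Stop is part of the partition, all 'No' rows of a client land in one partition and all 'Yes' rows in another, so the 'No' count never restarts.

```sql
-- Hypothetical inline sample: the client 1 rows from the question.
select Client_No, Start_Date, Stop,
       dense_rank() over (partition by Client_No, Stop
                          order by Start_Date asc) as Current_No
from (values
        (1, cast('2018-01-01' as date), 'No'),
        (1, cast('2018-02-01' as date), 'No'),
        (1, cast('2018-03-01' as date), 'No'),
        (1, cast('2018-04-01' as date), 'Yes'),
        (1, cast('2018-05-01' as date), 'No'),
        (1, cast('2018-06-01' as date), 'No')
     ) as t (Client_No, Start_Date, Stop)
order by Client_No, Start_Date;
-- The five 'No' rows are numbered 1 to 5 and the single 'Yes' row gets 1,
-- which matches the Current_No column in the table above.
```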

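2 answers:

Answer 0: (score: 1)

One way to solve this kind of gaps-and-islands puzzle is to first calculate a ranking that increases at every 'Yes' stop.

Then calculate the row_number or dense_rank within that ranking.

For example: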

-- Sample table holding the rows from the question
create table test 
(
  Id int identity(1,1) primary key,
  Client_No int,
  Start_Date date,
  Stop varchar(3)
);

insert into test 
(Client_No, Start_Date, Stop) values
  (1,'2018-01-01','No')
 ,(1,'2018-02-01','No')
 ,(1,'2018-03-01','No')
 ,(1,'2018-04-01','Yes')
 ,(1,'2018-05-01','No')
 ,(1,'2018-06-01','No')
 ,(2,'2018-02-01','No')
 ,(2,'2018-03-01','No')
 ,(2,'2018-04-01','Yes')
 ,(2,'2018-05-01','No')
 ,(2,'2018-06-01','Yes');

-- Inner query: rnk is a running count of 'Yes' rows per client, so it
-- goes up by one at every 'Yes' row and labels each island.
-- Outer query: rn restarts at 1 within each (Client_No, rnk) island.
select *
, row_number() over (partition by Client_No, rnk order by Start_Date) as rn
from
(
  select *
  , sum(case when Stop = 'Yes' then 1 else 0 end) over (partition by Client_No order by Start_Date) as rnk
  from test
) q
order by Client_No, Start_Date;
GO

Id | Client_No | Start_Date          | Stop | rnk | rn
-: | --------: | :------------------ | :--- | --: | :-
 1 |         1 | 01/01/2018 00:00:00 | No   |   0 | 1 
 2 |         1 | 01/02/2018 00:00:00 | No   |   0 | 2 
 3 |         1 | 01/03/2018 00:00:00 | No   |   0 | 3 
 4 |         1 | 01/04/2018 00:00:00 | Yes  |   1 | 1 
 5 |         1 | 01/05/2018 00:00:00 | No   |   1 | 2 
 6 |         1 | 01/06/2018 00:00:00 | No   |   1 | 3 
 7 |         2 | 01/02/2018 00:00:00 | No   |   0 | 1 
 8 |         2 | 01/03/2018 00:00:00 | No   |   0 | 2 
 9 |         2 | 01/04/2018 00:00:00 | Yes  |   1 | 1 
10 |         2 | 01/05/2018 00:00:00 | No   |   1 | 2 
11 |         2 | 01/06/2018 00:00:00 | Yes  |   2 | 1 

db<>fiddle here

The difference between using this:

row_number() over (partition by Client_no, Rnk order by start_date)

as opposed to this:

dense_rank() over (partition by Client_no, Rnk order by start_date)

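is that dense_rank assigns the same number to rows that share the same start_date within the same Client_no and Rnk, whereas row_number always gives each row its own number.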
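
As an illustration of that difference, the sketch below puts both functions side by side on a hypothetical sample in which two rows of the same client share a start date. The inline `values` list and the column aliases `rn_row_number` and `rn_dense_rank` are assumptions made for this example only.

```sql
-- Hypothetical sample: two rows for client 1 share the date 2018-02-01.
select Client_No, Start_Date, Stop, rnk,
       row_number() over (partition by Client_No, rnk order by Start_Date) as rn_row_number,
       dense_rank() over (partition by Client_No, rnk order by Start_Date) as rn_dense_rank
from
(
  select Client_No, Start_Date, Stop,
         -- running count of 'Yes' rows per client labels each island
         sum(case when Stop = 'Yes' then 1 else 0 end)
           over (partition by Client_No order by Start_Date) as rnk
  from (values
          (1, cast('2018-01-01' as date), 'No'),
          (1, cast('2018-02-01' as date), 'No'),
          (1, cast('2018-02-01' as date), 'No'),
          (1, cast('2018-03-01' as date), 'Yes')
       ) as t (Client_No, Start_Date, Stop)
) q
order by Client_No, Start_Date;
-- rn_row_number: 1, 2, 3, 1  (tied rows still get distinct numbers)
-- rn_dense_rank: 1, 2, 2, 1  (tied rows share a number)
```

Which one is appropriate depends on whether rows with the same start date should count as one step or as separate steps.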

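Answer 1: (score: 1)

Below is one way to get your desired output. You can see a live/working demo here.

The steps involved are:

  1. Create an adjusted stop value in which we mark Stop as 'Yes' for the first row of each client
  2. Create a separate table that contains only the rows where we want to start/restart the count
  3. For each row in this new table, also add an end date, which is essentially the start date of the next row for that client, or a date in the future for the last row
  4. Join the original data table with the new table and run a sequence based on this new calculation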

-- Assumes data is a table or view holding the question's rows.
-- 1. Creating adjusted stop value
with data_adjusted_stop as
(
select      *,
            case when row_number() over(partition by Client_No order by Start_Date asc) = 1 then 'Yes' else Stop end as adjusted_stop
from        data
),

-- 2. Extracting the rows where we will want to (re)start the counting
data_with_cycle as
(
select      Client_No,
            row_number() over(partition by Client_No order by Start_Date asc) adjusted_stop_cycle,
            Start_Date
from        data_adjusted_stop
where       adjusted_stop = 'Yes'
),

-- 3. Adding an End_Date column for each row where we will want to (re)start counting
data_with_end_date as
(
select      *,
            coalesce(lead(Start_Date) over (partition by Client_No order by Start_Date asc), '2021-01-01') as End_Date
from        data_with_cycle
)

-- 4. Running a sequence partitioned by Client_No and the stop cycle
select      data.*,
            row_number() over(partition by data.Client_No, data_with_end_date.adjusted_stop_cycle order by data.Start_Date asc) as desired_output_sequence
from        data
left join   data_with_end_date
            on data_with_end_date.Client_no = data.Client_no
where       data.Start_Date >= data_with_end_date.Start_Date
and         data.Start_Date < data_with_end_date.End_Date 
order by    1, 2
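
The query above relies on a hard-coded future date ('2021-01-01') to close each client's last cycle. As a hedged variant of the same four steps, the sketch below leaves the last End_Date as NULL and handles it in the join condition instead. The inline `data` CTE is a hypothetical stand-in for the real table, and the names `cycle_start`, `cycle_range`, and `cycle_no` are illustrative only, not part of the original answer.

```sql
with data as
(
  -- Hypothetical inline sample standing in for the real table
  select *
  from (values
          (1, cast('2018-01-01' as date), 'No'),
          (1, cast('2018-02-01' as date), 'No'),
          (1, cast('2018-03-01' as date), 'No'),
          (1, cast('2018-04-01' as date), 'Yes'),
          (1, cast('2018-05-01' as date), 'No'),
          (1, cast('2018-06-01' as date), 'No'),
          (2, cast('2018-02-01' as date), 'No'),
          (2, cast('2018-03-01' as date), 'No'),
          (2, cast('2018-04-01' as date), 'Yes'),
          (2, cast('2018-05-01' as date), 'No'),
          (2, cast('2018-06-01' as date), 'Yes')
       ) as v (Client_No, Start_Date, Stop)
),
cycle_start as
(
  -- Steps 1 and 2: rows that (re)start the count, i.e. every 'Yes' row
  -- plus the first row of each client
  select Client_No, Start_Date,
         row_number() over (partition by Client_No order by Start_Date) as cycle_no
  from (
    select Client_No, Start_Date, Stop,
           row_number() over (partition by Client_No order by Start_Date) as rn
    from data
  ) d
  where Stop = 'Yes' or rn = 1
),
cycle_range as
(
  -- Step 3: End_Date is the next cycle's start; NULL marks the open-ended last cycle
  select Client_No, Start_Date, cycle_no,
         lead(Start_Date) over (partition by Client_No order by Start_Date) as End_Date
  from cycle_start
)
-- Step 4: number the rows within each client and cycle
select d.Client_No, d.Start_Date, d.Stop,
       row_number() over (partition by d.Client_No, c.cycle_no
                          order by d.Start_Date) as desired_output_sequence
from data d
join cycle_range c
  on  c.Client_No = d.Client_No
  and d.Start_Date >= c.Start_Date
  and (c.End_Date is null or d.Start_Date < c.End_Date)
order by d.Client_No, d.Start_Date;
```

On the sample rows this should reproduce the Expected_No column from the question. Like the original query, it assumes Start_Date is unique per client, since a duplicated restart date would match a row to more than one cycle.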