我有以下数据集:
id | bool_col | datetime_col
1 | N | 2017-01-01 00:01:00
2 | N | 2017-01-01 00:02:00
3 | N | 2017-01-01 00:03:00
4 | Y | 2017-01-01 00:04:00
5 | N | 2017-01-01 00:05:00
6 | N | 2017-01-01 00:06:00
7 | N | 2017-01-01 00:07:00
8 | Y | 2017-01-01 00:08:00
9 | N | 2017-01-01 00:09:00
10 | N | 2017-01-01 00:10:00
11 | N | 2017-01-01 00:11:00
12 | N | 2017-01-01 00:12:00
13 | Y | 2017-01-01 00:13:00
我需要添加一个额外的列,其列号用于分隔bool_col中以Y结尾的每个块:
id | bool_col | datetime_col | rank
1 | N | 2017-01-01 00:01:00 | 1
2 | N | 2017-01-01 00:02:00 | 1
3 | N | 2017-01-01 00:03:00 | 1
4 | Y | 2017-01-01 00:04:00 | 1
5 | N | 2017-01-01 00:05:00 | 2
6 | N | 2017-01-01 00:06:00 | 2
7 | N | 2017-01-01 00:07:00 | 2
8 | Y | 2017-01-01 00:08:00 | 2
9 | N | 2017-01-01 00:09:00 | 3
10 | N | 2017-01-01 00:10:00 | 3
11 | N | 2017-01-01 00:11:00 | 3
12 | N | 2017-01-01 00:12:00 | 3
13 | Y | 2017-01-01 00:13:00 | 3
我已尝试过多次领先,滞后和排名的迭代,但仍然没有告诉如何只有在bool_col中有一个Y时才能告诉它增加排名
有什么想法吗?
答案 0 :(得分:2)
只需在每个值之前执行“Y”的累计和。在你的情况下:
select t.*,
(1 + sum(case when bool_col is true then 1 else 0 end) over (order by id rows between unbounded preceding and current row)) as rnk
from t;
注意:这使用is true
,假设列真的是布尔值。否则,请使用= 'Y'
。