I have the following dataset:
id | date | state
-----------------------
1 | 01/01/17 | high
1 | 02/01/17 | high
1 | 03/01/17 | high
1 | 04/01/17 | miss
1 | 05/01/17 | high
2 | 01/01/17 | miss
2 | 02/01/17 | high
2 | 03/01/17 | high
2 | 04/01/17 | miss
2 | 05/01/17 | miss
2 | 06/01/17 | high
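I want to create a column rank_state that, within each id group, ranks the entries whose state is not "miss" by increasing date (starting from rank 0). In addition, if an entry has the state "miss", the rank of the previous entry is repeated. The output should look like this: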
id | date | state | rank_state
------------------------------------
1 | 01/01/17 | high | 0
1 | 02/01/17 | high | 1
1 | 03/01/17 | high | 2
1 | 04/01/17 | miss | 2
1 | 05/01/17 | high | 3
2 | 01/01/17 | miss | 0
2 | 02/01/17 | high | 0
2 | 03/01/17 | high | 1
2 | 04/01/17 | miss | 1
2 | 05/01/17 | miss | 1
2 | 06/01/17 | high | 2
For example, row 4 has a rank of 2 because its state is "miss", i.e. it repeats the rank of row 3 (the same applies to rows 9 and 10). Note that rows 6 and 7 should both have rank 0.
I have tried, among other things:
,(case when state is not in ('miss') then (rank() over (partition by id order by date desc) - 1) end) as state_rank
but it did not give me the expected results. Any ideas would be very helpful.
Answer 0 (score: 2)
What you want is most likely:
SELECT *,
GREATEST(
COUNT(case when state != 'miss' then 1 else null end)
OVER(PARTITION BY id ORDER BY date) - 1,
0
) as "state_rank"
FROM tbl;
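Basically: partition the rows by id and order them by date, keep a running count of the rows whose state is not 'miss', subtract 1 so that the ranking starts at 0, and wrap the result in GREATEST with 0 to prevent negative values (which would otherwise appear when a group starts with a 'miss' row).
Answer 1 (score: 0)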
Just adding the frame_clause to vol7ron's answer, since Redshift requires it:
select *
, GREATEST(COUNT(case when state != 'miss' then 1 else null end)
OVER(PARTITION BY id order by date rows between unbounded preceding and current row) -1 , 0 ) as state_rank
from tbl;
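To check the running-count approach against the sample data from the question, a small local harness can help. The following is only a sketch under stated assumptions, not something either answerer posted: it uses Python's built-in sqlite3 module instead of Redshift (window functions need SQLite 3.25 or newer), replaces GREATEST with SQLite's two-argument scalar MAX because SQLite has no GREATEST, and stores the dates as ISO strings on the assumption that 01/01/17 through 06/01/17 are consecutive days. The names ROWS, QUERY and compute_state_rank are illustrative.

```python
import sqlite3

# Sample rows from the question. Dates are stored as ISO strings so that they
# sort chronologically (assumption: the original dates are consecutive days).
ROWS = [
    (1, "2017-01-01", "high"),
    (1, "2017-01-02", "high"),
    (1, "2017-01-03", "high"),
    (1, "2017-01-04", "miss"),
    (1, "2017-01-05", "high"),
    (2, "2017-01-01", "miss"),
    (2, "2017-01-02", "high"),
    (2, "2017-01-03", "high"),
    (2, "2017-01-04", "miss"),
    (2, "2017-01-05", "miss"),
    (2, "2017-01-06", "high"),
]

# Same logic as the answers above: a running count of non-'miss' rows per id,
# minus 1, floored at 0. SQLite's scalar MAX(a, b) stands in for GREATEST.
QUERY = """
SELECT id, date, state,
       MAX(
           COUNT(CASE WHEN state != 'miss' THEN 1 END)
               OVER (PARTITION BY id ORDER BY date
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) - 1,
           0
       ) AS state_rank
FROM tbl
ORDER BY id, date
"""


def compute_state_rank(rows):
    """Load the rows into an in-memory table and return the ranked result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tbl (id INTEGER, date TEXT, state TEXT)")
        conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = compute_state_rank(ROWS)
for row in result:
    print(row)
```

The last element of each printed tuple is state_rank, and for this sample it reproduces the rank_state column requested in the question. Since every date is unique within an id here, the explicit ROWS frame and a default frame would give the same counts; the two can differ only when several rows of one id share a date.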