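I have the following data, ordered by player_id and match_date. I want to find the group of records with the longest streak of consecutive identical runs values (for example, player 1 scored 4 runs three times in a row between 2014-04-03 and 2014-04-12).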
player_id match_date runs
1 2014-04-01 5
1 2014-04-02 55
1 2014-04-03 4
1 2014-04-10 4
1 2014-04-12 4
1 2014-04-14 3
1 2014-04-19 4
1 2014-04-20 44
2 2014-04-01 23
2 2014-04-02 23
2 2014-04-03 23
2 2014-04-10 23
2 2014-04-12 4
2 2014-04-14 3
2 2014-04-19 23
2 2014-04-20 1
I came up with the following SQL:
select *,
       row_number() over (partition by ranked.player_id, ranked.runs
                          order by ranked.match_date) as R
from (
    select player_id, match_date, runs from players order by 1, 2 desc
) ranked
order by ranked.player_id, match_date asc
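But the ranking carries on from earlier streaks. The 4 runs on 2014-04-19 for player 1 should get rank 1, but gets rank 4 because there are already 3 occurrences of 4 in the partition. Similarly, the 23 runs on 2014-04-19 for player 2 should get rank 1, but gets rank 5 because this player already has 4 occurrences of 23 runs.
How can I reset the rank to 1 whenever the runs value changes from the previous row?
The schema, data, SQL, and output are available on SQLFiddle.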
Answer 0 (score: 1)
select p1.player_id, p1.match_date, p1.runs, count(p2.match_date)
from players p1
join players p2
  on p1.player_id = p2.player_id
 and p1.match_date >= p2.match_date
 and p1.runs = p2.runs
 and not exists (
     select 1
     from players p3
     where p3.runs <> p2.runs
       and p3.player_id = p2.player_id
       and p3.match_date < p1.match_date
       and p3.match_date > p2.match_date
 )
group by p1.player_id, p1.match_date, p1.runs
order by p1.player_id, p1.match_date
Answer 1 (score: 1)
You can do this with window functions.
select player_id, runs, count(*) as numruns
from (select p.*,
             (row_number() over (partition by player_id order by match_date) -
              row_number() over (partition by player_id, runs order by match_date)
             ) as grp
      from players p
     ) pg
group by grp, player_id, runs
order by numruns desc
limit 1;
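The key observation is that a streak of identical runs values has the following property: if you number each player's rows by date, and separately number each player's rows within each runs value by date, then the difference between the two numbers stays constant for as long as the same runs value appears on consecutive rows. That difference, together with player_id and runs, forms a group you can aggregate on to identify the streak you want. Note that limit 1 returns only one row, so if two streaks tie for the longest, only one of them is reported.
Here is the SQL Fiddle.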
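To make the row-number difference concrete, here is a small sketch that replays the same idea outside the database, using the sample rows from the question. It is written in Python purely for illustration and is not part of the original answer; the names ROWS and label_streaks are hypothetical. It also derives a per-streak rank that resets to 1 when the runs value changes, which is what the question asks for.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (player_id, match_date, runs).
ROWS = [
    (1, "2014-04-01", 5), (1, "2014-04-02", 55), (1, "2014-04-03", 4),
    (1, "2014-04-10", 4), (1, "2014-04-12", 4), (1, "2014-04-14", 3),
    (1, "2014-04-19", 4), (1, "2014-04-20", 44),
    (2, "2014-04-01", 23), (2, "2014-04-02", 23), (2, "2014-04-03", 23),
    (2, "2014-04-10", 23), (2, "2014-04-12", 4), (2, "2014-04-14", 3),
    (2, "2014-04-19", 23), (2, "2014-04-20", 1),
]


def label_streaks(rows):
    """Return (player_id, match_date, runs, grp, streak_rank) for each row."""
    per_player = defaultdict(int)       # like row_number() partitioned by player_id
    per_player_runs = defaultdict(int)  # like row_number() partitioned by player_id, runs
    streak_len = defaultdict(int)       # rows seen so far in each (player, runs, grp) group
    labeled = []
    # ISO dates sort correctly as strings, so this orders by player, then date.
    for player_id, match_date, runs in sorted(rows, key=lambda r: (r[0], r[1])):
        per_player[player_id] += 1
        per_player_runs[(player_id, runs)] += 1
        # The difference is constant within a streak of identical runs values.
        grp = per_player[player_id] - per_player_runs[(player_id, runs)]
        streak_len[(player_id, runs, grp)] += 1
        labeled.append(
            (player_id, match_date, runs, grp, streak_len[(player_id, runs, grp)])
        )
    return labeled


labeled = label_streaks(ROWS)

# Aggregate on (player_id, runs, grp), as the GROUP BY in the query does.
counts = Counter((p, r, g) for p, _, r, g, _ in labeled)
(best_player, best_runs, _), best_len = counts.most_common(1)[0]
longest = (best_player, best_runs, best_len)

# Rank within each streak, keyed by (player_id, match_date).
rank_by_row = {(p, d): rank for p, d, _, _, rank in labeled}

print(longest)  # (2, 23, 4)
```

On this data the longest streak overall belongs to player 2 (23 runs, four times in a row), while player 1's three consecutive 4s form the group with grp = 2. The rank for player 1 on 2014-04-19 comes out as 1, because that row falls into a different grp than the earlier 4s.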