I have a dataset like this:
**Team Date W/L**
Team_1 04/01/0012 W
Team_1 06/01/0012 W
Team_1 07/01/0012 L
Team_1 14/01/0012 W
Team_1 19/01/0012 W
Team_1 30/01/0012 L
Team_1 14/02/0012 W
Team_1 17/02/0012 L
Team_1 20/02/0012 W
Team_2 01/01/0012 W
Team_2 05/01/0012 W
Team_2 09/01/0012 W
Team_2 13/01/0012 L
Team_2 18/01/0012 W
Team_2 25/01/0012 L
Team_2 05/02/0012 L
Team_2 13/02/0012 L
Team_2 19/02/0012 L
Team_3 02/01/0012 W
Team_3 02/01/0012 W
Team_3 06/01/0012 W
Team_3 10/01/0012 W
Team_3 19/01/0012 W
Team_3 31/01/0012 L
Team_3 11/02/0012 W
Team_3 15/02/0012 L
Team_3 21/02/0012 W
From this I need to find each team's longest run of consecutive wins:
Team Count
Team_3 5
Team_2 3
Team_1 2
I am only allowed to write an SQL query. How can I write this?
Answer 0 (score: 2)
You can use the following:
    SELECT  Team, TotalWins, FirstWin, LastWin
    FROM    (   SELECT  Team,
                        WL,
                        COUNT(*) TotalWins,
                        MIN("Date") FirstWin,
                        MAX("Date") LastWin,
                        ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY COUNT(*) DESC) RowNumber
                FROM    (   SELECT  Team,
                                    "Date",
                                    WL,
                                    ROW_NUMBER() OVER(PARTITION BY Team ORDER BY "Date") - ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY "Date") Grouping
                            FROM    T
                        ) GroupedData
                WHERE   WL = 'W'
                GROUP BY Team, WL, Grouping
            ) RankedData
    WHERE   RowNumber = 1;
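This uses ROW_NUMBER to number each team's games by date twice: once across all of the team's games, and once within each result (W or L). The difference between the two numbers is the same for every game in a run of consecutive identical results and differs between separate runs of the same result, so it identifies each run. For your first team you would have: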
Team Date W/L RN1 RN2 DIFF
Team_1 04/01/0012 W 1 1 0
Team_1 06/01/0012 W 2 2 0
Team_1 07/01/0012 L 3 1 2
Team_1 14/01/0012 W 4 3 1
Team_1 19/01/0012 W 5 4 1
Team_1 30/01/0012 L 6 2 4
Team_1 14/02/0012 W 7 5 2
Team_1 17/02/0012 L 8 3 5
Team_1 20/02/0012 W 9 6 3
RN1 is partitioned by team only, while RN2 is partitioned by team and result. DIFF is RN1 minus RN2.
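The RN1, RN2 and DIFF columns can be reproduced with a small script. This is only a sketch, assuming Python's built-in `sqlite3` module with SQLite 3.25 or later (needed for window functions). The dates are rewritten as ISO strings, which is an assumption made so that text ordering matches chronological ordering; the table and column names follow the query above.

```python
import sqlite3

# Team_1's games from the question. Dates are stored as ISO strings
# (an assumption) so that sorting the text sorts chronologically.
games = [
    ("Team_1", "0012-01-04", "W"),
    ("Team_1", "0012-01-06", "W"),
    ("Team_1", "0012-01-07", "L"),
    ("Team_1", "0012-01-14", "W"),
    ("Team_1", "0012-01-19", "W"),
    ("Team_1", "0012-01-30", "L"),
    ("Team_1", "0012-02-14", "W"),
    ("Team_1", "0012-02-17", "L"),
    ("Team_1", "0012-02-20", "W"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE T (Team TEXT, "Date" TEXT, WL TEXT)')
conn.executemany("INSERT INTO T VALUES (?, ?, ?)", games)

rows = conn.execute(
    """
    SELECT "Date", WL, RN1, RN2, RN1 - RN2 AS Diff
    FROM (
        SELECT "Date", WL,
               -- numbering across all of the team's games
               ROW_NUMBER() OVER (PARTITION BY Team ORDER BY "Date") AS RN1,
               -- numbering within each result (W or L)
               ROW_NUMBER() OVER (PARTITION BY Team, WL ORDER BY "Date") AS RN2
        FROM T
    ) AS numbered
    ORDER BY "Date"
    """
).fetchall()

for row in rows:
    print(row)
```

Each printed tuple is (Date, W/L, RN1, RN2, DIFF), in the same order as the table above.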
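If you remove the losses from that table, the DIFF column is constant within each run of consecutive wins and changes from one run to the next (here it goes up by 1 each time, because every losing run in between is exactly one game long):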
Team Date W/L RN1 RN2 DIFF
Team_1 04/01/0012 W 1 1 0
Team_1 06/01/0012 W 2 2 0
---------------------------------------
Team_1 14/01/0012 W 4 3 1
Team_1 19/01/0012 W 5 4 1
---------------------------------------
Team_1 14/02/0012 W 7 5 2
---------------------------------------
Team_1 20/02/0012 W 9 6 3
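You can then group by this value to make sure you are only counting consecutive wins, and count the rows in each group to get the length of each winning run. Finally, I used another ROW_NUMBER to pick the longest winning run for each team.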
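The whole approach can be tried end to end on the question's data. The following is a hedged sketch rather than the exact query above: it assumes Python's `sqlite3` module with SQLite 3.25 or later, splits the steps into CTEs (`numbered`, `streaks`, `ranked`, all illustrative names), and stores dates as ISO strings. It also adds a hypothetical `Id` column as a tie-breaker, because Team_3 has two games on the same date and the two ROW_NUMBER calls could otherwise number those tied rows in different orders.

```python
import sqlite3

# Results from the question as (month-day, result); the year is 0012 throughout.
results = {
    "Team_1": [("01-04", "W"), ("01-06", "W"), ("01-07", "L"), ("01-14", "W"),
               ("01-19", "W"), ("01-30", "L"), ("02-14", "W"), ("02-17", "L"),
               ("02-20", "W")],
    "Team_2": [("01-01", "W"), ("01-05", "W"), ("01-09", "W"), ("01-13", "L"),
               ("01-18", "W"), ("01-25", "L"), ("02-05", "L"), ("02-13", "L"),
               ("02-19", "L")],
    "Team_3": [("01-02", "W"), ("01-02", "W"), ("01-06", "W"), ("01-10", "W"),
               ("01-19", "W"), ("01-31", "L"), ("02-11", "W"), ("02-15", "L"),
               ("02-21", "W")],
}

conn = sqlite3.connect(":memory:")
# Id is a hypothetical surrogate key, assigned in insertion order.
conn.execute(
    'CREATE TABLE T (Id INTEGER PRIMARY KEY, Team TEXT, "Date" TEXT, WL TEXT)'
)
conn.executemany(
    'INSERT INTO T (Team, "Date", WL) VALUES (?, ?, ?)',
    [(team, f"0012-{md}", wl) for team, games in results.items() for md, wl in games],
)

QUERY = """
WITH numbered AS (
    -- difference of the two row numbers labels each run of identical results
    SELECT Team, WL,
           ROW_NUMBER() OVER (PARTITION BY Team ORDER BY "Date", Id)
         - ROW_NUMBER() OVER (PARTITION BY Team, WL ORDER BY "Date", Id) AS grp
    FROM T
),
streaks AS (
    -- one row per winning run, with its length
    SELECT Team, COUNT(*) AS TotalWins
    FROM numbered
    WHERE WL = 'W'
    GROUP BY Team, grp
),
ranked AS (
    -- rank each team's winning runs, longest first
    SELECT Team, TotalWins,
           ROW_NUMBER() OVER (PARTITION BY Team ORDER BY TotalWins DESC) AS rn
    FROM streaks
)
SELECT Team, TotalWins
FROM ranked
WHERE rn = 1
ORDER BY TotalWins DESC
"""

longest = conn.execute(QUERY).fetchall()
print(longest)  # [('Team_3', 5), ('Team_2', 3), ('Team_1', 2)]
```

Team_1 has two winning runs of length 2, so which of them ROW_NUMBER ranks first is arbitrary; the count is the same either way.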
**Example on SQL Fiddle**