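Suppose we have the following table, where each row represents a user's submission in a programming contest. id is an auto-incrementing primary key, probid identifies the problem the submission is for, score is the number of points the submission earned, and date is the timestamp of the submission. Each user can submit to the same problem as many times as they like: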
+----+----------+--------+-------+------------+
| id | username | probid | score | date |
+----+----------+--------+-------+------------+
| 1 | brian | 1 | 5 | 1542766686 |
| 2 | alex | 1 | 10 | 1542766686 |
| 3 | alex | 2 | 5 | 1542766901 |
| 4 | brian | 1 | 10 | 1542766944 |
| 5 | jacob | 2 | 10 | 1542766983 |
| 6 | jacob | 1 | 10 | 1542767053 |
| 7 | brian | 2 | 8 | 1542767271 |
| 8 | jacob | 2 | 10 | 1542767456 |
| 9 | brian | 2 | 7 | 1542767522 |
+----+----------+--------+-------+------------+
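To try the queries in the answers, the sample data can be loaded with something like the following. This is only a sketch: the table name tablename matches what most of the answers use, and the column types are assumptions, since the question does not show the schema.

```sql
-- Hypothetical schema; the question does not state the actual column types.
CREATE TABLE tablename (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(32)  NOT NULL,
    probid   INT UNSIGNED NOT NULL,
    score    INT          NOT NULL,
    `date`   INT UNSIGNED NOT NULL  -- Unix timestamp in seconds
);

-- The nine sample submissions from the table above.
INSERT INTO tablename (id, username, probid, score, `date`) VALUES
    (1, 'brian', 1,  5, 1542766686),
    (2, 'alex',  1, 10, 1542766686),
    (3, 'alex',  2,  5, 1542766901),
    (4, 'brian', 1, 10, 1542766944),
    (5, 'jacob', 2, 10, 1542766983),
    (6, 'jacob', 1, 10, 1542767053),
    (7, 'brian', 2,  8, 1542767271),
    (8, 'jacob', 2, 10, 1542767456),
    (9, 'brian', 2,  7, 1542767522);
```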
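To rank the contestants, we need to determine each user's best submission for each problem. The "best" submission is the one with the highest score, with ties broken by submission id (that is, if a user gets the same score twice on the same problem, we only care about the earlier of the two submissions). This would produce a table like the following: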
+----------+--------+----+-------+------------+
| username | probid | id | score | date |
+----------+--------+----+-------+------------+
| alex | 1 | 2 | 10 | 1542766686 |
| alex | 2 | 3 | 5 | 1542766901 |
| brian | 1 | 4 | 10 | 1542766944 |
| brian | 2 | 7 | 8 | 1542767271 |
| jacob | 1 | 6 | 10 | 1542767053 |
| jacob | 2 | 5 | 10 | 1542766983 |
+----------+--------+----+-------+------------+
How can I write a query to do this?
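Answer 0 (score: 0)
-- Sorts each user's submissions by score (highest first), then by id; all rows are still returned.
SELECT username, probid, id, score, `date`
FROM tableName
ORDER BY username, score DESC, id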
Answer 1 (score: 0)
With MySQL 8.0 or MariaDB 10.2 or later:
SELECT username, probid, id, score, `date`
FROM (
    SELECT username, probid, id, score, `date`,
           ROW_NUMBER() OVER (
               PARTITION BY username, probid
               ORDER BY score DESC, id  -- id breaks ties: the earlier submission wins
           ) AS `rank`
    FROM tablename
) AS tmp
WHERE tmp.`rank` = 1
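The tie-breaking rule from the question (the lower id wins when scores are equal) is easier to see by listing the row number the window function assigns to every submission before any filtering. A minimal sketch, assuming the table is named tablename and the server is MySQL 8.0 or MariaDB 10.2 or later; the alias rn is a hypothetical name chosen for illustration:

```sql
-- Show every submission with its position inside its (username, probid) group.
-- Sorting by score DESC and then id makes the numbering deterministic:
-- among equal scores, the earlier submission gets the smaller number.
SELECT username, probid, id, score,
       ROW_NUMBER() OVER (
           PARTITION BY username, probid
           ORDER BY score DESC, id
       ) AS rn
FROM tablename
ORDER BY username, probid, rn;
```

Against the sample data, jacob's two submissions for problem 2 both scored 10, so submission 5 is numbered ahead of submission 8 because it has the lower id. If the window were ordered by score alone, the server would be free to number either of them first.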
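Answer 2 (score: 0)
This query also works on MySQL versions before 8.0. The LEFT JOIN removes duplicate scores, so that among rows with the same score for a given user and problem only the one with the lowest date stays in the result set. The WHERE clause then ensures that we get the highest score for the given user/problem combination: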
SELECT t1.username, t1.probid, t1.id, t1.score, t1.date
FROM tablename t1
LEFT JOIN tablename t2
    ON t2.username = t1.username AND
       t2.probid = t1.probid AND
       t2.score = t1.score AND
       t2.date < t1.date
WHERE t2.id IS NULL AND
      t1.score = (SELECT MAX(score)
                  FROM tablename t3
                  WHERE t3.username = t1.username AND
                        t3.probid = t1.probid)
ORDER BY t1.username, t1.probid
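Update
It is almost certainly more efficient to first JOIN the table to a list of the maximum score per user per problem, rather than computing the MAX value for every row of the result. The query then becomes: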
SELECT t1.username, t1.probid, t1.id, t1.score, t1.date
FROM tablename t1
JOIN (SELECT username, probid, MAX(score) AS score
      FROM tablename
      GROUP BY username, probid) t2
    ON t2.username = t1.username AND
       t2.probid = t1.probid AND
       t2.score = t1.score
LEFT JOIN tablename t3
    ON t3.username = t1.username AND
       t3.probid = t1.probid AND
       t3.score = t1.score AND
       t3.date < t1.date
WHERE t3.id IS NULL
ORDER BY t1.username, t1.probid
Output (for both queries):
username probid id score date
alex 1 2 10 1542766686
alex 2 3 5 1542766901
brian 1 4 10 1542766944
brian 2 7 8 1542767271
jacob 1 6 10 1542767053
jacob 2 5 10 1542766983
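The queries above break ties on date, while the question asks for ties to be broken by id. A possible variant of the same anti-join idea that compares id directly is sketched below. It assumes the table is named tablename; the index name idx_user_prob_score is hypothetical, and whether the index actually helps depends on the data and the server version.

```sql
-- Keep a row only if no other row for the same user and problem beats it,
-- where "beats" means a higher score, or the same score with a lower id.
SELECT t1.username, t1.probid, t1.id, t1.score, t1.`date`
FROM tablename AS t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tablename AS t2
    WHERE t2.username = t1.username
      AND t2.probid = t1.probid
      AND (t2.score > t1.score
           OR (t2.score = t1.score AND t2.id < t1.id))
)
ORDER BY t1.username, t1.probid;

-- Optional: a composite index that may speed up the correlated lookup.
CREATE INDEX idx_user_prob_score
    ON tablename (username, probid, score, id);
```

Because id is unique, exactly one row survives per user/problem pair, even if two submissions share both a score and a timestamp.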
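Answer 3 (score: 0)
In versions of MySQL before 8.0.2, we can emulate the functionality of ROW_NUMBER() with user-defined variables. In this technique, we first fetch the data in a particular order (which depends on the problem statement at hand).
In your case, within each partition of username and probid, we need to rank the scores in descending order, with rows that have a lower timestamp value taking priority (to break ties). So we ORDER BY username, probid, score DESC, date ASC.
Now we can use this result set as a derived table and determine the row number. It works like a loop in application code (for example, in PHP): we store the previous row's values in user-defined variables and use a conditional CASE .. WHEN expression to check the current row's values against the previous row's, then assign the row number accordingly.
Finally, we keep only the rows whose row number is 1 and (if needed) sort them by username and probid.
Query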
SELECT dt2.username,
       dt2.probid,
       dt2.id,
       dt2.score,
       dt2.date
FROM (SELECT @rn := CASE
                      WHEN @un = dt1.username
                           AND @pid = dt1.probid THEN @rn + 1
                      ELSE 1
                    END AS row_no,
             @un := dt1.username AS username,
             @pid := dt1.probid AS probid,
             dt1.id,
             dt1.score,
             dt1.date
      FROM (SELECT id,
                   username,
                   probid,
                   score,
                   date
            FROM your_table
            ORDER BY username,
                     probid,
                     score DESC,
                     date ASC) AS dt1
      CROSS JOIN (SELECT @un := '',
                         @pid := 0,
                         @rn := 0) AS user_init_vars) AS dt2
WHERE dt2.row_no = 1
ORDER BY dt2.username, dt2.probid;
Result
| username | probid | id | score | date |
| -------- | ------ | --- | ----- | ---------- |
| alex | 1 | 2 | 10 | 1542766686 |
| alex | 2 | 3 | 5 | 1542766901 |
| brian | 1 | 4 | 10 | 1542766944 |
| brian | 2 | 7 | 8 | 1542767271 |
| jacob | 1 | 6 | 10 | 1542767053 |
| jacob | 2 | 5 | 10 | 1542766983 |