I have a simple table, comments (id INT, revision INT, comment VARCHAR(140)),
with the following contents:
1|1|hallo1|
1|2|hallo2|
1|3|hallo3|
2|1|hallo1|
2|2|hallo2|
I am looking for an SQL statement that returns, for each comment, the row with the highest revision:
1|3|hallo3|
2|2|hallo2|
I came up with this solution:
select id, revision, comment
from comments
where revision = (
select max(revision)
from comments as f
where f.id = comments.id
);
But it is very slow on large data sets. Is there a better query to accomplish this?
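To make the setup reproducible, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the query above. SQLite is only a stand-in (the question does not name a database), and the ORDER BY is added purely to make the output order deterministic.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INT, revision INT, comment VARCHAR(140))")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "hallo1"), (1, 2, "hallo2"), (1, 3, "hallo3"),
     (2, 1, "hallo1"), (2, 2, "hallo2")],
)

# The correlated subquery from the question: for every outer row, look up
# the maximum revision among the rows that share its id.
rows = conn.execute(
    """
    SELECT id, revision, comment
    FROM comments
    WHERE revision = (
        SELECT MAX(revision)
        FROM comments AS f
        WHERE f.id = comments.id
    )
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 3, 'hallo3'), (2, 2, 'hallo2')]
```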
Answer 0: (score: 11)
Here is one approach that, with proper indexing, will not be terribly slow, and it does not use a subselect:
SELECT comments.ID, comments.revision, comments.comment FROM comments
LEFT OUTER JOIN comments AS maxcomments
ON maxcomments.ID= comments.ID
AND maxcomments.revision > comments.revision
WHERE maxcomments.revision IS NULL
Adapted from the query here: http://www.xaprb.com/blog/2007/03/14/how-to-find-the-max-row-per-group-in-sql-without-subqueries/
(found via a Google search for: sql max group)
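Why the self-join works may be easier to see on the sample data. The sketch below (SQLite via Python's sqlite3 as a stand-in database; the aliases c and newer are chosen for this illustration) first lists every row next to the newer revisions it matches, then keeps only the rows that matched nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INT, revision INT, comment VARCHAR(140))")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "hallo1"), (1, 2, "hallo2"), (1, 3, "hallo3"),
     (2, 1, "hallo1"), (2, 2, "hallo2")],
)

# Step 1: pair each row with every newer revision of the same id.
# A row that has no newer revision still appears once, with NULL (None).
pairs = conn.execute(
    """
    SELECT c.id, c.revision, newer.revision
    FROM comments AS c
    LEFT OUTER JOIN comments AS newer
      ON newer.id = c.id AND newer.revision > c.revision
    ORDER BY c.id, c.revision, newer.revision
    """
).fetchall()
for pair in pairs:
    print(pair)

# Step 2: keep only the rows with no newer revision, i.e. the latest ones.
rows = conn.execute(
    """
    SELECT c.id, c.revision, c.comment
    FROM comments AS c
    LEFT OUTER JOIN comments AS newer
      ON newer.id = c.id AND newer.revision > c.revision
    WHERE newer.revision IS NULL
    ORDER BY c.id
    """
).fetchall()
print(rows)
```

Note that the intermediate join grows with the number of revisions per id, so the index on (id, revision) matters for this form.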
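Answer 1: (score: 6)
Make sure your indexes are set up properly. An index on id, revision would be good.
Here is a different take on your query. I haven't checked its execution plan, but if you have the indexes set up well, it should help: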
SELECT c.*
FROM comments c
INNER JOIN (
SELECT id,max(revision) AS maxrev
FROM comments
GROUP BY id
) b
ON c.id=b.id AND c.revision=b.maxrev
Edited to add information:
Subquery:
25157 records
2 seconds
Execution plan includes an Index Seek (82%) base and a Segment (17%)
Left Outer Join:
25160 records
3 seconds
Execution plan includes two Index Scans @ 22% each with a Right Outer Merge at 45% and a Filter at 11%
I would still go with the subquery.
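As a rough sketch of the index advice, the following creates a composite index on (id, revision) and runs the derived-table join against the question's sample data. It uses SQLite through Python's sqlite3 as a stand-in, so the plan it prints will not match the SQL Server plans quoted above, and the index name idx_comments_id_revision is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INT, revision INT, comment VARCHAR(140))")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "hallo1"), (1, 2, "hallo2"), (1, 3, "hallo3"),
     (2, 1, "hallo1"), (2, 2, "hallo2")],
)

# Composite index on (id, revision); the name is hypothetical.
conn.execute("CREATE INDEX idx_comments_id_revision ON comments (id, revision)")

query = """
    SELECT c.id, c.revision, c.comment
    FROM comments c
    INNER JOIN (
        SELECT id, MAX(revision) AS maxrev
        FROM comments
        GROUP BY id
    ) b
      ON c.id = b.id AND c.revision = b.maxrev
    ORDER BY c.id
"""
rows = conn.execute(query).fetchall()
print(rows)

# Show the plan this SQLite build picks; the wording varies by version,
# so it is printed for inspection rather than compared to a fixed string.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for step in plan:
    print(step)
```

Timings on five rows mean nothing, so treat this as a way to inspect plans, then measure on real data as the answer did.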
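Answer 2: (score: 4)
Tested with one of our tables, which has nearly 1 million rows in total. Indexes exist on both FIELD2 and FIELD3. The query returned 83953 rows in 3 seconds on our dev box.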
select
FIELD1, FIELD2, FIELD3
from
OURTABLE (nolock) T1
WHERE FIELD3 =
(
SELECT MAX(FIELD3) FROM
OURTABLE T2 (nolock)
WHERE T1.FIELD2=T2.FIELD2
)
ORDER BY FIELD2 DESC
Answer 3: (score: 1)
Analytic functions would be my suggestion.
select id, max_revision, comment
from (select c.id, c.comment, c.revision, max(c.revision) over (partition by c.id) as max_revision
from comments c) t
where revision = max_revision;
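A minimal sketch of the analytic (window function) approach on the question's sample data, again using SQLite through Python's sqlite3 as a stand-in. It assumes an SQLite build of 3.25 or later, which is where window functions were introduced; the derived-table alias t is added because several databases require one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INT, revision INT, comment VARCHAR(140))")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "hallo1"), (1, 2, "hallo2"), (1, 3, "hallo3"),
     (2, 1, "hallo1"), (2, 2, "hallo2")],
)

# MAX(...) OVER (PARTITION BY id) attaches the per-id maximum to every row
# in a single pass; the outer query keeps the rows that equal that maximum.
rows = conn.execute(
    """
    SELECT id, revision, comment
    FROM (
        SELECT c.id, c.revision, c.comment,
               MAX(c.revision) OVER (PARTITION BY c.id) AS max_revision
        FROM comments c
    ) AS t
    WHERE revision = max_revision
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 3, 'hallo3'), (2, 2, 'hallo2')]
```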
Answer 4: (score: 0)
An idea from left field, but how about adding an extra field to the table:
CurrentRevision bit not null
Then, when you make a change, set the flag on the new revision and clear it on all previous revisions.
Your query would then become:
select Id,
Comment
from Comments
where CurrentRevision = 1
This would be much easier on the database, and therefore much faster.
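The flag only helps if it is maintained on every write. Here is one possible sketch of that bookkeeping, using SQLite through Python's sqlite3 as a stand-in. The column name current_revision and the helper add_revision are made up for this illustration, and the answer does not say how the flag should be updated; doing it inside one transaction is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE comments (
        id INT,
        revision INT,
        comment VARCHAR(140),
        current_revision INTEGER NOT NULL DEFAULT 0  -- stands in for "bit not null"
    )
    """
)

def add_revision(conn, comment_id, text):
    """Hypothetical helper: insert a new revision and move the flag onto it."""
    with conn:  # one transaction: the statements commit together or not at all
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(revision), 0) FROM comments WHERE id = ?",
            (comment_id,),
        ).fetchone()
        # Clear the flag on the previously current revision, if any.
        conn.execute(
            "UPDATE comments SET current_revision = 0 "
            "WHERE id = ? AND current_revision = 1",
            (comment_id,),
        )
        # Insert the new revision already flagged as current.
        conn.execute(
            "INSERT INTO comments (id, revision, comment, current_revision) "
            "VALUES (?, ?, ?, 1)",
            (comment_id, latest + 1, text),
        )

for comment_id, text in [(1, "hallo1"), (1, "hallo2"), (1, "hallo3"),
                         (2, "hallo1"), (2, "hallo2")]:
    add_revision(conn, comment_id, text)

rows = conn.execute(
    "SELECT id, revision, comment FROM comments "
    "WHERE current_revision = 1 ORDER BY id"
).fetchall()
print(rows)  # [(1, 3, 'hallo3'), (2, 2, 'hallo2')]
```

The trade-off is that reads become a plain filter while every write pays for an extra UPDATE, and any code path that bypasses the helper can leave the flag stale.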
Answer 5: (score: 0)
A very clean way to write "latest x by id" type queries is the following. It should also be easy to index properly.
SELECT id, revision, comment
FROM comments
WHERE (id, revision) IN (
SELECT id, MAX(revision)
FROM comments
-- WHERE clause comes here if needed
GROUP BY id
)
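The (id, revision) IN (...) form relies on row-value comparison, which not every database supports. As a hedged sketch, here it is on the question's sample data in SQLite via Python's sqlite3, assuming an SQLite build of 3.15 or later (where row values were added).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INT, revision INT, comment VARCHAR(140))")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 1, "hallo1"), (1, 2, "hallo2"), (1, 3, "hallo3"),
     (2, 1, "hallo1"), (2, 2, "hallo2")],
)

# The subquery yields one (id, max revision) pair per id; the outer query
# keeps the rows whose (id, revision) pair appears in that set.
rows = conn.execute(
    """
    SELECT id, revision, comment
    FROM comments
    WHERE (id, revision) IN (
        SELECT id, MAX(revision)
        FROM comments
        GROUP BY id
    )
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 3, 'hallo3'), (2, 2, 'hallo2')]
```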
Answer 6: (score: 0)
For large tables, I have found that this solution can perform better:
SELECT c1.id,
c1.revision,
c1.comment
FROM comments c1
INNER JOIN ( SELECT id,
max(revision) AS max_revision
FROM comments
GROUP BY id ) c2
ON c1.id = c2.id
AND c1.revision = c2.max_revision
Answer 7: (score: 0)
Without subselects (or temporary tables):
SELECT c1.ID, c1.revision, c1.comment
FROM comments AS c1
LEFT JOIN comments AS c2
ON c1.ID = c2.ID
AND c1.revision < c2.revision
WHERE c2.revision IS NULL