Which columns should I add to an index to make the following query run fast?
select max(id) from tb
group by BranchId,ArticleID
having count(*)>1
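Answer 0 (score: 4)
Your current query probably cannot benefit from any index because of the COUNT(*) > 1 term in the HAVING clause, which SQL Server would probably interpret as a request to compute the full count of every group. However, we can rewrite the query as follows so that it can use an index: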
SELECT MAX(id)
FROM tb
GROUP BY BranchId, ArticleID
HAVING MIN(id) <> MAX(id);
Then add the following index:
CREATE INDEX idx ON tb (BranchId, ArticleID, Id);
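The trick here is to rephrase COUNT(*) > 1 as a logically equivalent condition: the smallest and largest id values in a group are different. Note that I am assuming here that id is a unique column, i.e. a given group would never have two or more records with the same id value.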
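To see why the rewrite returns the same rows, here is a minimal sketch on a few made-up rows. The table variable @tb_demo and its values are illustrative assumptions, not part of the original question; id is declared as the primary key so that the uniqueness assumption holds.

```sql
-- Hypothetical sample data standing in for tb.
DECLARE @tb_demo TABLE (
    id        INT PRIMARY KEY,  -- unique, as the rewrite assumes
    BranchId  INT NOT NULL,
    ArticleID INT NOT NULL
);

INSERT INTO @tb_demo (id, BranchId, ArticleID)
VALUES (1, 10, 100),
       (2, 10, 100),   -- group (10, 100) has two rows
       (3, 10, 200),   -- group (10, 200) has one row
       (4, 20, 100),
       (5, 20, 100),
       (6, 20, 100);   -- group (20, 100) has three rows

-- Original form: keep groups with more than one row.
SELECT MAX(id) AS max_id
FROM @tb_demo
GROUP BY BranchId, ArticleID
HAVING COUNT(*) > 1;

-- Rewritten form: keep groups whose smallest and largest id differ.
SELECT MAX(id) AS max_id
FROM @tb_demo
GROUP BY BranchId, ArticleID
HAVING MIN(id) <> MAX(id);
```

Both queries return the ids 2 and 6. Group (10, 200) has a single row, so its MIN(id) equals its MAX(id) and it is filtered out, just as COUNT(*) > 1 would filter it.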
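Answer 1 (score: 1)
In addition to what Tim has already discussed about ROWSTORE indexes, I would like to add a few notes and a demo here:
ROWSTORE indexes work well as long as you can write the query so that the searched columns (those used in the WHERE clause, joins, GROUP BY, ORDER BY, DISTINCT) are sargable, that is, the query can use the indexed columns. However, in many situations that is hard to maintain, or a better approach exists.
In those situations, a COLUMNSTORE index can improve performance considerably, because it compresses column values from similar domains very heavily. In SQL Server 2017, batch mode execution can improve performance further. Columnstore indexes also give better results from index reorganization.
I have taken the points above from a talk by Brent Ozar.
Here is a demo in which I prepared 2 identical tables, each with 10 million rows and a certain cardinality.
Data preparation: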
--Test table1
drop table if exists dbo.dummy
select top 10000000 objectid1 = a.object_id, Name1 = a.name, objectid2 = b.object_id, Name2 = b.name, objectid3 = c.object_id, Name3 = c.name
into dbo.dummy
from sys.objects a
cross join sys.objects b
cross join sys.objects c
order by a.object_id, a.name
drop index if exists ix_dummy on dbo.dummy
go
--create a nonclustered rowstore index
create index ix_dummy on dbo.dummy (objectid1, objectid2, objectid3)
go
--Test Table2
drop table if exists dbo.dummy2
select top 10000000 objectid1 = a.object_id, Name1 = a.name, objectid2 = b.object_id, Name2 = b.name, objectid3 = c.object_id, Name3 = c.name
into dbo.dummy2
from sys.objects a
cross join sys.objects b
cross join sys.objects c
order by a.object_id, a.name
drop index if exists ix_dummy2 on dbo.dummy2
go
--create a nonclustered columnstore index
create nonclustered columnstore index ix_dummy2 on dbo.dummy2 (objectid1, objectid2, objectid3)
go
set statistics io on
set statistics time on
The relative percentage and the logical reads point to the columnstore index as the winner, while the elapsed time points to the rowstore index.
--Simple search
--Run these 2 queries together and compare their percentage relative to each other, logical reads, and elapsed time.
select objectid3
from dbo.dummy
where objectid1 in (5) -- look for some object_id that exists in your database
select objectid3
from dbo.dummy2
where objectid1 in (5) -- look for some object_id that exists in your database
[Screenshot: logical reads and elapsed time]
[Screenshot: execution plan]
--Aggregate queries
--Run these 2 queries together and compare their percentage relative to each other, logical reads, and elapsed time.
select max(objectid3)
from dbo.dummy
group by objectid1, objectid2
having max(objectid3) <> min(objectid3)
select max(objectid3)
from dbo.dummy2
group by objectid1, objectid2
having max(objectid3) <> min(objectid3)
[Screenshot: logical reads and elapsed time]
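Once the comparison is done, the session settings and the two demo tables can be cleaned up. This is a small sketch that only reverses the setup steps shown above; dropping a table also drops its indexes, so ix_dummy and ix_dummy2 need no separate statement.

```sql
--Turn off the statistics output enabled for the demo
set statistics io off
set statistics time off
go
--Remove the two 10-million-row test tables (their indexes are dropped with them)
drop table if exists dbo.dummy
drop table if exists dbo.dummy2
go
```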