This query takes about 01:30 to run:
select DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
, count(t2.UserId)
, count(*) - count(t2.UserId)
from Events t1
left join (select c.UserId, min(c.OccurredOn) FirstOccurred
from Events c
where [OccurredOn] between @start and @end
group by c.UserId) t2 on t1.OccurredOn = t2.FirstOccurred and t1.UserId = t2.UserId
where t1.EventType = @eventType
and t1.[OccurredOn] between @start and @end
group by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
order by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
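If I remove the WHERE clause from the subquery, it runs instantly.
Running the subquery on its own, with the WHERE clause, takes < 1 second.
If I first SELECT the subquery into a table variable and join to that variable, the whole query runs in 19s.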
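For reference, the table-variable version might look roughly like the sketch below. The name @firsts, the result column aliases, and the parameter values are placeholders chosen for illustration, not the exact code that was timed.

```sql
-- Placeholder parameter values (assumptions, for illustration only).
DECLARE @start datetime = '20110101', @end datetime = '20110201';
DECLARE @eventType uniqueidentifier = NEWID();

-- Hypothetical table variable holding each user's first event in the range.
DECLARE @firsts TABLE (
    UserId        uniqueidentifier NOT NULL PRIMARY KEY,
    FirstOccurred datetime         NOT NULL
);

-- Materialize the subquery once.
INSERT INTO @firsts (UserId, FirstOccurred)
SELECT c.UserId, MIN(c.OccurredOn)
FROM Events c
WHERE c.OccurredOn BETWEEN @start AND @end
GROUP BY c.UserId;

-- Same outer query as before, joined to the table variable instead.
SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn)) AS OccurredDay,
       COUNT(t2.UserId)            AS FirstEvents,
       COUNT(*) - COUNT(t2.UserId) AS OtherEvents
FROM Events t1
LEFT JOIN @firsts t2
       ON t1.OccurredOn = t2.FirstOccurred
      AND t1.UserId = t2.UserId
WHERE t1.EventType = @eventType
  AND t1.OccurredOn BETWEEN @start AND @end
GROUP BY DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn))
ORDER BY DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn));
```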
The Events table looks like this:
CREATE TABLE [Events](
[EventType] [uniqueidentifier] NOT NULL,
[UserId] [uniqueidentifier] NOT NULL,
[OccurredOn] [datetime] NOT NULL,
)
I have the following primary, nonclustered, nonunique index:
Here is the execution plan.
I am using SQL Server 2008.
Answer 0 (score: 1)
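Your query is slow because your sort depends on an expression that is computed on the fly (DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))), and SQL Server cannot use an index for a value computed on the fly.
PostgreSQL has indexes on expressions. With PostgreSQL you can essentially save the result of an expression to an actual (behind-the-scenes) column, so when the time comes to sort on that expression, PostgreSQL can use the index on it.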
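As a rough illustration of the PostgreSQL feature mentioned above, an expression index might be declared as in the sketch below. This is PostgreSQL syntax, not T-SQL, and the table, column, and index names are assumptions made up for the example. It also assumes occurred_on is a timestamp without time zone, since an indexed expression has to be immutable.

```sql
-- PostgreSQL sketch; all names here are hypothetical.
CREATE TABLE events (
    event_type  uuid      NOT NULL,
    user_id     uuid      NOT NULL,
    occurred_on timestamp NOT NULL  -- timestamp without time zone
);

-- Index on an expression: the truncated day is stored in the index.
CREATE INDEX ix_events_day ON events (date_trunc('day', occurred_on));

-- A query that repeats the same expression is eligible to use that index.
SELECT date_trunc('day', occurred_on) AS occurred_day, count(*)
FROM events
GROUP BY date_trunc('day', occurred_on)
ORDER BY date_trunc('day', occurred_on);
```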
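The closest equivalent in SQL Server is a persisted formula (a PERSISTED computed column).
You can easily verify the feature with this sample query: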
create table PersonX
(
Lastname varchar(50) not null,
Firstname varchar(50) not null
);
create table PersonY
(
Lastname varchar(50) not null,
Firstname varchar(50) not null
);
alter table PersonX add Fullname as Lastname + ', ' + Firstname PERSISTED;
create index ix_PersonX on PersonX(Fullname);
declare @i int = 0;
while @i < 10000 begin
insert into PersonX(Lastname,Firstname) values('Lennon','John');
insert into PersonY(Lastname,Firstname) values('Lennon','John');
set @i = @i + 1;
end;
select top 1000 Lastname, Firstname
from PersonX
order by Fullname;
select top 1000 Lastname, Firstname
from PersonY
order by Lastname + ', ' + Firstname;
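The ORDER BY on Fullname in PersonX is faster than the one on PersonY. The query cost for PersonX is only 32%, versus 68% for PersonY.
To fix the performance of your query, do this: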
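The cost percentages shown in an execution plan are optimizer estimates, so it may be worth confirming the difference with measured numbers. One way to do that, assuming the PersonX and PersonY tables from the example exist, is to turn on the I/O and timing statistics and compare the output in the Messages tab:

```sql
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Sort on the indexed, persisted computed column.
select top 1000 Lastname, Firstname
from PersonX
order by Fullname;

-- Sort on the expression computed at run time.
select top 1000 Lastname, Firstname
from PersonY
order by Lastname + ', ' + Firstname;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```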
alter table Events
add OccurenceGroup as
DATEADD(dd, 0, DATEDIFF(dd, 0, [OccurredOn])) PERSISTED;
create index ix_Events on Events(OccurenceGroup);
Then group and sort on OccurenceGroup.
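A sketch of the original query rewritten this way, assuming the OccurenceGroup column has already been added to Events. The parameter values are placeholders, and whether this actually helps needs to be checked against the real data:

```sql
-- Placeholder parameter values (assumptions, for illustration only).
DECLARE @start datetime = '20110101', @end datetime = '20110201';
DECLARE @eventType uniqueidentifier = NEWID();

SELECT t1.OccurenceGroup,
       COUNT(t2.UserId),
       COUNT(*) - COUNT(t2.UserId)
FROM Events t1
LEFT JOIN (SELECT c.UserId, MIN(c.OccurredOn) AS FirstOccurred
           FROM Events c
           WHERE c.OccurredOn BETWEEN @start AND @end
           GROUP BY c.UserId) t2
       ON t1.OccurredOn = t2.FirstOccurred
      AND t1.UserId = t2.UserId
WHERE t1.EventType = @eventType
  AND t1.OccurredOn BETWEEN @start AND @end
-- Group and sort on the persisted column instead of the inline expression.
GROUP BY t1.OccurenceGroup
ORDER BY t1.OccurenceGroup;
```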
By the way, have you added an index on OccurredOn, and another on EventType?
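If those indexes do not exist yet, they could be created along these lines. The index names are made up for the example, and single-column indexes are only one option, so compare the execution plan before and after:

```sql
-- Hypothetical index names; adjust to your own naming convention.
CREATE NONCLUSTERED INDEX ix_Events_OccurredOn ON Events (OccurredOn);
CREATE NONCLUSTERED INDEX ix_Events_EventType  ON Events (EventType);
```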
Answer 1 (score: 1)
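You could try replacing LEFT JOIN with LEFT MERGE JOIN, so that the derived table t2 is computed only once instead of the MIN possibly being recomputed many times for each user.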
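With the join hint applied, the original query might look like the sketch below. The parameter values are placeholders. Note that specifying a join hint also makes SQL Server enforce the join order as written, so this is something to test rather than assume.

```sql
-- Placeholder parameter values (assumptions, for illustration only).
DECLARE @start datetime = '20110101', @end datetime = '20110201';
DECLARE @eventType uniqueidentifier = NEWID();

SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn)),
       COUNT(t2.UserId),
       COUNT(*) - COUNT(t2.UserId)
FROM Events t1
-- MERGE is a join hint: it requests a merge join for this join.
LEFT MERGE JOIN (SELECT c.UserId, MIN(c.OccurredOn) AS FirstOccurred
                 FROM Events c
                 WHERE c.OccurredOn BETWEEN @start AND @end
                 GROUP BY c.UserId) t2
       ON t1.OccurredOn = t2.FirstOccurred
      AND t1.UserId = t2.UserId
WHERE t1.EventType = @eventType
  AND t1.OccurredOn BETWEEN @start AND @end
GROUP BY DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn))
ORDER BY DATEADD(dd, 0, DATEDIFF(dd, 0, t1.OccurredOn));
```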
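You could also rewrite this using a ranking function, as shown below. It might be cheaper. You will need to test these ideas against your own data and indexes.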
;WITH T AS
(
SELECT *,
RANK() OVER (PARTITION BY UserId ORDER BY OccurredOn) AS Rnk
FROM Events
WHERE [OccurredOn] BETWEEN @start AND @end
)
SELECT Dateadd(dd, 0, Datediff(dd, 0, OccurredOn)),
COUNT(CASE WHEN Rnk =1 THEN 1 END),
COUNT(CASE WHEN Rnk >1 THEN 1 END)
FROM T
WHERE EventType = @eventType
GROUP BY Dateadd(dd, 0, Datediff(dd, 0, OccurredOn))
ORDER BY Dateadd(dd, 0, Datediff(dd, 0, OccurredOn))