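I have the following data, and I need to get the minimum start date and maximum end date for each user and status. The query below works, but it takes more than 55 minutes to run. Is there a more efficient way to write it? The table has about 150,000 users.
Data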
DECLARE @TBL TABLE (Users INT, Users_Status VARCHAR(5), [Start_Date] DATE, End_Date DATE)
INSERT INTO @TBL VALUES
(1,'A','2019-03-07','2019-03-22'),(1,'A','2019-01-04','2019-01-08'),(1,'A','2019-01-12','2019-01-27'),
(1,'B','2019-01-30','2019-02-02'),(1,'B','2019-02-27','2019-03-13'),(1,'B','2019-01-13','2019-01-24'),
(2,'A','2019-03-15','2019-03-28'),(2,'A','2019-05-19','2019-05-27'),(3,'A','2019-05-31','2019-06-04'),
(3,'A','2019-05-18','2019-06-03'),(3,'A','2019-01-12','2019-01-13'),(3,'A','2019-04-12','2019-05-02'),
(3,'B','2019-01-08','2019-01-18'),(3,'B','2019-04-16','2019-04-18'),(4,'B','2019-05-25','2019-06-03'),
(5,'A','2019-03-26','2019-03-30'),(5,'A','2019-06-13','2019-06-26'),(5,'A','2019-02-02','2019-02-18'),
(5,'B','2019-01-17','2019-01-20'),(5,'B','2019-03-30','2019-04-19'),(5,'B','2019-05-04','2019-05-16'),
(5,'B','2019-03-25','2019-04-10'),(5,'B','2019-03-09','2019-03-27')
I tried this query
;WITH StartEnd AS
(SELECT
*
,ROW_NUMBER()OVER(PARTITION BY Users,Users_Status ORDER BY [Start_Date] ASC) AS Utart
,ROW_NUMBER()OVER(PARTITION BY Users,Users_Status ORDER BY End_Date DESC) AS UEnd
FROM @TBL
) ,Starts AS
(
SELECT
*
FROM StartEnd
WHERE Utart =1
),
Ends AS
(
SELECT
*
FROM StartEnd
WHERE UEnd =1
)
SELECT distinct
S.*
,(SELECT MIN(ST.[Start_Date]) FROM Starts ST WHERE ST.Users = S.Users AND ST.Users_Status =S.Users_Status ) AS Min_Start_Date
,(SELECT MAX(e.End_Date) FROM Ends E WHERE E.Users = S.Users AND E.Users_Status =S.Users_Status ) AS Max_end_Date
FROM StartEnd S
Current output
Desired output
Answer 0: (score: 1)
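The first thing to do to improve query performance is to make sure the appropriate indexes exist. Try viewing the query execution plan, for example with the Display Estimated Execution Plan button in SSMS.
Then add any indexes suggested in the execution plan. Suggested indexes appear in green text. You can right-click one and choose "Missing Index Details..." to get the CREATE INDEX script in a new window. Modify it as needed before running it.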
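As a rough sketch of what such a script might look like after editing: the table name dbo.UserStatus and the index name below are hypothetical (the question only shows a table variable), and the choice of key and INCLUDE columns is an assumption based on the PARTITION BY columns in the query, not something an actual execution plan has confirmed.

```sql
-- Hypothetical permanent table mirroring the columns of @TBL in the question.
CREATE TABLE dbo.UserStatus
(
    Users        INT        NOT NULL,
    Users_Status VARCHAR(5) NOT NULL,
    [Start_Date] DATE       NOT NULL,
    End_Date     DATE       NOT NULL
);

-- Assumed index: keyed on the grouping columns, with both date columns
-- included so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_UserStatus_Users_Status
    ON dbo.UserStatus (Users, Users_Status)
    INCLUDE ([Start_Date], End_Date);
```

Whether this particular index helps depends on the real table and its existing indexes, so compare the execution plan before and after creating it.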
Answer 1: (score: 1)
I believe your query can be simplified to
SELECT Users,
Users_Status,
Start_date,
End_Date,
MIN(Start_Date) OVER (PARTITION BY Users, Users_Status) Min_Start_Date,
MAX(End_Date) OVER (PARTITION BY Users, Users_Status) Max_End_Date
FROM @tbl
However, the performance is more likely to come down to indexing.
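If the desired output is a single row per Users/Users_Status pair (an assumption, since the desired output is not shown here), a plain GROUP BY may be enough and avoids both the window functions and the DISTINCT. A minimal sketch using a few of the question's sample rows:

```sql
-- Subset of the sample data from the question.
DECLARE @TBL TABLE (Users INT, Users_Status VARCHAR(5), [Start_Date] DATE, End_Date DATE);
INSERT INTO @TBL VALUES
(1,'A','2019-03-07','2019-03-22'),(1,'A','2019-01-04','2019-01-08'),
(1,'B','2019-01-30','2019-02-02'),(1,'B','2019-01-13','2019-01-24');

-- One row per user and status, with the earliest start and latest end.
SELECT Users,
       Users_Status,
       MIN([Start_Date]) AS Min_Start_Date,
       MAX(End_Date)     AS Max_End_Date
FROM @TBL
GROUP BY Users, Users_Status
ORDER BY Users, Users_Status;
```

For these four rows this returns (1, A, 2019-01-04, 2019-03-22) and (1, B, 2019-01-13, 2019-02-02). If every original row must be kept alongside the min and max, the windowed version above is the one to use.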