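I need to join multiple tables onto one main table, and I need every row from the main table.
Should I LEFT JOIN all of them in the same query, or should I use temp tables, physical intermediate tables, or window functions?
Currently, for about 85K rows, with the indexes suggested by the query engine and optimized column data types, the query takes a very long time.
I chose to go with temp tables. By using five of them and distributing the joins between them, I was able to cut the query time substantially. I also added a temp table that all the subqueries read from, which greatly reduced the time of each subquery.
Here is the original query: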
CREATE VIEW [dwh].[Facts Tickets LVL 2 V]
AS
SELECT
/* Level 1 fields */
T.[Ticket ID]
,T.[Brand ID]
,T.[Category ID]
,T.[Channel ID]
,T.[Custom field ID]
,T.[Brand Name]
,T.[Company Group Name]
,T.[Ticket creator User ID]
,T.[Created (datetime)]
,T.[Ticket URL]
,T.[Shared URL]
,T.[Ticket type]
,T.[Status group]
,T.[Importance]
,T.[Allow channelback]
,T.[Has incidents]
,T.[Is Hidden]
,T.[Has draft-reply]
,T.[Has staff answer]
,T.[Is Assigned]
,T.[Is Assigned to bot]
,T.[Is Deleted]
,T.[Is Expired]
,T.[Is Locked]
,T.[Is Spam]
,T.[Has Attachments]
,T.[Has Satisfaction entry]
,T.[Ticket age (days)]
,T.[Ticket age (group)]
,T.[Replies (count)]
,T.[Comments (count)]
,T.[Rows (count)]
,T.[Datasync ID]
,T.[DWH Processing (datetime)]
,T.[DWH Status]
/* Level 2 fields */
,RAC.[Replies by Agent (count)] -- +2 sec
,RATC.[Replies by Ticket Creator (count)] -- +2 sec
,FCR.[First Customer-reply (datetime)] -- +2 sec
,LCR.[Last Customer-reply (datetime)] -- +1 sec
,LCR.[Due (datetime)] -- +0 sec
,FAR.[First Agent-reply (datetime)] -- +1 sec
,LAR.[Agent User ID] -- ++++
,LAR.[Last Agent-reply (datetime)] -- ++++
,LAR.[Updated (datetime)] -- ++++
,TS.[Satisfaction, scored] -- ++++
,TCWT.[Ticket creator wait time (minutes)]
,AWT.[Agent wait time (minutes)]
,ARS.[Agent total wait time (minutes)]
,ARS.[Ticket creator total wait time (minutes)]
FROM [dwh].[Facts Tickets LVL 1 T] AS T --85K in 3 sec
/* ################## Ticket satisfaction queries ################## */
/* */
LEFT JOIN ( SELECT [Ticket ID], [Satisfaction, scored] -- 317 0 sec, 317 1 sec
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
WHERE [Ticket Satisfaction ID] IN (
SELECT MAX([Ticket Satisfaction ID])
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
GROUP BY [Ticket ID] ) ) as TS
ON T.[Ticket ID] = TS.[Ticket ID]
/* ################## Response-statistics queries ################## */
/* Ticket creator wait time */
LEFT JOIN ( SELECT [Ticket ID], [Agent reply-time (seconds)] / 60 AS [Ticket creator wait time (minutes)] -- 445K in 2 sec, 445K in 3 sec, 418K in 5 sec
FROM [dwh].[Facts Response-statistics LVL 1 T]
WHERE [Response-statistics ID] IN (
SELECT MAX([Response-statistics ID]) -- 445K in 2 sec
FROM [dwh].[Facts Response-statistics LVL 1 T]
GROUP BY [Ticket ID] ) ) AS TCWT
ON T.[Ticket ID] = TCWT.[Ticket ID]
/* Agent wait time */
LEFT JOIN ( SELECT [Ticket ID], [Agent wait-time (seconds)] / 60 AS [Agent wait time (minutes)] -- 445K in 1 sec, 418K in 5 sec
FROM [dwh].[Facts Response-statistics LVL 1 T]
WHERE [Response-statistics ID] IN (
SELECT MIN([Response-statistics ID]) -- Flag: takes agent first wait, not last wait time
FROM [dwh].[Facts Response-statistics LVL 1 T]
GROUP BY [Ticket ID] ) ) AS AWT
ON T.[Ticket ID] = AWT.[Ticket ID]
/* Accumulated stats */
LEFT JOIN ( SELECT [Ticket ID], SUM([Agent reply-time (seconds)]) / 60 AS [Ticket creator total wait time (minutes)], SUM([Agent wait-time (seconds)]) / 60 AS [Agent total wait time (minutes)] --445K in 4 sec
FROM [dwh].[Facts Response-statistics LVL 1 T]
GROUP BY [Ticket ID]) AS ARS
ON T.[Ticket ID] = ARS.[Ticket ID]
/* ################## Reply queries ################## */
-- 85K in 20 sec
/* [Replies by Agent (count)]. 547K in 11 sec */
LEFT JOIN ( SELECT [Ticket ID], COUNT([Reply ID]) AS [Replies by Agent (count)] -- 575K in 3 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (Yes/No)] = 'Yes'
GROUP BY [Ticket ID] ) AS RAC
ON T.[Ticket ID] = RAC.[Ticket ID]
/* [Replies by Ticket Creator (count)]. 377K in 33 sec */
LEFT JOIN ( SELECT [Ticket ID], COUNT([Reply ID]) AS [Replies by Ticket Creator (count)] -- 398K in 3 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (Yes/No)] = 'No'
GROUP BY [Ticket ID] ) AS RATC
ON T.[Ticket ID] = RATC.[Ticket ID]
/* First Customer Reply */
LEFT JOIN ( SELECT [Ticket ID], [Creation (datetime)] AS [First Customer-reply (datetime)] -- 398K in 5 sec, 398K in 8 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [Reply ID] IN (
SELECT MIN([Reply ID]) -- 398K in 4 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (1/0)] = 0
GROUP BY [Ticket ID] ) ) AS FCR
ON T.[Ticket ID] = FCR.[Ticket ID]
/* Last Customer Reply. 376K in 26 sec*/ --<<-- Bottleneck
LEFT JOIN ( SELECT [Ticket ID], [Creation (datetime)] AS [Last Customer-reply (datetime)], [Due (datetime)] -- 398K in 5 sec, 8 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [Reply ID] IN (
SELECT MAX([Reply ID]) -- 398 in 4 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (1/0)] = 0
GROUP BY [Ticket ID] ) ) AS LCR
ON T.[Ticket ID] = LCR.[Ticket ID]
/* First Agent Reply */
LEFT JOIN ( SELECT [Ticket ID], [Creation (datetime)] AS [First Agent-reply (datetime)] -- 6 sec, 9 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [Reply ID] IN (
SELECT MIN([Reply ID]) -- 575K in 4 sec, 550K in 12 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (1/0)] = 1
GROUP BY [Ticket ID] ) ) AS FAR
ON T.[Ticket ID] = FAR.[Ticket ID]
/* Last Agent Reply */
LEFT JOIN ( SELECT [Ticket ID], [Reply User-ID] AS [Agent User ID], [Creation (datetime)] AS [Updated (datetime)], [Creation (datetime)] AS [Last Agent-reply (datetime)] -- 573K in 9 sec, 9 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [Reply ID] IN (
SELECT MAX([Reply ID]) -- 575K in 4 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (1/0)] = 1
GROUP BY [Ticket ID] ) ) AS LAR
ON T.[Ticket ID] = LAR.[Ticket ID]
/* ################## Action queries ################## */
/* First 'Assigned the ticket to' Action. 12K in 0 sec */
LEFT JOIN ( SELECT [Ticket ID], MIN([Creation (datetime)]) AS [Initially assigned (datetime)] -- 5K in 1 sec
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action Type ID] = 28 /* Assigned the ticket to */
GROUP BY [Ticket ID]) AS FAA
ON T.[Ticket ID] = FAA.[Ticket ID]
/* Last 'Assigned the ticket to' Action. 12K in 0 sec*/
LEFT JOIN ( SELECT [Ticket ID], MAX([Creation (datetime)]) as [Assigned (datetime)] -- 5K in 1 sec ----------- Artur
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action Type ID] = 28 /* Assigned the ticket to */
GROUP BY [Ticket ID]) AS LAA
ON T.[Ticket ID] = LAA.[Ticket ID]
/* First 'Completed this ticket' Action. 504K in 8 sec. */
LEFT JOIN ( SELECT [Ticket ID], MIN([Creation (datetime)]) AS [First Completion time (datetime)] -- 534K in 4 sec
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action Type ID] = 12 and [Action Type Value] = 'Completed' /* Completed this ticket */
GROUP BY [Ticket ID]) AS FCT
ON T.[Ticket ID] = FCT.[Ticket ID]
/* Last 'Completed this ticket' Action. 504K in 8 sec. */
LEFT JOIN ( SELECT [Ticket ID], MAX([Creation (datetime)]) AS [Status updated (datetime)], MAX([Creation (datetime)]) AS [Solved (datetime)], MAX([Creation (datetime)]) AS [Completion time (datetime)] -- 534K in 8 sec
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action Type ID] = 12 and [Action Type Value] = 'Completed' /* Completed this ticket */
GROUP BY [Ticket ID]) AS LCT
ON T.[Ticket ID] = LCT.[Ticket ID]
/* [Agent touches (count)]. 558K in 6 sec. */
LEFT JOIN ( SELECT [Ticket ID], COALESCE(COUNT(*),0) AS [Agent touches (count)] -- 616K in 5 sec
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action User is Agent] = 'Yes'
GROUP BY [Ticket ID]) AS ATC
ON T.[Ticket ID] = ATC.[Ticket ID]
/* Reopens (count) 504K in 7 sec. */
LEFT JOIN ( SELECT [Ticket ID], COUNT(*) -1 AS [Reopens (count)] -- 7K in 1 sec
FROM [dwh].[Facts Actions LVL 1 T]
WHERE [Action Type ID] = 12 and [Action Type Value] = 'Open'
GROUP BY [Ticket ID]
HAVING count([Ticket ID]) > 1 ) AS AL2
ON T.[Ticket ID] = AL2.[Ticket ID]
WHERE
YEAR([Created (datetime)]) = 2018
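Answer 0 (score: 1)
This looks silly to me. You only report a row if it is among the MAX([Ticket Satisfaction ID]) values across all tickets.
Why not just report the max for that ticket? It is less work and gives more information. Use the same pattern for all the joins.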
LEFT JOIN ( SELECT [Ticket ID], max([Satisfaction, scored]) as [Satisfaction, scored] -- alias matches the column the outer SELECT expects
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
GROUP BY [Ticket ID]
) as TS
ON T.[Ticket ID] = TS.[Ticket ID]
If you only want the rows that hold the maximum value:
LEFT JOIN ( SELECT [Ticket ID], [Satisfaction, scored]
, DENSE_RANK() over (partition by [Ticket ID]
order by [Satisfaction, scored] desc) as dr
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
) as TS
ON T.[Ticket ID] = TS.[Ticket ID]
AND TS.dr = 1
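Applying "the same pattern for all the joins" to the reply subqueries could look roughly like the sketch below, which reads [dwh].[Facts Replies LVL 1 T] once instead of once per derived table. It rests on assumptions not confirmed in the question: that [Reply ID] increases with [Creation (datetime)] (so the MIN/MAX datetime belongs to the MIN/MAX reply), and that [By Agent (1/0)] carries the same information as [By Agent (Yes/No)]. The alias R is only illustrative.

```sql
-- Sketch only. Assumes [Reply ID] grows with [Creation (datetime)] and that
-- [By Agent (1/0)] = 1 means the same as [By Agent (Yes/No)] = 'Yes'.
SELECT T.[Ticket ID]
     , R.[Replies by Agent (count)]
     , R.[Replies by Ticket Creator (count)]
     , R.[First Customer-reply (datetime)]
     , R.[Last Customer-reply (datetime)]
     , R.[First Agent-reply (datetime)]
     , R.[Last Agent-reply (datetime)]
FROM [dwh].[Facts Tickets LVL 1 T] AS T
LEFT JOIN ( SELECT [Ticket ID]
                 -- COUNT skips NULLs, so each CASE counts only one kind of reply
                 , COUNT(CASE WHEN [By Agent (1/0)] = 1 THEN [Reply ID] END) AS [Replies by Agent (count)]
                 , COUNT(CASE WHEN [By Agent (1/0)] = 0 THEN [Reply ID] END) AS [Replies by Ticket Creator (count)]
                 -- MIN/MAX also skip NULLs, giving first/last reply per side
                 , MIN(CASE WHEN [By Agent (1/0)] = 0 THEN [Creation (datetime)] END) AS [First Customer-reply (datetime)]
                 , MAX(CASE WHEN [By Agent (1/0)] = 0 THEN [Creation (datetime)] END) AS [Last Customer-reply (datetime)]
                 , MIN(CASE WHEN [By Agent (1/0)] = 1 THEN [Creation (datetime)] END) AS [First Agent-reply (datetime)]
                 , MAX(CASE WHEN [By Agent (1/0)] = 1 THEN [Creation (datetime)] END) AS [Last Agent-reply (datetime)]
            FROM [dwh].[Facts Replies LVL 1 T]
            GROUP BY [Ticket ID] ) AS R
       ON T.[Ticket ID] = R.[Ticket ID];
```

Two differences from the original query are worth keeping in mind: a ticket that has replies from only one side gets a count of 0 here rather than NULL, and columns that must come from the same row as the last reply ([Due (datetime)], [Reply User-ID]) cannot be picked up this way and still need a ranking approach.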
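Answer 1 (score: 1)
How to optimally join 10+ tables on the same main table
I can see that your derived tables are fairly complex. SQL Server uses statistics to choose a plan, and when you join as many tables as you do here, the cardinality estimates can be off.
So I suggest materializing the derived tables into temp tables, indexing them, and then running the query.
Example: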
SELECT [Ticket ID], [Creation (datetime)] AS [First Customer-reply (datetime)] -- 398K in 5 sec, 398K in 8 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [Reply ID] IN (
SELECT MIN([Reply ID]) -- 398K in 4 sec
FROM [dwh].[Facts Replies LVL 1 T]
WHERE [By Agent (1/0)] = 0
GROUP BY [Ticket ID] )
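The output of the query above should be inserted into a temp table, and that temp table should be indexed with the join key as the leading column.
Also, I can see that most of your queries have the following form: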
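A minimal sketch of that suggestion for the first-customer-reply subquery, assuming it runs in a batch or stored procedure (temp tables cannot be used inside a view). The temp table name #FirstCustomerReply and the index name are hypothetical.

```sql
-- Sketch only: #FirstCustomerReply and CIX_FirstCustomerReply are made-up names.
-- 1) Materialize the derived table once.
SELECT R.[Ticket ID]
     , R.[Creation (datetime)] AS [First Customer-reply (datetime)]
INTO #FirstCustomerReply
FROM [dwh].[Facts Replies LVL 1 T] AS R
WHERE R.[Reply ID] IN (
        SELECT MIN([Reply ID])
        FROM [dwh].[Facts Replies LVL 1 T]
        WHERE [By Agent (1/0)] = 0
        GROUP BY [Ticket ID] );

-- 2) Index it with the join key as the leading column.
CREATE CLUSTERED INDEX CIX_FirstCustomerReply
    ON #FirstCustomerReply ([Ticket ID]);

-- 3) Join the small, indexed temp table instead of the derived table.
SELECT T.[Ticket ID]
     , FCR.[First Customer-reply (datetime)]
FROM [dwh].[Facts Tickets LVL 1 T] AS T
LEFT JOIN #FirstCustomerReply AS FCR
       ON T.[Ticket ID] = FCR.[Ticket ID];
```

Because the temp table has its own statistics, the optimizer gets an actual row count for this branch instead of an estimate derived from the nested subquery.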
SELECT [Ticket ID], [Satisfaction, scored] -- 317 0 sec, 317 1 sec
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
WHERE [Ticket Satisfaction ID] IN (
SELECT MAX([Ticket Satisfaction ID])
FROM [dwh].[Facts Ticket Satisfactions LVL 1 V]
GROUP BY [Ticket ID] )
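You don't need to access the table twice; you can use a window function instead. An example for the query above:
;with cte
as
(
select [Ticket ID], [Satisfaction, scored]
     , row_number() over (partition by [Ticket ID] order by [Ticket Satisfaction ID] desc) as rn -- desc: rn = 1 is the row with MAX([Ticket Satisfaction ID])
from
[dwh].[Facts Ticket Satisfactions LVL 1 V]
)
select * from cte where rn=1
You can replace the * in my example query with the columns you need,
and index the underlying table so that the query performs well: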
create index NCI_tcktid_trnsfrmid on table(ticketid,[Satisfaction ID])
include(somecolumns you need)
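Answer 2 (score: 0)
I ended up using a stored procedure that spreads the calculations across temp tables at different levels, with 4-6 joins each, and has the subqueries read from a temp table shared by all calculations. This reduced the query time from a seemingly indefinite run to roughly 10-15 seconds.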
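The poster's procedure is not shown, so the following is only a rough sketch of how such a staged procedure might be laid out, reusing two of the subqueries from the question. The procedure name and all temp table and index names are hypothetical, and the date range assumes [Created (datetime)] is a date/datetime column.

```sql
-- Sketch only: procedure, temp table and index names are hypothetical.
CREATE PROCEDURE [dwh].[Load Facts Tickets LVL 2]
AS
BEGIN
    SET NOCOUNT ON;

    -- Shared temp table: the tickets in scope, filtered once.
    -- A range predicate is used instead of YEAR(...) = 2018 so an index on the column can be used.
    SELECT [Ticket ID]
    INTO #Tickets
    FROM [dwh].[Facts Tickets LVL 1 T]
    WHERE [Created (datetime)] >= '20180101'
      AND [Created (datetime)] <  '20190101';

    CREATE CLUSTERED INDEX CIX_Tickets ON #Tickets ([Ticket ID]);

    -- Stage 1: accumulated response statistics, only for tickets in scope.
    SELECT S.[Ticket ID]
         , SUM(S.[Agent reply-time (seconds)]) / 60 AS [Ticket creator total wait time (minutes)]
         , SUM(S.[Agent wait-time (seconds)]) / 60  AS [Agent total wait time (minutes)]
    INTO #ResponseStats
    FROM [dwh].[Facts Response-statistics LVL 1 T] AS S
    JOIN #Tickets AS K ON K.[Ticket ID] = S.[Ticket ID]
    GROUP BY S.[Ticket ID];

    CREATE CLUSTERED INDEX CIX_ResponseStats ON #ResponseStats ([Ticket ID]);

    -- Stage 2: first and last 'Assigned the ticket to' action in one pass.
    SELECT A.[Ticket ID]
         , MIN(A.[Creation (datetime)]) AS [Initially assigned (datetime)]
         , MAX(A.[Creation (datetime)]) AS [Assigned (datetime)]
    INTO #Assignments
    FROM [dwh].[Facts Actions LVL 1 T] AS A
    JOIN #Tickets AS K ON K.[Ticket ID] = A.[Ticket ID]
    WHERE A.[Action Type ID] = 28
    GROUP BY A.[Ticket ID];

    CREATE CLUSTERED INDEX CIX_Assignments ON #Assignments ([Ticket ID]);

    -- Final level: a small number of joins between indexed temp tables.
    SELECT K.[Ticket ID]
         , RS.[Ticket creator total wait time (minutes)]
         , RS.[Agent total wait time (minutes)]
         , AA.[Initially assigned (datetime)]
         , AA.[Assigned (datetime)]
    FROM #Tickets AS K
    LEFT JOIN #ResponseStats AS RS ON RS.[Ticket ID] = K.[Ticket ID]
    LEFT JOIN #Assignments   AS AA ON AA.[Ticket ID] = K.[Ticket ID];
END;
```

Filtering to the tickets in scope before aggregating keeps each stage small, and each stage can be timed on its own when looking for the remaining bottleneck.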