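I'm looking for a way to optimize a SELECT query that uses a left join between a date dimension table and a fact table and must display the sum of a measure for 2014.
Here is the query: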
select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d
left join FCT_SCAN as f
on d.DATE = CAST(f.DATE_HEURE as DATE)
and CAST(d.HEURE as varchar(4)) = CAST(CAST(f.DATE_HEURE as time) as varchar(4))
where d.ANNEE = 2014
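This query is so slow that I have never seen it return a result. If I add a WHERE clause on the month (for example d.MOIS = 11), it takes 1 minute, which is still rather long.
But if I also add a WHERE clause on the day, the result comes back in 4 seconds: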
select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d
left join FCT_SCAN as f
on d.DATE = CAST(f.DATE_HEURE as DATE)
and CAST(d.HEURE as varchar(4)) = CAST(CAST(f.DATE_HEURE as time) as varchar(4))
where d.ANNEE = 2014
and d.MOIS = 11
and d.JOUR = 5
For reference, here is the CREATE TABLE script for DIM_DATE:
CREATE TABLE [dbo].[DIM_DATE](
[DATE_HEURE] [datetime] NOT NULL,
[ANNEE] [int] NULL,
[MOIS] [int] NULL,
[JOUR] [int] NULL,
[DATE] [date] NULL,
[JOUR_SEM_DATE] [varchar](10) NULL,
[NUM_JOUR_SEM_DATE] [int] NULL,
[HEURE] [time](0) NULL,
[TRANCHE_1H] [time](0) NULL,
[TRANCHE_DEMIH] [time](0) NULL,
[TRANCHE_QUARTH] [time](0) NULL,
[TRANCHE_10M] [time](0) NULL,
CONSTRAINT [PK_DIM_DATE] PRIMARY KEY NONCLUSTERED
(
[DATE_HEURE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The DATE_HEURE field in FCT_SCAN is the same as the one in DIM_DATE.
DIM_DATE contains one record every 10 minutes:
DATE_HEURE
2015-06-17 12:00:00.000
2015-06-17 12:10:00.000
2015-06-17 12:20:00.000
2015-06-17 12:30:00.000
2015-06-17 12:40:00.000
2015-06-17 12:50:00.000
2015-06-17 13:00:00.000
2015-06-17 13:10:00.000
2015-06-17 13:20:00.000
2015-06-17 13:30:00.000
So my question is: how can I optimize this query, given that I have to keep the LEFT JOIN (for a Cognos package)?
Edit: here is the execution plan.
|--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [globalagg1013]=(0) THEN NULL ELSE [globalagg1015] END))
|--Stream Aggregate(DEFINE:([globalagg1013]=SUM([partialagg1012]), [globalagg1015]=SUM([partialagg1014])))
|--Parallelism(Gather Streams)
|--Stream Aggregate(DEFINE:([partialagg1012]=COUNT_BIG([Expr1007]), [partialagg1014]=SUM([Expr1007])))
|--Compute Scalar(DEFINE:([Expr1007]=CASE WHEN [DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[NBSCANS] as [f].[NBSCANS] IS NOT NULL THEN CONVERT_IMPLICIT(int,[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[NBSCANS] as [f].[NBSCANS],0) ELSE (0) END))
|--Nested Loops(Left Outer Join, OUTER REFERENCES:([d].[DATE], [Expr1009]))
|--Compute Scalar(DEFINE:([Expr1009]=CONVERT(varchar(4),[DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[HEURE] as [d].[HEURE],121)))
| |--Table Scan(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE] AS [d]), WHERE:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[ANNEE] as [d].[ANNEE]=(2014)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1003]) OPTIMIZED)
|--Compute Scalar(DEFINE:([Expr1021]=BmkToPage([Bmk1003])))
| |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1019], [Expr1020], [Expr1018]))
| |--Compute Scalar(DEFINE:(([Expr1019],[Expr1020],[Expr1018])=GetRangeThroughConvert([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE],[DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE],(62))))
| | |--Constant Scan
| |--Index Seek(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[IDX_DATE_HEURE] AS [f]), SEEK:([f].[DATE_HEURE] > [Expr1019] AND [f].[DATE_HEURE] < [Expr1020]), WHERE:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE]=CONVERT(date,[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[DATE_HEURE] as [f].[DATE_HEURE],0) AND [Expr1009]=CONVERT(varchar(4),CONVERT(time(7),[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[DATE_HEURE] as [f].[DATE_HEURE],0),121)) ORDERED FORWARD)
|--RID Lookup(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN] AS [f]), SEEK:([Bmk1003]=[Bmk1003]) LOOKUP ORDERED FORWARD)
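Answer 0 (score: 1)
Try this. It should improve performance because it joins on a range of DATE_HEURE instead of on CAST expressions. You can improve it further by creating a PERSISTED computed column in the FCT_SCAN table, which would allow an index to be used.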
SELECT
coalesce(SUM(f.NBSCANS),0) as somme
FROM
DIM_DATE as d
LEFT JOIN
FCT_SCAN as f
on
f.DATE_HEURE>= d.DATE_HEURE
and f.DATE_HEURE < dateadd(minute, 10, d.DATE_HEURE)
WHERE
d.ANNEE = 2014
and d.MOIS = 11
and d.JOUR = 5
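The persisted computed column mentioned above could look roughly like the following. This is only a sketch under assumptions: the column name DATE_HEURE_10M and the index name IX_FCT_SCAN_DATE_HEURE_10M are hypothetical, FCT_SCAN.DATE_HEURE is assumed to be a datetime as in DIM_DATE, and NBSCANS is assumed to be of a type that can be used as an included index column.

```sql
-- Hypothetical column: DATE_HEURE rounded down to its 10-minute slot.
ALTER TABLE dbo.FCT_SCAN
ADD DATE_HEURE_10M AS
    DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) PERSISTED;

-- Hypothetical index on the slot; NBSCANS is included to avoid a lookup.
CREATE NONCLUSTERED INDEX IX_FCT_SCAN_DATE_HEURE_10M
ON dbo.FCT_SCAN (DATE_HEURE_10M)
INCLUDE (NBSCANS);

-- The join then becomes a plain equality on the 10-minute slot.
SELECT
    coalesce(SUM(f.NBSCANS), 0) as somme
FROM
    DIM_DATE as d
LEFT JOIN
    FCT_SCAN as f
on
    f.DATE_HEURE_10M = d.DATE_HEURE
WHERE
    d.ANNEE = 2014
```

Whether this beats the range join depends on the data volume and on the plan the optimizer picks, so it is worth comparing both execution plans.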
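Answer 1 (score: 1)
Thanks to the execution plan, I found the solution. The index on the DATE_HEURE field of FCT_SCAN accounted for most of the query cost, so I dropped it. The execution time is now a few seconds.
Thanks to everyone for the suggestions!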
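For readers who want to reproduce this, a minimal sketch follows. It assumes the index is the one named IDX_DATE_HEURE in the execution plan from the question and that FCT_SCAN lives in the dbo schema; dropping an index can slow down other queries that rely on it, so check what else uses it first.

```sql
-- List the indexes currently defined on the fact table.
SELECT name, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.FCT_SCAN');

-- Drop the index that the plan was seeking on (name taken from the plan).
DROP INDEX IDX_DATE_HEURE ON dbo.FCT_SCAN;
```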
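Answer 2 (score: 0)
Non-equi joins are always bad for performance.
You should change the logic: truncate f.DATE_HEURE down to minute 0/10/20/30/40/50, then join on equality.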
select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d
left join
( select
DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) as x
, ...
from FCT_SCAN
) as f
on d.DATE_HEURE = f.x
where d.ANNEE = 2014
Adding a similar condition on FCT_SCAN.DATE_HEURE to restrict it to the same period (for example 2014) might also help.
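As a sketch of that extra condition, the period filter can go inside the derived table so that the LEFT JOIN still keeps every DIM_DATE row. Selecting NBSCANS in the derived table is an assumption made here so that the example is complete, and the date bounds assume the whole of 2014 is wanted.

```sql
select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d
left join
 ( select
     -- round DATE_HEURE down to its 10-minute slot
     DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) as x
   , NBSCANS
   from FCT_SCAN
   -- restrict the fact rows to 2014 before joining
   where DATE_HEURE >= '20140101'
     and DATE_HEURE <  '20150101'
 ) as f
on d.DATE_HEURE = f.x
where d.ANNEE = 2014
```

Putting the filter in the outer WHERE clause instead would discard the DIM_DATE rows that have no scans, which turns the LEFT JOIN into an inner join in effect.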