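Slow query in SQL Server when using a LEFT JOIN

Time: 2015-02-13 10:17:44

Tags: sql sql-server date optimization left-join

I am looking for a way to optimize a SELECT query that uses a LEFT JOIN between a date dimension table and a fact table. The query must return the sum of a measure for the year 2014.

Here is the query: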

select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d 
left join FCT_SCAN as f
on d.DATE = CAST(f.DATE_HEURE as DATE)
and CAST(d.HEURE as varchar(4)) = CAST(CAST(f.DATE_HEURE as time) as varchar(4))
where d.ANNEE = 2014

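This query is so slow that I have never seen it return a result. If I add a WHERE clause on the month (for example d.MOIS = 11), it takes 1 minute, which is still rather long.

But if I also add a WHERE clause on the day, the result comes back in 4 seconds: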

select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d 
left join FCT_SCAN as f
on d.DATE = CAST(f.DATE_HEURE as DATE)
and CAST(d.HEURE as varchar(4)) = CAST(CAST(f.DATE_HEURE as time) as varchar(4))
where d.ANNEE = 2014
and d.MOIS = 11
and d.JOUR = 5

For reference, here is the CREATE TABLE script for DIM_DATE:

CREATE TABLE [dbo].[DIM_DATE](
    [DATE_HEURE] [datetime] NOT NULL,
    [ANNEE] [int] NULL,
    [MOIS] [int] NULL,
    [JOUR] [int] NULL,
    [DATE] [date] NULL,
    [JOUR_SEM_DATE] [varchar](10) NULL,
    [NUM_JOUR_SEM_DATE] [int] NULL,
    [HEURE] [time](0) NULL,
    [TRANCHE_1H] [time](0) NULL,
    [TRANCHE_DEMIH] [time](0) NULL,
    [TRANCHE_QUARTH] [time](0) NULL,
    [TRANCHE_10M] [time](0) NULL,
 CONSTRAINT [PK_DIM_DATE] PRIMARY KEY NONCLUSTERED 
(
    [DATE_HEURE] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

The DATE_HEURE field in FCT_SCAN is defined the same way as the one in DIM_DATE.

DIM_DATE contains one record every 10 minutes:

DATE_HEURE
2015-06-17 12:00:00.000
2015-06-17 12:10:00.000
2015-06-17 12:20:00.000
2015-06-17 12:30:00.000
2015-06-17 12:40:00.000
2015-06-17 12:50:00.000
2015-06-17 13:00:00.000
2015-06-17 13:10:00.000
2015-06-17 13:20:00.000
2015-06-17 13:30:00.000

So my question is: how can I optimize this query, given that I have to keep the LEFT JOIN (it is for a Cognos package)?

EDIT: here is the execution plan.

|--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [globalagg1013]=(0) THEN NULL ELSE [globalagg1015] END))
   |--Stream Aggregate(DEFINE:([globalagg1013]=SUM([partialagg1012]), [globalagg1015]=SUM([partialagg1014])))
        |--Parallelism(Gather Streams)
             |--Stream Aggregate(DEFINE:([partialagg1012]=COUNT_BIG([Expr1007]), [partialagg1014]=SUM([Expr1007])))
                  |--Compute Scalar(DEFINE:([Expr1007]=CASE WHEN [DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[NBSCANS] as [f].[NBSCANS] IS NOT NULL THEN CONVERT_IMPLICIT(int,[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[NBSCANS] as [f].[NBSCANS],0) ELSE (0) END))
                       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([d].[DATE], [Expr1009]))
                            |--Compute Scalar(DEFINE:([Expr1009]=CONVERT(varchar(4),[DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[HEURE] as [d].[HEURE],121)))
                            |    |--Table Scan(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE] AS [d]), WHERE:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[ANNEE] as [d].[ANNEE]=(2014)))
                            |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1003]) OPTIMIZED)
                                 |--Compute Scalar(DEFINE:([Expr1021]=BmkToPage([Bmk1003])))
                                 |    |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1019], [Expr1020], [Expr1018]))
                                 |         |--Compute Scalar(DEFINE:(([Expr1019],[Expr1020],[Expr1018])=GetRangeThroughConvert([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE],[DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE],(62))))
                                 |         |    |--Constant Scan
                                 |         |--Index Seek(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[IDX_DATE_HEURE] AS [f]), SEEK:([f].[DATE_HEURE] > [Expr1019] AND [f].[DATE_HEURE] < [Expr1020]),  WHERE:([DECIS_DM_PARCOURS_PAX].[dbo].[DIM_DATE].[DATE] as [d].[DATE]=CONVERT(date,[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[DATE_HEURE] as [f].[DATE_HEURE],0) AND [Expr1009]=CONVERT(varchar(4),CONVERT(time(7),[DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN].[DATE_HEURE] as [f].[DATE_HEURE],0),121)) ORDERED FORWARD)
                                 |--RID Lookup(OBJECT:([DECIS_DM_PARCOURS_PAX].[dbo].[FCT_SCAN] AS [f]), SEEK:([Bmk1003]=[Bmk1003]) LOOKUP ORDERED FORWARD)

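3 answers:

Answer 0 (score: 1)

Try this. It should improve performance. You can improve it further by creating a PERSISTED computed column in table FCT_SCAN, which would allow an index to be used.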

SELECT 
  coalesce(SUM(f.NBSCANS),0) as somme
FROM
  DIM_DATE as d 
LEFT JOIN
  FCT_SCAN as f
on 
  f.DATE_HEURE>= d.DATE_HEURE
  and f.DATE_HEURE < dateadd(minute, 10, d.DATE_HEURE)
WHERE
  d.ANNEE = 2014
  and d.MOIS = 11
  and d.JOUR = 5
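
The computed-column idea mentioned above could look roughly like the sketch below. The column name TRANCHE_10M_SCAN and the index name IX_FCT_SCAN_TRANCHE_10M are hypothetical, and the sketch assumes FCT_SCAN.DATE_HEURE is a datetime column and that FCT_SCAN has no column of that name yet. The GO lines are batch separators for SSMS or sqlcmd, not T-SQL statements.

```sql
-- Hypothetical names: TRANCHE_10M_SCAN, IX_FCT_SCAN_TRANCHE_10M.
-- Store the scan time rounded down to its 10-minute slot.
ALTER TABLE dbo.FCT_SCAN
ADD TRANCHE_10M_SCAN AS
    DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) PERSISTED;
GO

-- Index the slot and carry the measure so the join needs no RID lookup.
CREATE NONCLUSTERED INDEX IX_FCT_SCAN_TRANCHE_10M
ON dbo.FCT_SCAN (TRANCHE_10M_SCAN)
INCLUDE (NBSCANS);
GO

-- The join becomes a plain equality on two datetime values.
SELECT COALESCE(SUM(f.NBSCANS), 0) AS somme
FROM dbo.DIM_DATE AS d
LEFT JOIN dbo.FCT_SCAN AS f
  ON f.TRANCHE_10M_SCAN = d.DATE_HEURE
WHERE d.ANNEE = 2014;
```

With an equality join the optimizer is free to pick a hash or merge join instead of running one range seek per DIM_DATE row. Whether it actually does so depends on the data and statistics, so check the resulting plan.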

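Answer 1 (score: 1)

Thanks to the execution plan, I found the solution. The index on the DATE_HEURE field of FCT_SCAN was costing a lot in this query, so I dropped it. The execution time is now a few seconds.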
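
For anyone who wants to try the same thing without losing the index definition, disabling the index is a reversible alternative to dropping it. This is only a sketch: the index name IDX_DATE_HEURE is taken from the execution plan above, and the dbo schema is assumed.

```sql
-- Disable the index: its definition is kept, but the optimizer can no longer use it.
ALTER INDEX IDX_DATE_HEURE ON dbo.FCT_SCAN DISABLE;

-- ... run the query and compare timings ...

-- Re-enable the index later by rebuilding it.
ALTER INDEX IDX_DATE_HEURE ON dbo.FCT_SCAN REBUILD;
```

Other queries that rely on this index will slow down while it is disabled, so it is worth testing on a copy of the database first.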

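Thank you all for your advice!

Answer 2 (score: 0)

Non-equi-joins are always bad.

You should change the logic: truncate f.DATE_HEURE down to the 0/10/20/30/40/50-minute mark, then join.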

select SUM(coalesce(f.NBSCANS,0)) as somme
from DIM_DATE as d 
left join 
  ( select
       DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) as x
      , ...
    from FCT_SCAN
  ) as f
on d.DATE_HEURE = f.x  -- x is a datetime on a 10-minute mark, so compare it with d.DATE_HEURE rather than d.DATE
where d.ANNEE = 2014
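
To see what the truncation expression does, here is a small stand-alone check on literal values (the sample timestamps are made up for illustration). DATEDIFF counts whole minutes since the base date 0 (1900-01-01), the integer division by 10 drops the remainder, and DATEADD turns the result back into a datetime.

```sql
SELECT v.dt,
       DATEADD(minute, DATEDIFF(minute, 0, v.dt) / 10 * 10, 0) AS slot_10m
FROM (VALUES
        (CAST('2015-06-17T12:07:31' AS datetime)),
        (CAST('2015-06-17T12:10:00' AS datetime)),
        (CAST('2015-06-17T12:59:59' AS datetime))
     ) AS v(dt);
-- slot_10m: 2015-06-17 12:00, 2015-06-17 12:10, 2015-06-17 12:50
```

This is the same 10-minute bucketing that the original query gets by comparing the date and the first four characters of the time ('12:0', '12:1', ...), but it produces a single datetime value that can be compared directly.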

Adding a similar condition on FCT_SCAN.DATE_HEURE to restrict it to the same period (e.g. 2014) may also help.
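
Put together, a version with that extra range filter might look like the sketch below. It assumes NBSCANS is the only fact column the outer query needs and that the 10-minute slots are stored in DIM_DATE.DATE_HEURE, as the sample rows in the question suggest.

```sql
SELECT SUM(COALESCE(f.NBSCANS, 0)) AS somme
FROM dbo.DIM_DATE AS d
LEFT JOIN
  ( SELECT
        -- scan time rounded down to its 10-minute slot
        DATEADD(minute, DATEDIFF(minute, 0, DATE_HEURE) / 10 * 10, 0) AS x,
        NBSCANS
    FROM dbo.FCT_SCAN
    -- half-open range on the raw column, so an index on DATE_HEURE can still be used
    WHERE DATE_HEURE >= '20140101'
      AND DATE_HEURE <  '20150101'
  ) AS f
  ON d.DATE_HEURE = f.x
WHERE d.ANNEE = 2014;
```

The range filter is written against the untransformed DATE_HEURE column on purpose: wrapping the column in a function there would prevent an index seek.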