Performance problem with a query using JOIN and RANK()

Time: 2014-10-07 15:58:34

Tags: sql sql-server sql-server-2012

I wrote the following SQL statement:

SELECT *,  COALESCE (def.route_step, 'Keine Fehlerinformation') as 'Ausfallort'
FROM QS_WIP_Errors err
LEFT JOIN (
    SELECT * FROM
    (
        SELECT DISTINCT
               inspect_time, repair_time, serial_number, station, route_step,
               rank() over (partition by def.serial_number order by inspect_time desc) as [Rang]
        FROM dbo.View_QS_DEFECTS_Stammdaten def
        WHERE route_step NOT LIKE 'Analyse'
    ) AS def WHERE rang=1) as def
ON err.SERIAL_NUMBER = def.serial_number
WHERE err.state = 2
  AND err.ENDTIME >= '2014-10-06 06:00:00.000' 
  AND err.ENDTIME <= '2014-10-07 06:00:00.000'

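What I intend to do is:

  • From QS_WIP_Errors: get all records with state = 2 that fall within the specified time period;
  • Join these results with the corresponding records in the view dbo.View_QS_DEFECTS_Stammdaten via the attribute serial_number;
  • COALESCE: if the JOIN finds no match, show 'Keine Fehlerinformation';
  • From dbo.View_QS_DEFECTS_Stammdaten: get the most recent record (rang = 1) for each def.serial_number, excluding records whose route_step is 'Analyse'.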

The query above does everything it was designed to do, but it does not finish in an acceptable time (I aborted it after 30 minutes).

Strangely, when I leave out the time restriction (i.e. the lines AND err.ENDTIME >= '2014-10-06 06:00:00.000' AND err.ENDTIME <= '2014-10-07 06:00:00.000'), the query executes within a few seconds (i.e. as required).

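What I have tried so far to improve performance:

  • replacing View_QS_DEFECTS_Stammdaten with the underlying tables;
  • selecting only a few individual columns instead of all columns (*) -> no improvement at all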

Can anyone give me a hint on how to improve the performance?

Thanks a lot! I am using SQL Server 2012.

2 answers:

Answer 0: (score: 3)

This is your query:

select *,  COALESCE(def.route_step, 'Keine Fehlerinformation') as Ausfallort
from QS_WIP_Errors err left join
     (select *
      from (select distinct inspect_time, repair_time, serial_number, station, route_step,
                   rank() over (partition by def.serial_number order by inspect_time desc) as [Rang]
            from dbo.View_QS_DEFECTS_Stammdaten def
            where route_step not like 'Analyse'
           ) as def
      where rang = 1
     ) as def
     on err.SERIAL_NUMBER = def.serial_number
where err.state = 2 AND
      err.ENDTIME >= '2014-10-06 06:00:00.000' AND err.ENDTIME <= '2014-10-07 06:00:00.000';

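Some observations:

  • Performance is probably driven by the underlying view, so there may be nothing that can be done at this level.
  • The distinct seems unnecessary. If you only want one row, you should use row_number() to be more specific.
  • An index on err would help.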
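
To illustrate the second observation, here is a minimal sketch of how rank() and row_number() differ when two rows tie on the ordering column. The sample rows are made up for illustration and are not taken from the real view.

```sql
-- Hypothetical sample data: two defect rows for serial 'A' share the same inspect_time.
SELECT serial_number,
       inspect_time,
       route_step,
       RANK()       OVER (PARTITION BY serial_number ORDER BY inspect_time DESC) AS rnk,
       ROW_NUMBER() OVER (PARTITION BY serial_number ORDER BY inspect_time DESC) AS rn
FROM (VALUES
        ('A', '2014-10-06 08:00:00', 'Test'),
        ('A', '2014-10-06 08:00:00', 'Montage'),
        ('A', '2014-10-05 08:00:00', 'Test')
     ) AS d (serial_number, inspect_time, route_step);
```

Both tied rows get rnk = 1, so a filter on rank = 1 can return more than one row per serial number and multiply rows in the join. Only one of them gets rn = 1; which one is arbitrary unless a tie-breaker column is added to the ORDER BY.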

Create an index on QS_WIP_Errors(state, endtime, serial_number) and write the query as:

select *,  COALESCE(def.route_step, 'Keine Fehlerinformation') as Ausfallort
from QS_WIP_Errors err left join
     (select inspect_time, repair_time, serial_number, station, route_step,
             row_number() over (partition by def.serial_number order by inspect_time desc) as [Rang]
      from dbo.View_QS_DEFECTS_Stammdaten def
      where route_step not like 'Analyse'
     ) as def
     on err.SERIAL_NUMBER = def.serial_number and rang = 1
where err.state = 2 AND
      err.ENDTIME >= '2014-10-06 06:00:00.000' AND err.ENDTIME <= '2014-10-07 06:00:00.000';
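
The index mentioned above could be created with a statement along the following lines. This is only a sketch: the index name is hypothetical, and the table is left unqualified because its schema is not given in the question.

```sql
-- Hypothetical index name; key columns follow the order suggested above:
-- equality column first (state), then the range column (ENDTIME), then the join column.
CREATE NONCLUSTERED INDEX IX_QS_WIP_Errors_state_endtime_serial
    ON QS_WIP_Errors (state, ENDTIME, SERIAL_NUMBER);
```

With state leading, the optimizer can seek to state = 2 and then scan only the requested ENDTIME range. Because the query selects all columns, it may still need lookups into the base table for the remaining columns.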

You can also try writing it as an outer apply:

select *, COALESCE(def.route_step, 'Keine Fehlerinformation') as Ausfallort
from QS_WIP_Errors err outer apply
     (select top 1 inspect_time, repair_time, serial_number, station, route_step
      from dbo.View_QS_DEFECTS_Stammdaten def
      where route_step not like 'Analyse' and err.SERIAL_NUMBER = def.serial_number
      order by inspect_time desc
     ) def
where err.state = 2 AND
      err.ENDTIME >= '2014-10-06 06:00:00.000' AND err.ENDTIME <= '2014-10-07 06:00:00.000';

Sometimes the apply approach optimizes better.

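Answer 1: (score: 1)

Thanks for your reply, @Gordon. First, regarding your hint about "distinct" and "row number": you were actually right, my query above did not give me the results I wanted. I modified it this way: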

Select * from (
  select distinct err.SERIAL_NUMBER as 'Err_SERIAL_NUMBER', 
  err.ROUTE_STEP as 'Err_ROUTE_STEP', err.ENDTIME, 
  rank() over (partition by err.serial_Number order by err.endtime asc) as [Rank_err],
  def.*, COALESCE (def.route_step, 'Keine Fehlerinformation') as 'Ausfallort' 
  from QS_WIP_Errors err LEFT JOIN (
  select * from
  (
  select distinct inspect_time as 'DefectsInspectTime', serial_number, station, route_step,  
    rank() over (partition by def.serial_number order by def.inspect_time desc) as [Rank_Def]
    from dbo.View_QS_DEFECTS_Stammdaten def where route_step not like 'Analyse'
  ) as def where Rank_Def=1) as def
  on err.SERIAL_NUMBER = def.serial_number
  where err.state = 2 ) as tblJoin
where tblJoin.Rank_err = 1
AND tblJoin.ENDTIME >= '2014-10-07 06:00:00.000' AND tblJoin.ENDTIME <= '2014-10-08 06:00:00.000'

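Now I really get the values I want. And as a nice side effect, the query now executes within a few seconds. I cannot explain it, but it solved the problem. That is why I marked it as the answer.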