加速SQL查询

时间:2011-10-07 19:49:01

标签: sql sql-server tsql sql-server-2008

我有一个查询,它需要花费一些时间来执行比过去更早的任何事情,比如数小时的数据。这将创建一个将用于数据挖掘的视图,因此期望它能够搜索数周或数月的数据并在合理的时间内返回(甚至几分钟就好了......我跑了10/3/2011 12:00pm10/3/2011 1:00pm的日期范围,花了44分钟!)

问题在于底部的两个LEFT OUTER JOIN。当我拿出它们时,它可以在大约10秒内运行。然而,这些是这个查询的面包和黄油。

这一切都来自一张桌子。此查询返回的唯一方式与原始表不同的是列xweb_rangexweb_range是计算字段列(范围),仅使用[LO,LC,RO,RC]_Avg中相应[LO,LC,RO,RC]_Sensor_Alarm = 0的值(如果传感器警报= 1,则不包括在范围计算中)

WITH Alarm (sub_id, 
LO_Avg, LO_Sensor_Alarm, LC_Avg, LC_Sensor_Alarm, RO_Avg, RO_Sensor_Alarm, RC_Avg, RC_Sensor_Alarm) AS (
SELECT sub_id, LO_Avg, LO_Sensor_Alarm, LC_Avg, LC_Sensor_Alarm, RO_Avg, RO_Sensor_Alarm, RC_Avg, RC_Sensor_Alarm 
FROM dbo.some_table
where sub_id <> '0'
)
, AddRowNumbers AS (
SELECT  rowNumber = ROW_NUMBER() OVER (ORDER BY LO_Avg)
    , sub_id
    , LO_Avg, LO_Sensor_Alarm
    , LC_Avg, LC_Sensor_Alarm
    , RO_Avg, RO_Sensor_Alarm
    , RC_Avg, RC_Sensor_Alarm
FROM Alarm
)
, UnPivotColumns AS (
SELECT rowNumber, value = LO_Avg FROM AddRowNumbers WHERE LO_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, LC_Avg FROM AddRowNumbers WHERE LC_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, RO_Avg FROM AddRowNumbers WHERE RO_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, RC_Avg FROM AddRowNumbers WHERE RC_Sensor_Alarm = 0
)
SELECT rowNumber.sub_id
   , cds.equipment_id
   , cds.read_time
   , cds.LC_Avg
   , cds.LC_Dev
   , cds.LC_Ref_Gap
   , cds.LC_Sensor_Alarm
   , cds.LO_Avg
   , cds.LO_Dev
   , cds.LO_Ref_Gap
   , cds.LO_Sensor_Alarm
   , cds.RC_Avg
   , cds.RC_Dev
   , cds.RC_Ref_Gap
   , cds.RC_Sensor_Alarm
   , cds.RO_Avg
   , cds.RO_Dev
   , cds.RO_Ref_Gap
   , cds.RO_Sensor_Alarm
   , COALESCE(range1.range, range2.range) AS xweb_range
FROM   AddRowNumbers rowNumber
   LEFT OUTER JOIN (SELECT rowNumber, range = MAX(value) - MIN(value) FROM UnPivotColumns GROUP BY rowNumber HAVING COUNT(*) > 1) range1 ON range1.rowNumber = rowNumber.rowNumber
   LEFT OUTER JOIN (SELECT rowNumber, range = AVG(value) FROM UnPivotColumns     GROUP BY rowNumber HAVING COUNT(*) = 1) range2 ON range2.rowNumber = rowNumber.rowNumber
   INNER JOIN dbo.some_table cds
   ON rowNumber.sub_id = cds.sub_id

1 个答案:

答案 0 :(得分:2)

如果不了解域名,很难准确理解您的查询尝试做什么。但是,在我看来,您的查询只是试图找到dbo.some_tablesub_id不为0的每一行,记录中以下列的范围(或者,如果只有一个匹配) ,那个单一的价值):

    当LO_SENSOR_ALARM = 0 时,
  • LO_AVG 当LC_SENSOR_ALARM = 0
  • 时,
  • LC_AVG 当RO_SENSOR_ALARM = 0
  • 时,
  • RO_AVG 当RC_SENSOR_ALARM = 0
  • 时,
  • RC_AVG

您构造了此查询,为每一行分配一个连续的行号,忽略_AVG列及其行号,按行号计算范围聚合分组,然后按行号连接回原始记录。 CTE没有实现结果(也没有将其编入索引,如评论中所述)。因此,对AddRowNumbers的每次引用都很昂贵,因为ROW_NUMBER() OVER (ORDER BY LO_Avg)是一种排序。

为了不按行号将它连接起来而不是切割这张表,为什么不这样做:

SELECT cds.sub_id
   , cds.equipment_id
   , cds.read_time
   , cds.LC_Avg
   , cds.LC_Dev
   , cds.LC_Ref_Gap
   , cds.LC_Sensor_Alarm
   , cds.LO_Avg
   , cds.LO_Dev
   , cds.LO_Ref_Gap
   , cds.LO_Sensor_Alarm
   , cds.RC_Avg
   , cds.RC_Dev
   , cds.RC_Ref_Gap
   , cds.RC_Sensor_Alarm
   , cds.RO_Avg
   , cds.RO_Dev
   , cds.RO_Ref_Gap
   , cds.RO_Sensor_Alarm

   --if the COUNT is 0, xweb_range will be null (since MAX will be null), if it's 1, then use MAX, else use MAX - MIN (as per your example)
   , (CASE WHEN stats.[Count] < 2 THEN stats.[MAX] ELSE stats.[MAX] - stats.[MIN] END) xweb_range

FROM dbo.some_table cds

    --cross join on the following table derived from values in cds - it will always contain 1 record per row of cds
    CROSS APPLY
    (
        SELECT COUNT(*), MIN(Value), MAX(Value)
        FROM
        (
            --construct a table using the column values from cds we wish to aggregate
            VALUES (LO_AVG, LO_SENSOR_ALARM),
                   (LC_AVG, LC_SENSOR_ALARM),
                   (RO_AVG, RO_SENSORALARM),
                   (RC_AVG, RC_SENSOR_ALARM)


        ) x (Value, Sensor_Alarm) --give a name to the columns for _AVG and _ALARM
        WHERE Sensor_Alarm = 0 --filter our constructed table where _ALARM=0

    ) stats([Count], [Min], [Max]) --give our derived table and its columns some names

WHERE cds.sub_id <> '0' --this is a filter carried over from the first CTE in your example