Improving a SQL query using a function

Asked: 2011-09-25 12:13:13

Tags: sql tsql sql-server-2008

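I am trying to improve query readability by using a function in SQL Server Express 2008. Here is an example of what I am trying to do.

I have a table where we store the high temperature reading for each hour of the day, and I want to select all days where the high temperature between 8-10AM is greater than the high temperature between 12-2PM.

So this is what it looks like: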

DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2)); 

INSERT @TableA VALUES 
('2011-09-10','08:00:00',38.15), 
('2011-09-10','09:00:00',38.32), 
('2011-09-10','10:00:00',38.17), 
('2011-09-10','11:00:00',38.10), 
('2011-09-10','12:00:00',38.05), 
('2011-09-10','13:00:00',38.15), 
('2011-09-10','14:00:00',38.30), 

('2011-09-11','08:00:00',38.12), 
('2011-09-11','09:00:00',38.09), 
('2011-09-11','10:00:00',38.07), 
('2011-09-11','11:00:00',38.15), 
('2011-09-11','12:00:00',38.13), 
('2011-09-11','13:00:00',38.11), 
('2011-09-11','14:00:00',38.10), 

('2011-09-12','08:00:00',38.30), 
('2011-09-12','09:00:00',38.33), 
('2011-09-12','10:00:00',38.35), 
('2011-09-12','11:00:00',38.30), 
('2011-09-12','12:00:00',38.25), 
('2011-09-12','13:00:00',38.23), 
('2011-09-12','14:00:00',38.20)

select distinct [DATE] from @TableA maintbl
where 
-- Select the high temp between 08:00:00-10:00:00
(select MAX(HighTemp) from @TableA tmptbl where tmptbl.Time >= '08:00:00' and tmptbl.Time <= '10:00:00' and maintbl.Date = tmptbl.Date)
>
-- Select the high between 12:00:00-14:00:00
(select MAX(HighTemp) from @TableA tmptbl where tmptbl.Time >= '12:00:00' and tmptbl.Time <= '14:00:00' and maintbl.Date = tmptbl.Date)

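The query runs fine (fast), and its result should be: 2011-09-10 2011-09-12

Now I am trying to simplify the query with a function that retrieves the high temperature for a given date and time period, so the query becomes easier to read, like this: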

select distinct [DATE] from @TableA maintbl
where dbo.GetPeriodHigh(maintbl.Date, '08:00:00', '10:00:00') > dbo.GetPeriodHigh(maintbl.Date, '12:00:00', '14:00:00')

The function is as follows:

CREATE FUNCTION [dbo].[GetPeriodHigh] 
(
    @Date date,
    @From time,
    @To time
)
RETURNS decimal(6,2)
AS
BEGIN

    declare @res decimal(6,2)

    select @res = MAX(high) from MyTable
    where Time >= @from and Time <= @to and Date = @Date

    return @res
END

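The problem is that running the query with the function takes a very LONG time. In fact, I have never seen it finish; it looks like it is stuck in some kind of infinite loop...

Any idea why that is, and what I can do to simplify my query?

Thanks.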

2 answers:

Answer 0: (score: 5)

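Scalar functions that perform data access are generally bad and best avoided. They are not expanded by the optimizer, which essentially forces the function's query to run as the inner side of a nested loops join, whether or not that is appropriate.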

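Worse, you probably do not have the right index to evaluate the Time >= @from and Time <= @to and Date = @Date predicate inside the function, which means that for every row in the outer query you are forcing 2 table scans through the function calls.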
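As a rough sketch of the kind of index that predicate could use: the question does not show the real definition of MyTable, so the table name, the column names (Date, Time, high, taken from the function body) and the index name IX_MyTable_Date_Time are all assumptions here, not the asker's actual schema.

```sql
-- Hypothetical index; adjust table and column names to the real schema.
-- [Date] is an equality predicate, so it leads; [Time] is a range predicate.
-- INCLUDE (high) lets MAX(high) be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_MyTable_Date_Time
    ON dbo.MyTable ([Date], [Time])
    INCLUDE (high);
```

With an index along these lines, each function call could seek to one day's rows instead of scanning the table, although the per-row call overhead of the scalar UDF would remain.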

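The example in the question is also missing a useful index. With the inline version, however, you can see that the query optimizer is able to effectively rewrite it as two separate MAX / GROUP BY queries and then join the results together using the comparison in the WHERE clause. This kind of transformation is currently not considered when the logic is inside a scalar UDF.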

[Image: execution plan]
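One option that is not part of the original answer, offered only as a sketch: an inline table-valued function keeps the readable call style, and unlike a scalar UDF its body is expanded into the calling query, so the optimizer can treat it like the inline subqueries. The function name GetPeriodHighInline, the output column PeriodHigh, and the assumption that dbo.MyTable has Date, Time and high columns are all hypothetical.

```sql
-- Hypothetical inline TVF; names are illustrative, not from the question.
CREATE FUNCTION dbo.GetPeriodHighInline
(
    @Date date,
    @From time(0),
    @To   time(0)
)
RETURNS TABLE
AS
RETURN
(
    -- An aggregate without GROUP BY always returns exactly one row
    -- (NULL when no readings fall in the period).
    SELECT MAX(high) AS PeriodHigh
    FROM dbo.MyTable
    WHERE [Date] = @Date
      AND [Time] >= @From
      AND [Time] <= @To
);
GO

-- Usage: each CROSS APPLY contributes one row per outer row.
SELECT DISTINCT m.[Date]
FROM dbo.MyTable AS m
CROSS APPLY dbo.GetPeriodHighInline(m.[Date], '08:00:00', '10:00:00') AS am
CROSS APPLY dbo.GetPeriodHighInline(m.[Date], '12:00:00', '14:00:00') AS pm
WHERE am.PeriodHigh > pm.PeriodHigh;
```

Whether this performs as well as the hand-written query depends on the actual data and indexes, so it is worth comparing the execution plans before relying on it.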

Another approach you can try is

SELECT [Date]
FROM @TableA
WHERE Time BETWEEN '08:00:00' AND '10:00:00' 
      OR Time BETWEEN '12:00:00' AND '14:00:00'
GROUP BY [Date]
HAVING MAX(CASE 
               WHEN Time BETWEEN '08:00:00' AND '10:00:00' THEN HighTemp END) > 
       MAX(CASE 
               WHEN Time BETWEEN '12:00:00' AND '14:00:00' THEN HighTemp END)

Answer 1: (score: 2)