SQL Azure query aggregation performance issue

Time: 2016-08-12 11:00:22

Tags: sql sql-server performance tsql azure

I am trying to improve the performance of our SQL Azure database by moving away from CURSORs, which (as everyone tells me) are something to avoid.

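Our table holds GPS information: each row has a clustered index on id, there is a secondary index on device and timestamp, and a geography index on location.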

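I am trying to compute some statistics for a specific device, such as minimum speed (both Doppler and computed), total distance, average speed, etc.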

I have no say in which statistics are computed, and I cannot change the table or the output because they are in production.

I am seeing a clear performance problem when running this inline table-valued function on my SQL Azure DB.

ALTER FUNCTION [dbo].[fn_logMetrics_3] 
(   
    @p_device smallint, 
    @p_from dateTime,
    @p_to   dateTime,
    @p_moveThresold int = 1
)
RETURNS TABLE 
AS
    RETURN 
    (
         WITH CTE AS
         (
            SELECT  
                ROW_NUMBER() OVER(ORDER BY timestamp) AS RowNum,
                Timestamp,
                Location,
                Alt,
                Speed
            FROM 
                LogEvents
            WHERE 
                Device = @p_device 
                AND Timestamp >= @p_from 
                AND Timestamp <= @p_to),
        CTE1 AS
        (
            SELECT 
                t1.Speed as Speed,
                t1.Alt as Alt,
                t2.Alt - t1.Alt as DeltaElevation,
                t1.Timestamp as Time0,
                t2.Timestamp as Time1,
                DATEDIFF(second, t2.Timestamp, t1.Timestamp) as Duration,
                t1.Location.STDistance(t2.Location) as Distance
            FROM 
                CTE t1
            INNER JOIN 
                CTE t2 ON t1.RowNum = t2.RowNum + 1),
        CTE2 AS
        (
            SELECT 
                Speed, Alt,
                DeltaElevation,
                Time0, Time1,
                Duration,
                Distance,
                CASE 
                   WHEN Duration <> 0
                      THEN (Distance / Duration) * 3.6 
                      ELSE NULL 
                END AS CSpeed,
                CASE 
                   WHEN DeltaElevation > 0 
                      THEN DeltaElevation 
                      ELSE NULL 
                END As PositiveAscent,
                CASE
                   WHEN DeltaElevation < 0 
                      THEN DeltaElevation 
                      ELSE NULL 
                END As NegativeAscent,
                CASE 
                   WHEN Distance < @p_moveThresold 
                      THEN Duration 
                      ELSE NULL 
                END As StopTime,
                CASE 
                   WHEN Distance > @p_moveThresold 
                      THEN Duration 
                      ELSE NULL 
                END As MoveTime
            FROM 
                CTE1 t1
    )
    SELECT
        COUNT(*) as Count,
        MIN(Speed) as HSpeedMin, MAX(Speed) as HSpeedMax, 
        AVG(Speed) as HSpeedAverage,
        MIN(CSpeed) as CHSpeedMin, MAX(CSpeed) as CHSpeedMax, 
        AVG(CSpeed) as CHSpeedAverage,
        SUM(Distance) as CumulativeDistance, 
        MIN(Alt) as AltMin, MAX(Alt) as AltMax,
        SUM(PositiveAscent) as PositiveAscent,
        SUM(NegativeAscent) as NegativeAscent,
        SUM(StopTime) as StopTime,
        SUM(MoveTime) as MoveTime
    FROM 
        CTE2 t1
)

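The general idea is:

  • CTE selects the rows that match the parameters
  • CTE1 pairs each row with the one before it to get the duration and distance between two consecutive rows
  • CTE2 then performs calculations on those distances and durations
  • Finally, the last SELECT aggregates each column, e.g. with sums and averages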
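
For comparison, the row-pairing step of CTE1 could also be written with the LAG window function instead of a self-join on ROW_NUMBER. This is only a sketch, assuming SQL Server 2012 or later / Azure SQL Database and that LAG accepts the geography column; the DECLARE values are placeholders for illustration, and I have not measured whether it is faster on this table.

```sql
-- Sketch only: pairs each row with its predecessor via LAG instead of a self-join.
-- The parameter values below are placeholders, not real data.
DECLARE @p_device smallint = 1;
DECLARE @p_from datetime = '2016-08-01';
DECLARE @p_to datetime = '2016-08-02';

WITH Ordered AS
(
    SELECT
        Timestamp, Location, Alt, Speed,
        LAG(Timestamp) OVER (ORDER BY Timestamp) AS PrevTimestamp,
        LAG(Location)  OVER (ORDER BY Timestamp) AS PrevLocation,
        LAG(Alt)       OVER (ORDER BY Timestamp) AS PrevAlt
    FROM LogEvents
    WHERE Device = @p_device
      AND Timestamp >= @p_from
      AND Timestamp <= @p_to
)
SELECT
    Speed,
    Alt,
    PrevAlt - Alt AS DeltaElevation,  -- same sign convention as CTE1 (t2.Alt - t1.Alt)
    DATEDIFF(second, PrevTimestamp, Timestamp) AS Duration,
    Location.STDistance(PrevLocation) AS Distance
FROM Ordered
WHERE PrevTimestamp IS NOT NULL;      -- drops the first row, as the INNER JOIN does
```

The base table is read once here, whereas the self-join references CTE twice; whether the optimizer actually produces a cheaper plan would have to be checked against the real data.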

Everything runs fine until the final SELECT, where the aggregate functions (just a few sums and averages) kill the performance.

This query selects 1,500 rows from a table of 4M rows and takes 1,500 ms.

When I replace the final SELECT with
SELECT COUNT(*) as count FROM CTE2 t1

it takes only a few milliseconds (down to 2 ms according to the SQL Studio statistics).

   SELECT
   COUNT(*) as Count,
   SUM(MoveTime) as MoveTime

That takes about 125 ms.

   SELECT
   COUNT(*) as Count,
   SUM(StopTime) as StopTime,
   SUM(MoveTime) as MoveTime

That takes about 250 ms.

It is as if each aggregate ran its own sequential loop over all the rows, on the same thread and without any parallelization.
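
For reference, timings like the ones above can be collected per statement with SET STATISTICS TIME and SET STATISTICS IO. A minimal sketch, assuming the function is called with placeholder argument values (device 1 and the two dates are made up for illustration):

```sql
-- Report CPU/elapsed time and logical reads for each statement in the session.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Placeholder arguments; DEFAULT uses the declared default for @p_moveThresold.
SELECT *
FROM dbo.fn_logMetrics_3(1, '2016-08-01', '2016-08-02', DEFAULT);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the CPU time and logical reads of the COUNT(*)-only variant with those of the full aggregate should show whether the extra cost comes from additional reads or from recomputing expressions such as STDistance.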

For information, the CURSOR version of this function (which I wrote a couple of years ago) actually runs at least twice as fast...

What is wrong with this aggregation? How can I optimize it?

Update:

The query plan for SELECT COUNT(*) as Count

The query plan for the full SELECT with aggregates

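Following Joe C's answer, I introduced a #tmp table into the plan and ran the aggregates on it. The result is roughly twice as fast, which is an interesting fact.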
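
The temp-table variant might look roughly like the sketch below. It is not the exact code I ran: the table name #pairs and the DECLARE values are placeholders, only a few of the aggregates are shown, and since an inline table-valued function cannot create temp tables this would have to live in a stored procedure or a plain batch.

```sql
-- Sketch only: materialize the per-pair values once, then aggregate over them.
DECLARE @p_device smallint = 1;          -- placeholder values
DECLARE @p_from datetime = '2016-08-01';
DECLARE @p_to datetime = '2016-08-02';
DECLARE @p_moveThresold int = 1;

WITH CTE AS
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY Timestamp) AS RowNum,
        Timestamp, Location, Alt, Speed
    FROM LogEvents
    WHERE Device = @p_device
      AND Timestamp >= @p_from
      AND Timestamp <= @p_to
)
SELECT
    t1.Speed AS Speed,
    t1.Alt AS Alt,
    t2.Alt - t1.Alt AS DeltaElevation,
    DATEDIFF(second, t2.Timestamp, t1.Timestamp) AS Duration,
    t1.Location.STDistance(t2.Location) AS Distance
INTO #pairs                              -- duration and distance are computed once here
FROM CTE t1
INNER JOIN CTE t2 ON t1.RowNum = t2.RowNum + 1;

-- The aggregates now read stored values instead of re-deriving them.
SELECT
    COUNT(*) AS Count,
    SUM(Distance) AS CumulativeDistance,
    SUM(CASE WHEN Distance < @p_moveThresold THEN Duration END) AS StopTime,
    SUM(CASE WHEN Distance > @p_moveThresold THEN Duration END) AS MoveTime
FROM #pairs;

DROP TABLE #pairs;
```

One plausible explanation for the speed-up, which I have not confirmed from the plans, is that the join and the STDistance calls are evaluated once during the SELECT ... INTO rather than being repeated for each aggregate.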
