SQL Index Spool (Eager Spool): speeding up a query

Time: 2013-10-07 16:54:04

Tags: sql sql-server performance sql-server-2008 stored-procedures

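I have a stored procedure that I wrote a while back to help generate an XML file used to share data with outside resources. Basically, the end user dumps data into a table called DataSharing, and when we execute the query it returns an XML document containing the required fields as specified in DataSharing. The procedure works, but it is extremely slow. When I run it through SSMS with "Show Actual Execution Plan" enabled, 94% of the query cost is spent on an Index Spool (Eager Spool). From what I have read, it looks like I should rewrite the query so that it performs better.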

Because I never know in advance what the data columns will be, I have to use a dynamic pivot to generate my data.

Here is the procedure:

CREATE PROCEDURE [dbo].[sp_HPSDDataSharing]
    -- Add the parameters for the stored procedure here
    @fileName varchar(MAX), @StartDate datetime, @EndDate datetime
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    DECLARE @sqlCommand varchar(MAX), @listStr VARCHAR(MAX)
    SELECT @listStr =
      COALESCE(@listStr +',' ,'') + '[' +  [ColumnName] + ']'
  FROM [FCPP_HPSD].[dbo].[DataSharing]
  WHERE FileName = @fileName
  DECLARE @Result XML
  SET @sqlCommand = 'Select * From ( SELECT 
      [DatapointDate]
      ,dp.ColumnName
      ,[DataPointValue]
    FROM [FCPP_HPSD].[dbo].[vw_DataCollection] DC
  JOIN [FCPP_HPSD].[dbo].[Datasharing] dp
  ON DC.DataPointID = DP.DatapointID
  WHERE 
  [DatapointDate] >= ''' + CONVERT(varchar(MAX), @StartDate) + '''
  and [DatapointDate] < ''' + CONVERT(varchar(MAX), @EndDate) + '''
  and  dc.DataPointID in (SELECT [DatapointID] FROM [FCPP_HPSD].[dbo].[DataSharing] Where FileName = ''' + @fileName + ''')
  ) AS source
    PIVOT
    (
        SUM(DataPointValue)
        FOR ColumnName IN ('+ @listStr +')
    ) as pvt
    ORDER BY DatapointDate
    FOR XML Path(''' + 'DataRow' + '''), ROOT;'

    Print @sqlCommand

    EXEC (@sqlCommand)

END



GO

The query that actually gets executed looks like this:

SELECT * 
FROM   (SELECT [datapointdate], 
               dp.columnname, 
               [datapointvalue] 
        FROM   [FCPP_HPSD].[dbo].[vw_datacollection] DC 
               JOIN [FCPP_HPSD].[dbo].[datasharing] dp 
                 ON DC.datapointid = DP.datapointid 
        WHERE  [datapointdate] >= 'Jul 15 2013 12:00AM' 
               AND [datapointdate] < 'Jul 22 2013 12:00AM' 
               AND dc.datapointid IN (SELECT [datapointid] 
                                      FROM   [FCPP_HPSD].[dbo].[datasharing] 
                                      WHERE  filename = 'fdrD3')) AS source 
       PIVOT ( Sum(datapointvalue) 
             FOR columnname IN ([fdrD3_kWh_A], 
                                [fdrD3_kWh_B], 
                                [fdrD3_kWh_C], 
                                [fdrD3_kWh], 
                                [fdrD3_I_A], 
                                [fdrD3_I_B], 
                                [fdrD3_I_C], 
                                [fdrD3_I_N], 
                                [fdrD3_V_A], 
                                [fdrD3_V_B], 
                                [fdrD3_V_C], 
                                [fdrD3_V_A-B], 
                                [fdrD3_V_B-C], 
                                [fdrD3_kV_C-A], 
                                [fdrD3_kW], 
                                [fdrD3_kVA], 
                                [fdrD3_kVAr], 
                                [fdrD3_kW_A], 
                                [fdrD3_kW_B], 
                                [fdrD3_kW_C], 
                                [fdrD3_kVA_A], 
                                [fdrD3_kVA_B], 
                                [fdrD3_kVA_C], 
                                [fdrD3_kVAr_A], 
                                [fdrD3_kVAr_B], 
                                [fdrD3_kVAr_C], 
                                [fdrD3_F], 
                                [fdrD3_Iang_A], 
                                [fdrD3_Iang_B], 
                                [fdrD3_Iang_C], 
                                [fdrD3_Iang_N], 
                                [fdrD3_Vang_A], 
                                [fdrD3_Vang_B], 
                                [fdrD3_Vang_C], 
                                [fdrD3_Vang_A-B], 
                                [fdrD3_Vang_B-C], 
                                [fdrD3_Vang_C-A], 
                                [fdrD3_PF_A], 
                                [fdrD3_PF_B], 
                                [fdrD3_PF_C], 
                                [fdrD3_PF], 
                                [fdrD3_Pst_V_A], 
                                [fdrD3_Pst_V_B], 
                                [fdrD3_Pst_V_C], 
                                [fdrD3_Plt_V_A], 
                                [fdrD3_Plt_V_B], 
                                [fdrD3_Plt_V_C], 
                                [fdrD3_Vdev_A], 
                                [fdrD3_Vdev_B], 
                                [fdrD3_Vdev_C], 
                                [fdrD3_Fdev], 
                                [fdrD3_THD_I_A], 
                                [fdrD3_THD_I_B], 
                                [fdrD3_THD_I_C], 
                                [fdrD3_THD_I_N], 
                                [fdrD3_THD_V_A], 
                                [fdrD3_THD_V_B], 
                                [fdrD3_THD_V_C]) ) AS pvt 
ORDER  BY datapointdate 
FOR xml path('DataRow'), root; 

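The procedure currently takes 35-65 seconds to run. Since I am running into timeouts, I really need to speed it up. If anyone can suggest what I can do to make it faster and stop so much time being spent on the Index Spool (Eager Spool), I would greatly appreciate it.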

Edit 1:

I have added a SQL Fiddle, so hopefully that helps.

3 answers:

Answer 0 (score: 0)

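Here is your pivot unrolled. See if it runs faster (I'd bet it will, mostly because of how the CTE gets optimized). If it does, you can rewrite your generator to create a query that looks like this: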

WITH datelist AS
(
   -- columnname is selected as well because every join below filters on it
   SELECT datapointid, filename, columnname, datapointvalue
   FROM [FCPP_HPSD].[dbo].[datasharing]
   WHERE  datapointdate >= @StartDate AND datapointdate < @EndDate AND filename = @filename
)
SELECT
  SUM( j1.datapointvalue) as sum_fdrD3_kWh_A,
  SUM( j2.datapointvalue) as sum_fdrD3_kWh_B,
  SUM( j3.datapointvalue) as sum_fdrD3_kWh_C,
  SUM( j4.datapointvalue) as sum_fdrD3_kWh,
  SUM( j5.datapointvalue) as sum_fdrD3_I_A,
  SUM( j6.datapointvalue) as sum_fdrD3_I_B,
  SUM( j7.datapointvalue) as sum_fdrD3_I_C,
  SUM( j8.datapointvalue) as sum_fdrD3_I_N,
  SUM( j9.datapointvalue) as sum_fdrD3_V_A,
  SUM(j10.datapointvalue) as sum_fdrD3_V_B,
  SUM(j12.datapointvalue) as sum_fdrD3_V_C,
  SUM(j13.datapointvalue) as sum_fdrD3_V_A_B,
  SUM(j14.datapointvalue) as sum_fdrD3_V_B_C,
  SUM(j15.datapointvalue) as sum_fdrD3_kV_C_A,
  SUM(j16.datapointvalue) as sum_fdrD3_kW,
  SUM(j17.datapointvalue) as sum_fdrD3_kVA,
  SUM(j18.datapointvalue) as sum_fdrD3_kVAr,
  SUM(j19.datapointvalue) as sum_fdrD3_kW_A,
  SUM(j20.datapointvalue) as sum_fdrD3_kW_B,
  SUM(j21.datapointvalue) as sum_fdrD3_kW_C,
  SUM(j22.datapointvalue) as sum_fdrD3_kVA_A,
  SUM(j23.datapointvalue) as sum_fdrD3_kVA_B,
  SUM(j24.datapointvalue) as sum_fdrD3_kVA_C,
  SUM(j25.datapointvalue) as sum_fdrD3_kVAr_A,
  SUM(j26.datapointvalue) as sum_fdrD3_kVAr_B,
  SUM(j27.datapointvalue) as sum_fdrD3_kVAr_C,
  SUM(j28.datapointvalue) as sum_fdrD3_F,
  SUM(j29.datapointvalue) as sum_fdrD3_Iang_A,
  SUM(j30.datapointvalue) as sum_fdrD3_Iang_B,
  SUM(j31.datapointvalue) as sum_fdrD3_Iang_C,
  SUM(j32.datapointvalue) as sum_fdrD3_Iang_N,
  SUM(j33.datapointvalue) as sum_fdrD3_Vang_A,
  SUM(j34.datapointvalue) as sum_fdrD3_Vang_B,
  SUM(j35.datapointvalue) as sum_fdrD3_Vang_C,
  SUM(j36.datapointvalue) as sum_fdrD3_Vang_A_B,
  SUM(j37.datapointvalue) as sum_fdrD3_Vang_B_C,
  SUM(j38.datapointvalue) as sum_fdrD3_Vang_C_A,
  SUM(j39.datapointvalue) as sum_fdrD3_PF_A,
  SUM(j40.datapointvalue) as sum_fdrD3_PF_B,
  SUM(j41.datapointvalue) as sum_fdrD3_PF_C,
  SUM(j42.datapointvalue) as sum_fdrD3_PF,
  SUM(j43.datapointvalue) as sum_fdrD3_Pst_V_A,
  SUM(j44.datapointvalue) as sum_fdrD3_Pst_V_B,
  SUM(j45.datapointvalue) as sum_fdrD3_Pst_V_C,
  SUM(j46.datapointvalue) as sum_fdrD3_Plt_V_A,
  SUM(j47.datapointvalue) as sum_fdrD3_Plt_V_B,
  SUM(j48.datapointvalue) as sum_fdrD3_Plt_V_C,
  SUM(j49.datapointvalue) as sum_fdrD3_Vdev_A,
  SUM(j50.datapointvalue) as sum_fdrD3_Vdev_B,
  SUM(j51.datapointvalue) as sum_fdrD3_Vdev_C,
  SUM(j52.datapointvalue) as sum_fdrD3_Fdev,
  SUM(j53.datapointvalue) as sum_fdrD3_THD_I_A,
  SUM(j54.datapointvalue) as sum_fdrD3_THD_I_B,
  SUM(j55.datapointvalue) as sum_fdrD3_THD_I_C,
  SUM(j56.datapointvalue) as sum_fdrD3_THD_I_N,
  SUM(j57.datapointvalue) as sum_fdrD3_THD_V_A,
  SUM(j58.datapointvalue) as sum_fdrD3_THD_V_B,
  SUM(j59.datapointvalue) as sum_fdrD3_THD_V_C
FROM   [FCPP_HPSD].[dbo].[vw_datacollection] DC 
LEFT JOIN datelist  j1 ON DC.datapointid =  j1.datapointid AND  j1.columnname = 'fdrD3_kWh_A' 
LEFT JOIN datelist  j2 ON DC.datapointid =  j2.datapointid AND  j2.columnname = 'fdrD3_kWh_B' 
LEFT JOIN datelist  j3 ON DC.datapointid =  j3.datapointid AND  j3.columnname = 'fdrD3_kWh_C' 
LEFT JOIN datelist  j4 ON DC.datapointid =  j4.datapointid AND  j4.columnname = 'fdrD3_kWh' 
LEFT JOIN datelist  j5 ON DC.datapointid =  j5.datapointid AND  j5.columnname = 'fdrD3_I_A' 
LEFT JOIN datelist  j6 ON DC.datapointid =  j6.datapointid AND  j6.columnname = 'fdrD3_I_B' 
LEFT JOIN datelist  j7 ON DC.datapointid =  j7.datapointid AND  j7.columnname = 'fdrD3_I_C' 
LEFT JOIN datelist  j8 ON DC.datapointid =  j8.datapointid AND  j8.columnname = 'fdrD3_I_N' 
LEFT JOIN datelist  j9 ON DC.datapointid =  j9.datapointid AND  j9.columnname = 'fdrD3_V_A' 
LEFT JOIN datelist j10 ON DC.datapointid = j10.datapointid AND j10.columnname = 'fdrD3_V_B' 
LEFT JOIN datelist j12 ON DC.datapointid = j12.datapointid AND j12.columnname = 'fdrD3_V_C' 
LEFT JOIN datelist j13 ON DC.datapointid = j13.datapointid AND j13.columnname = 'fdrD3_V_A-B' 
LEFT JOIN datelist j14 ON DC.datapointid = j14.datapointid AND j14.columnname = 'fdrD3_V_B-C' 
LEFT JOIN datelist j15 ON DC.datapointid = j15.datapointid AND j15.columnname = 'fdrD3_kV_C-A' 
LEFT JOIN datelist j16 ON DC.datapointid = j16.datapointid AND j16.columnname = 'fdrD3_kW' 
LEFT JOIN datelist j17 ON DC.datapointid = j17.datapointid AND j17.columnname = 'fdrD3_kVA' 
LEFT JOIN datelist j18 ON DC.datapointid = j18.datapointid AND j18.columnname = 'fdrD3_kVAr' 
LEFT JOIN datelist j19 ON DC.datapointid = j19.datapointid AND j19.columnname = 'fdrD3_kW_A' 
LEFT JOIN datelist j20 ON DC.datapointid = j20.datapointid AND j20.columnname = 'fdrD3_kW_B' 
LEFT JOIN datelist j21 ON DC.datapointid = j21.datapointid AND j21.columnname = 'fdrD3_kW_C' 
LEFT JOIN datelist j22 ON DC.datapointid = j22.datapointid AND j22.columnname = 'fdrD3_kVA_A' 
LEFT JOIN datelist j23 ON DC.datapointid = j23.datapointid AND j23.columnname = 'fdrD3_kVA_B' 
LEFT JOIN datelist j24 ON DC.datapointid = j24.datapointid AND j24.columnname = 'fdrD3_kVA_C' 
LEFT JOIN datelist j25 ON DC.datapointid = j25.datapointid AND j25.columnname = 'fdrD3_kVAr_A' 
LEFT JOIN datelist j26 ON DC.datapointid = j26.datapointid AND j26.columnname = 'fdrD3_kVAr_B' 
LEFT JOIN datelist j27 ON DC.datapointid = j27.datapointid AND j27.columnname = 'fdrD3_kVAr_C' 
LEFT JOIN datelist j28 ON DC.datapointid = j28.datapointid AND j28.columnname = 'fdrD3_F' 
LEFT JOIN datelist j29 ON DC.datapointid = j29.datapointid AND j29.columnname = 'fdrD3_Iang_A' 
LEFT JOIN datelist j30 ON DC.datapointid = j30.datapointid AND j30.columnname = 'fdrD3_Iang_B' 
LEFT JOIN datelist j31 ON DC.datapointid = j31.datapointid AND j31.columnname = 'fdrD3_Iang_C' 
LEFT JOIN datelist j32 ON DC.datapointid = j32.datapointid AND j32.columnname = 'fdrD3_Iang_N' 
LEFT JOIN datelist j33 ON DC.datapointid = j33.datapointid AND j33.columnname = 'fdrD3_Vang_A' 
LEFT JOIN datelist j34 ON DC.datapointid = j34.datapointid AND j34.columnname = 'fdrD3_Vang_B' 
LEFT JOIN datelist j35 ON DC.datapointid = j35.datapointid AND j35.columnname = 'fdrD3_Vang_C' 
LEFT JOIN datelist j36 ON DC.datapointid = j36.datapointid AND j36.columnname = 'fdrD3_Vang_A-B' 
LEFT JOIN datelist j37 ON DC.datapointid = j37.datapointid AND j37.columnname = 'fdrD3_Vang_B-C' 
LEFT JOIN datelist j38 ON DC.datapointid = j38.datapointid AND j38.columnname = 'fdrD3_Vang_C-A' 
LEFT JOIN datelist j39 ON DC.datapointid = j39.datapointid AND j39.columnname = 'fdrD3_PF_A' 
LEFT JOIN datelist j40 ON DC.datapointid = j40.datapointid AND j40.columnname = 'fdrD3_PF_B' 
LEFT JOIN datelist j41 ON DC.datapointid = j41.datapointid AND j41.columnname = 'fdrD3_PF_C' 
LEFT JOIN datelist j42 ON DC.datapointid = j42.datapointid AND j42.columnname = 'fdrD3_PF' 
LEFT JOIN datelist j43 ON DC.datapointid = j43.datapointid AND j43.columnname = 'fdrD3_Pst_V_A' 
LEFT JOIN datelist j44 ON DC.datapointid = j44.datapointid AND j44.columnname = 'fdrD3_Pst_V_B' 
LEFT JOIN datelist j45 ON DC.datapointid = j45.datapointid AND j45.columnname = 'fdrD3_Pst_V_C' 
LEFT JOIN datelist j46 ON DC.datapointid = j46.datapointid AND j46.columnname = 'fdrD3_Plt_V_A' 
LEFT JOIN datelist j47 ON DC.datapointid = j47.datapointid AND j47.columnname = 'fdrD3_Plt_V_B' 
LEFT JOIN datelist j48 ON DC.datapointid = j48.datapointid AND j48.columnname = 'fdrD3_Plt_V_C' 
LEFT JOIN datelist j49 ON DC.datapointid = j49.datapointid AND j49.columnname = 'fdrD3_Vdev_A' 
LEFT JOIN datelist j50 ON DC.datapointid = j50.datapointid AND j50.columnname = 'fdrD3_Vdev_B' 
LEFT JOIN datelist j51 ON DC.datapointid = j51.datapointid AND j51.columnname = 'fdrD3_Vdev_C' 
LEFT JOIN datelist j52 ON DC.datapointid = j52.datapointid AND j52.columnname = 'fdrD3_Fdev' 
LEFT JOIN datelist j53 ON DC.datapointid = j53.datapointid AND j53.columnname = 'fdrD3_THD_I_A' 
LEFT JOIN datelist j54 ON DC.datapointid = j54.datapointid AND j54.columnname = 'fdrD3_THD_I_B' 
LEFT JOIN datelist j55 ON DC.datapointid = j55.datapointid AND j55.columnname = 'fdrD3_THD_I_C' 
LEFT JOIN datelist j56 ON DC.datapointid = j56.datapointid AND j56.columnname = 'fdrD3_THD_I_N' 
LEFT JOIN datelist j57 ON DC.datapointid = j57.datapointid AND j57.columnname = 'fdrD3_THD_V_A' 
LEFT JOIN datelist j58 ON DC.datapointid = j58.datapointid AND j58.columnname = 'fdrD3_THD_V_B' 
LEFT JOIN datelist j59 ON DC.datapointid = j59.datapointid AND j59.columnname = 'fdrD3_THD_V_C'
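
A related way to unroll a pivot, named here plainly as a different technique from the self-joins above, is conditional aggregation: a single join with one SUM(CASE ...) per output column. The sketch below covers only the first three columns and is not taken from the question or this answer. It assumes the table, view, and column names from the question's query and reuses its sample dates and file name. Whether it avoids the Index Spool would have to be checked against the actual execution plan.

```sql
-- Sample values from the question; in the procedure these would be its parameters.
DECLARE @fileName  varchar(MAX) = 'fdrD3',
        @StartDate datetime     = '20130715',
        @EndDate   datetime     = '20130722';

SELECT [datapointdate],
       -- One SUM(CASE ...) per pivoted column; rows for other columns yield NULL
       -- inside the CASE and are ignored by SUM.
       SUM(CASE WHEN dp.columnname = 'fdrD3_kWh_A' THEN [datapointvalue] END) AS [fdrD3_kWh_A],
       SUM(CASE WHEN dp.columnname = 'fdrD3_kWh_B' THEN [datapointvalue] END) AS [fdrD3_kWh_B],
       SUM(CASE WHEN dp.columnname = 'fdrD3_kWh_C' THEN [datapointvalue] END) AS [fdrD3_kWh_C]
FROM   [FCPP_HPSD].[dbo].[vw_datacollection] DC
       JOIN [FCPP_HPSD].[dbo].[datasharing] dp
         ON DC.datapointid = dp.datapointid
WHERE  [datapointdate] >= @StartDate
       AND [datapointdate] < @EndDate
       AND DC.datapointid IN (SELECT [datapointid]
                              FROM   [FCPP_HPSD].[dbo].[datasharing]
                              WHERE  filename = @fileName)
GROUP  BY [datapointdate]
ORDER  BY [datapointdate]
FOR XML PATH('DataRow'), ROOT;
```

The generator would emit one SUM(CASE ...) line per row of DataSharing instead of one LEFT JOIN, which keeps the number of joins constant as the column list grows.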

Answer 1 (score: 0)

I have removed the subquery; I hope this speeds up execution and does not produce wrong results.

SELECT * 
    FROM   (SELECT [datapointdate], 
                   dp.columnname, 
                   [datapointvalue] 
            FROM   [FCPP_HPSD].[dbo].[vw_datacollection] DC 
                   JOIN [FCPP_HPSD].[dbo].[datasharing] dp 
                     ON DC.datapointid = DP.datapointid 
            WHERE  [datapointdate] >= 'Jul 15 2013 12:00AM' 
                   AND [datapointdate] < 'Jul 22 2013 12:00AM' 
                   AND dc.datapointid IN (SELECT [datapointid] 
                                  FROM   [FCPP_HPSD].[dbo].[datasharing] 
                                  WHERE  filename = 'fdrD3')) AS source
           PIVOT ( Sum(datapointvalue) 
                 FOR columnname IN (select distinct dp.columnname 
                             from [FCPP_HPSD].[dbo].[datasharing] dp) ) AS pvt 
    ORDER  BY datapointdate 
    FOR xml path('DataRow'), root; 

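Edit: You may need a WHERE clause if you only want selected data. This works in Oracle. I put the subquery back and added another one inside the PIVOT, just to simplify the code and to make sure any new data is picked up in future as well.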
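
Note that SQL Server's PIVOT accepts only a literal list of column names in its IN clause, not a subquery, so on SQL Server the list still has to be assembled into dynamic SQL, as the question's procedure does. A minimal sketch of that, assuming the table, view, and column names from the question: it builds the list with QUOTENAME and runs the statement through sp_executesql so that the dates and the file name are passed as parameters rather than concatenated into the string. The variable names @cols and @sql are made up for the illustration.

```sql
-- Sample values from the question; in the procedure these would be its parameters.
DECLARE @fileName  varchar(MAX) = 'fdrD3',
        @StartDate datetime     = '20130715',
        @EndDate   datetime     = '20130722';

DECLARE @cols nvarchar(MAX), @sql nvarchar(MAX);

-- Build [col1],[col2],... using the same concatenation approach as the question;
-- QUOTENAME adds the brackets and escapes any ] inside a name.
SELECT @cols = COALESCE(@cols + N',', N'') + QUOTENAME([ColumnName])
FROM   [FCPP_HPSD].[dbo].[DataSharing]
WHERE  [FileName] = @fileName;

-- Only the column list is concatenated; the filter values stay parameters.
SET @sql = N'
SELECT *
FROM  (SELECT [DatapointDate], dp.ColumnName, [DataPointValue]
       FROM   [FCPP_HPSD].[dbo].[vw_DataCollection] DC
              JOIN [FCPP_HPSD].[dbo].[DataSharing] dp
                ON DC.DataPointID = dp.DatapointID
       WHERE  [DatapointDate] >= @StartDate
              AND [DatapointDate] < @EndDate
              AND DC.DataPointID IN (SELECT [DatapointID]
                                     FROM   [FCPP_HPSD].[dbo].[DataSharing]
                                     WHERE  [FileName] = @fileName)) AS source
PIVOT (SUM(DataPointValue) FOR ColumnName IN (' + @cols + N')) AS pvt
ORDER BY DatapointDate
FOR XML PATH(''DataRow''), ROOT;';

EXEC sp_executesql
     @sql,
     N'@fileName varchar(MAX), @StartDate datetime, @EndDate datetime',
     @fileName, @StartDate, @EndDate;
```

Passing the dates as datetime parameters also avoids the round trip through a string such as 'Jul 15 2013 12:00AM' that appears in the generated query.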

Answer 2 (score: 0)

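Based on the SQL Fiddle, I would suggest adding an extra index on DataSharing (FileName, DataPointID). However, from your comments it seems that the actual query takes only 6 seconds (does that include the time needed to send all 547k records to SSMS?), which would mean the rest of the time is eaten up by the PIVOT and the conversion to XML?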
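
The suggested index could be created along these lines. This is a sketch only: the index name is a placeholder, and it assumes FileName is declared with a bounded length, since a varchar(MAX) column cannot be used as an index key column.

```sql
-- Index name is hypothetical; the key columns are the ones suggested above.
-- Run in the database that holds DataSharing (FCPP_HPSD in the question).
CREATE NONCLUSTERED INDEX IX_DataSharing_FileName_DataPointID
    ON [dbo].[DataSharing] ([FileName], [DatapointID]);
```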

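  • Can you time how long it takes to run just the SELECT and pump the result INTO a temp table?
  • Can you then time how long it takes to do the PIVOT from that temp table INTO another temp table?
  • Can you time how long it takes to convert that last temp table to XML while storing the result in a variable?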
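
The three measurements above could be taken roughly as follows. This is a sketch, assuming the SQL Fiddle table names used in the query further down; the temp table names, the variables, and the shortened two-column PIVOT list are placeholders for the illustration, and the real test should use the full column list.

```sql
DECLARE @t0 datetime2 = SYSDATETIME(), @xml xml;

-- Step 1: the plain SELECT, materialized into a temp table.
SELECT [DatapointDate], dp.ColumnName, [DataPointValue]
INTO   #source
FROM   [DataCollection] DC
       JOIN [Datasharing] dp
         ON DC.DataPointID = dp.DatapointID
WHERE  [DatapointDate] >= '20130715'
       AND [DatapointDate] < '20130722'
       AND EXISTS (SELECT *
                   FROM   [DataSharing] ds
                   WHERE  ds.[FileName] = 'fdrD3'
                     AND  DC.DataPointID = ds.[DatapointID]);
PRINT 'select: ' + CAST(DATEDIFF(ms, @t0, SYSDATETIME()) AS varchar(20)) + ' ms';

-- Step 2: the PIVOT alone, from the first temp table into a second one.
SET @t0 = SYSDATETIME();
SELECT *
INTO   #pivoted
FROM   #source AS s
PIVOT  (SUM(DataPointValue)
        FOR ColumnName IN ([fdrD3_kWh_A], [fdrD3_kWh_B])) AS pvt;  -- list shortened
PRINT 'pivot:  ' + CAST(DATEDIFF(ms, @t0, SYSDATETIME()) AS varchar(20)) + ' ms';

-- Step 3: the XML conversion alone, stored in a variable so that no
-- time is spent sending the result to the client.
SET @t0 = SYSDATETIME();
SET @xml = (SELECT *
            FROM   #pivoted
            ORDER  BY DatapointDate
            FOR XML PATH('DataRow'), ROOT, TYPE);
PRINT 'xml:    ' + CAST(DATEDIFF(ms, @t0, SYSDATETIME()) AS varchar(20)) + ' ms';

DROP TABLE #source, #pivoted;
```

Whichever step dominates the total shows where the tuning effort should go.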

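Code-wise, I am not a fan of subqueries either, but using a WHERE EXISTS() construct seems safer to me than a straight JOIN, IMHO. Then again, the optimizer often realizes this too and already does it for us, so the query plan for the query below will probably look identical to that of the original.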

Select * 
  From ( SELECT [DatapointDate]
                ,dp.ColumnName
                ,[DataPointValue]
           FROM [DataCollection] DC
           JOIN [Datasharing] dp
             ON DC.DataPointID = DP.DatapointID
          WHERE [DatapointDate] >= 'Jul 15 2013 12:00AM'
            AND [DatapointDate] < 'Jul 22 2013 12:00AM'
            AND EXISTS ( SELECT * 
                           FROM [DataSharing] ds 
                          WHERE ds.[FileName] = 'fdrD3' 
                            AND dc.DataPointID = ds.[DatapointID])
       ) AS source
PIVOT
    (
        SUM(DataPointValue)
        FOR ColumnName IN ([fdrD3_kWh_A],[fdrD3_kWh_B],[fdrD3_kWh_C],[fdrD3_kWh],[fdrD3_I_A],[fdrD3_I_B],[fdrD3_I_C],[fdrD3_I_N],[fdrD3_V_A],[fdrD3_V_B],[fdrD3_V_C],[fdrD3_V_A-B],[fdrD3_V_B-C],[fdrD3_kV_C-A],[fdrD3_kW],[fdrD3_kVA],[fdrD3_kVAr],[fdrD3_kW_A],[fdrD3_kW_B],[fdrD3_kW_C],[fdrD3_kVA_A],[fdrD3_kVA_B],[fdrD3_kVA_C],[fdrD3_kVAr_A],[fdrD3_kVAr_B],[fdrD3_kVAr_C],[fdrD3_F],[fdrD3_Iang_A],[fdrD3_Iang_B],[fdrD3_Iang_C],[fdrD3_Iang_N],[fdrD3_Vang_A],[fdrD3_Vang_B],[fdrD3_Vang_C],[fdrD3_Vang_A-B],[fdrD3_Vang_B-C],[fdrD3_Vang_C-A],[fdrD3_PF_A],[fdrD3_PF_B],[fdrD3_PF_C],[fdrD3_PF],[fdrD3_Pst_V_A],[fdrD3_Pst_V_B],[fdrD3_Pst_V_C],[fdrD3_Plt_V_A],[fdrD3_Plt_V_B],[fdrD3_Plt_V_C],[fdrD3_Vdev_A],[fdrD3_Vdev_B],[fdrD3_Vdev_C],[fdrD3_Fdev],[fdrD3_THD_I_A],[fdrD3_THD_I_B],[fdrD3_THD_I_C],[fdrD3_THD_I_N],[fdrD3_THD_V_A],[fdrD3_THD_V_B],[fdrD3_THD_V_C])
    ) as pvt

ORDER BY DatapointDate
FOR XML Path('DataRow'), ROOT

One more question: do you really need the ORDER BY DatapointDate in there?