Improving performance when searching on concatenated columns

Date: 2015-02-03 12:02:36

Tags: sql-server

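I have a SQL Server 2008 instance with a large table on which I need to run a COUNT DISTINCT query across multiple columns. Some of the columns are varchar, others are int.

The query so far looks like this: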

SELECT 
    CAST(DATEPART(yyyy, [HistDate]) AS varchar(4)) + '-' + CAST(DATEPART(mm, [HistDate]) AS varchar(2)) + '-1' AS [DateSelector], 
    [Document] AS [Document], 
    -- This is the bit that needs optimizing
    COUNT(DISTINCT
        Document + 
        Reference + 
        CONVERT(varchar(20), BatchID) +               -- this is an int
        ISNULL(CONVERT(varchar(20), ResetCount), '')  -- this is an int
    )
FROM documents
GROUP BY
    -- column aliases are not allowed in GROUP BY, so the expression is repeated without AS
    CAST(DATEPART(yyyy, [HistDate]) AS varchar(4)) + '-' + CAST(DATEPART(mm, [HistDate]) AS varchar(2)) + '-1', 
    [Document]
ORDER BY ...

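At the moment this query takes 23 seconds, whereas replacing the COUNT above with COUNT(*) brings it down to a few seconds. I tried adding a composite index, which produced no improvement at all. What kind of optimization could I apply to speed this up?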

2 answers:

Answer 0: (score: 0)

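You can cut the time down by grouping on the date parts directly:

group by datepart(yyyy, Zeitstempel), datepart(mm, Zeitstempel)

(Zeitstempel is the date column here; for the table in the question that would presumably be HistDate.)

That way you group on the plain integers, without any conversion, and can still use the converted value in the SELECT list.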
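Applied to the query from the question, that suggestion might look roughly like the sketch below. The table and column names come from the question; the alias DistinctCount is made up for illustration. The COUNT(DISTINCT ...) itself is left untouched, so any gain would come only from the cheaper grouping keys.

```sql
-- Sketch only: groups on the integer date parts and builds the
-- display string in the SELECT list. DistinctCount is a hypothetical alias.
SELECT
    CAST(DATEPART(yyyy, [HistDate]) AS varchar(4)) + '-'
        + CAST(DATEPART(mm, [HistDate]) AS varchar(2)) + '-1' AS [DateSelector],
    [Document],
    COUNT(DISTINCT
        Document +
        Reference +
        CONVERT(varchar(20), BatchID) +
        ISNULL(CONVERT(varchar(20), ResetCount), '')
    ) AS [DistinctCount]
FROM documents
GROUP BY
    DATEPART(yyyy, [HistDate]),   -- plain integer, no string conversion
    DATEPART(mm, [HistDate]),     -- plain integer, no string conversion
    [Document];
```

The SELECT list may contain expressions built from the GROUP BY expressions, which is why the string conversion can stay there while the grouping runs on integers.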

Answer 1: (score: 0)

Concatenating the columns does not help performance.

Try this instead:

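;WITH CTE AS
(
  SELECT 
    [HistDate],
    [Document], 
    -- Partition on the separate columns rather than adding them together:
    -- Reference + BatchID would force an implicit varchar-to-int conversion.
    -- The month is part of the partition so each combination is counted once per month.
    ROW_NUMBER() OVER (
      PARTITION BY DATEDIFF(month, 0, [HistDate]), Document, Reference, BatchID, ResetCount
      ORDER BY (SELECT 1)) AS rn
  FROM documents
)
SELECT
  CONVERT(char(8), DATEADD(month, DATEDIFF(month, 0, [HistDate]), 0), 126) + '1' AS [DateSelector], 
  [Document],
  COUNT(*) AS cnt
FROM CTE
WHERE rn = 1
GROUP BY
  -- note: you can't use a column alias in GROUP BY
  DATEADD(month, DATEDIFF(month, 0, [HistDate]), 0),
  [Document]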
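The same idea, counting distinct combinations of the separate columns instead of a concatenated string, can also be written without ROW_NUMBER by de-duplicating in a derived table first. This is a sketch under assumptions: the table and column names come from the question, while the names MonthStart, d and cnt are made up here. Whether it beats the CTE depends on the data and indexes, so compare the execution plans.

```sql
-- Sketch only: de-duplicate per month and Document first, then count rows.
SELECT
    d.MonthStart,
    d.[Document],
    COUNT(*) AS cnt
FROM (
    SELECT DISTINCT
        DATEADD(month, DATEDIFF(month, 0, [HistDate]), 0) AS MonthStart,  -- first day of the month
        [Document],
        Reference,
        BatchID,
        ResetCount
    FROM documents
) AS d
GROUP BY
    d.MonthStart,
    d.[Document];
```

Two differences from the original query are worth checking against the data. Rows where Document or Reference is NULL are counted here, whereas COUNT(DISTINCT ...) skips them because the concatenation normally yields NULL. And value combinations that concatenate to the same string (for example 'ab' + 'c' and 'a' + 'bc') are counted separately here.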