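I have a SQL Server 2008 instance with a large table, and I need to run a COUNT DISTINCT query across multiple columns. Some of the columns are varchar, the others are int.
The query so far looks like this: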
SELECT
CAST(datepart(yyyy, [HistDate]) as varchar(4)) + '-' + CAST(datepart(mm, [HistDate]) as varchar(2)) + '-1' AS [DateSelector],
[Document] AS [Document],
-- This is the bit that needs optimizing
COUNT( DISTINCT(
Document +
Reference +
CONVERT(varchar(20),BatchID) + -- this is an int
ISNULL(CONVERT(varchar(20),ResetCount),''))) -- this is an int
FROM documents
GROUP BY
CAST(datepart(yyyy, [HistDate]) as varchar(4)) + '-' + CAST(datepart(mm, [HistDate]) as varchar(2)) + '-1' AS [DateSelector],
[Document] AS [Document],
ORDER BY ...
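This query currently takes 23 seconds, whereas replacing the COUNT above with COUNT(*) takes only a couple of seconds. I tried adding a composite index, which gave no improvement at all. What kind of optimization can I do to speed up the query?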
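Answer 0: (score: 0)
You can shorten the time by using
group by datepart(yyyy, Zeitstempel), datepart(mm, Zeitstempel)
That way you group only on the integers, without any conversion, and you can still use the converted value in the SELECT.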
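Applied to the query from the question, that suggestion might look like the sketch below. It assumes that Zeitstempel in the answer stands for the asker's [HistDate] column; the [DistinctCount] alias is made up for illustration, and the statement has not been run against the asker's data.

```sql
SELECT
    -- The display string is built from the grouped integers, once per group
    CAST(DATEPART(yyyy, [HistDate]) AS varchar(4)) + '-'
        + CAST(DATEPART(mm, [HistDate]) AS varchar(2)) + '-1' AS [DateSelector],
    [Document],
    -- COUNT expression carried over from the question; [DistinctCount] is a hypothetical alias
    COUNT(DISTINCT
        Document + Reference + CONVERT(varchar(20), BatchID)
        + ISNULL(CONVERT(varchar(20), ResetCount), '')
    ) AS [DistinctCount]
FROM documents
GROUP BY
    -- Group on the raw integer date parts instead of the concatenated string
    DATEPART(yyyy, [HistDate]),
    DATEPART(mm, [HistDate]),
    [Document];
```

This only changes the grouping key. The COUNT(DISTINCT ...) over the concatenated string is untouched, so any gain is limited to the cost of building and comparing the date string.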
Answer 1: (score: 0)
Concatenating the columns does nothing for performance.
Try this instead:
;WITH CTE AS
(
SELECT
[HistDate],
[Document] AS [Document],
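-- partition on the separate columns (Reference + BatchID would force a varchar-to-int conversion)
-- and on the month, so that rn = 1 marks one row per distinct combination within each month
row_number() over (partition by dateadd(month, datediff(month, 0, [HistDate]), 0), Document, Reference, BatchID, ResetCount order by (select 1)) rn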
FROM documents
)
SELECT
convert(char(8),dateadd(month,
datediff(month, 0, [HistDate]), 0), 126)+'1' AS [DateSelector],
[Document] AS [Document],
count(*) as cnt
FROM CTE
WHERE rn = 1
GROUP BY
-- note: you can't give a column an alias in GROUP BY
dateadd(month, datediff(month, 0, [HistDate]), 0),
[Document]
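
The same idea (deduplicate on the separate columns, then count rows) can also be sketched with SELECT DISTINCT in a derived table. This is an alternative formulation, not something the answer proposed, and it is untested on the asker's data. The [MonthStart] alias and the d table alias are made up for illustration. One behavioral difference to be aware of: the original COUNT(DISTINCT ...) skips rows where the concatenation yields NULL (for example a NULL Reference or BatchID), while this version counts them.

```sql
SELECT
    CONVERT(char(8), d.[MonthStart], 126) + '1' AS [DateSelector],
    d.[Document],
    COUNT(*) AS cnt
FROM
(
    -- One row per distinct combination within each month; no string concatenation
    SELECT DISTINCT
        DATEADD(month, DATEDIFF(month, 0, [HistDate]), 0) AS [MonthStart],
        [Document],
        [Reference],
        [BatchID],
        [ResetCount]
    FROM documents
) AS d
GROUP BY
    d.[MonthStart],
    d.[Document];
```

Because the comparison happens on the native column types, an index that covers HistDate, Document, Reference, BatchID and ResetCount has a better chance of being useful here than it does for the concatenated expression, though whether the optimizer uses it depends on the actual data and plan.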