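SQL - Avoiding an extra GROUP BY (and improving query performance)

Date: 2013-02-20 09:38:50

Tags: sql sql-server group-by sql-tuning

I'm stuck on this problem, and it would be great to hear some new ideas :)

I have a table with billions of records: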

TAB_IX (int) (PK)
TAB_ID (int) (PK)
PR_ID (int) (PK)
SP_ID (int) (PK)(IX)
....

Previously, I was retrieving the data like this:

SELECT TAB_ID, COUNT (SP_ID) as HITS FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE TAB_IX = @tab_inx 
AND PR_ID IN (SELECT PR_ID FROM @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)  
GROUP BY TAB_ID

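table_sp is a small table with 10k records (ID (int) (PK), NAME (varchar) (IX)).

@pr_id and @sp_names are table variables with a single column.

The query is very fast (about 2-3 seconds). Now I want to stop distinguishing between records that have a different PR_ID but the same TAB_IX, TAB_ID, and SP_ID.

For example, records like these: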
TAB_IX - TAB_ID - PR_ID - SP_ID
1      - 700    - 1     - 100
1      - 700    - 2     - 100

should be counted as one.

The only way I have found is to add another GROUP BY, like this:

SELECT TAB_ID, COUNT(SP_ID) as HITS FROM (
SELECT TAB_ID, SP_ID, COUNT (PR_ID) AS PR_COUNT FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE TAB_IX = @tab_inx 
AND PR_ID in (select PR_ID from @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)
GROUP BY TAB_ID, SP_ID) AS DUMMY
GROUP BY TAB_ID

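The problem is performance: adding this extra GROUP BY seems to be very costly.

Do you have any ideas for improving the query?

Thanks in advance :)

1 Answer:

Answer 0: (score: 1)

I think specifying in the original query that you want to count DISTINCT SP_ID will do the trick: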

SELECT TAB_ID, COUNT (DISTINCT SP_ID) as HITS FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE TAB_IX = @tab_inx 
AND PR_ID IN (SELECT PR_ID FROM @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)  
GROUP BY TAB_ID
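
To check that this gives the same counts as the nested GROUP BY, here is a minimal sketch on made-up data. The table variable @t and its rows are hypothetical and stand in for the already-filtered rows of the big table; the join to table_sp and the IN filters are left out for brevity.

```sql
-- Hypothetical sample data: @t stands in for the filtered rows of the big table.
DECLARE @t TABLE (TAB_IX int, TAB_ID int, PR_ID int, SP_ID int);

INSERT INTO @t (TAB_IX, TAB_ID, PR_ID, SP_ID) VALUES
    (1, 700, 1, 100),
    (1, 700, 2, 100),  -- differs from the row above only in PR_ID
    (1, 700, 1, 101),
    (1, 701, 1, 100);

-- Nested GROUP BY version: collapse duplicates first, then count.
SELECT TAB_ID, COUNT(SP_ID) AS HITS
FROM (
    SELECT TAB_ID, SP_ID
    FROM @t
    WHERE TAB_IX = 1
    GROUP BY TAB_ID, SP_ID
) AS DUMMY
GROUP BY TAB_ID;

-- COUNT(DISTINCT ...) version: a single GROUP BY.
SELECT TAB_ID, COUNT(DISTINCT SP_ID) AS HITS
FROM @t
WHERE TAB_IX = 1
GROUP BY TAB_ID;
```

Both statements should return HITS = 2 for TAB_ID 700 and HITS = 1 for TAB_ID 701. Whether the DISTINCT form is actually faster on the real table depends on the execution plan, so it is worth comparing the plans of the two queries before settling on one.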