Is SELECT DISTINCT always redundant when using a GROUP BY clause?

Asked: 2015-12-14 17:55:59

Tags: sql sql-server select group-by distinct

Is there a case where adding DISTINCT would change the results of a SELECT query that uses a GROUP BY clause?

GROUP BY and DISTINCT produce similar execution plans.

From my understanding, the select list of a query that uses a GROUP BY clause can only contain columns from the GROUP BY or aggregate functions.

The aggregate functions appear to be deterministic, and each combination of GROUP BY values is unique in the result, so my assumption is that DISTINCT would be redundant.

EDIT 1: I mean adding the DISTINCT keyword directly after SELECT, not elsewhere in the query as in @lad2025's example: SELECT name, COUNT(DISTINCT col) ... GROUP BY name.

3 answers:

Answer 0 (score: 3)

You are under no obligation to SELECT all of the GROUP BY columns, and when you leave some out, DISTINCT can change the results.

SELECT COUNT(*)
FROM sys.objects
GROUP BY schema_id, name

--- or

SELECT DISTINCT COUNT(*)
FROM sys.objects
GROUP BY schema_id, name
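
To see the difference without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table `objects` and its four rows are made up for illustration (it is a stand-in for sys.objects, not a copy of it), and SQLite may differ from SQL Server in other respects; the point is only that the two queries above return different row sets when several groups share the same count.

```python
import sqlite3


def group_counts(distinct):
    """Run SELECT [DISTINCT] COUNT(*) ... GROUP BY schema_id, name on toy data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for sys.objects: only the two grouping columns.
    conn.execute("CREATE TABLE objects (schema_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?)",
        [(1, "a"), (1, "a"), (1, "b"), (2, "a")],
    )
    keyword = "DISTINCT " if distinct else ""
    rows = conn.execute(
        f"SELECT {keyword}COUNT(*) FROM objects GROUP BY schema_id, name"
    ).fetchall()
    conn.close()
    # Row order is unspecified without ORDER BY, so sort for a stable result.
    return sorted(row[0] for row in rows)


if __name__ == "__main__":
    print(group_counts(distinct=False))  # one row per group: [1, 1, 2]
    print(group_counts(distinct=True))   # duplicate counts collapsed: [1, 2]
```

The three groups are (1, a), (1, b) and (2, a). Without DISTINCT there is one count per group; with DISTINCT the two groups that both count 1 collapse into a single row.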

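Answer 1 (score: 1)

The groups defined by the expressions and columns that appear in the GROUP BY clause will be unique in the result set. DISTINCT is redundant as long as the select list includes all of those same columns. As Martin Smith points out, including them is not required.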
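
A quick way to check the redundant case is to select every GROUP BY column and compare the output with and without DISTINCT. This is only a sketch using Python's built-in sqlite3 module rather than SQL Server; the table name `tab` and its rows are borrowed from the corner-case sample data posted elsewhere in this thread.

```python
import sqlite3


def grouped_rows(distinct):
    """Select both GROUP BY columns, optionally with DISTINCT after SELECT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (col1 INTEGER, col2 INTEGER)")
    conn.executemany(
        "INSERT INTO tab VALUES (?, ?)",
        [(1, 1), (1, 1), (2, 1), (2, 2)],
    )
    keyword = "DISTINCT " if distinct else ""
    rows = conn.execute(
        f"SELECT {keyword}col1, col2 FROM tab GROUP BY col1, col2"
    ).fetchall()
    conn.close()
    # Sort because row order is unspecified without ORDER BY.
    return sorted(rows)


if __name__ == "__main__":
    # Every grouping column is selected, so each output row is already unique.
    print(grouped_rows(distinct=False) == grouped_rows(distinct=True))
```

Because each output row carries the full grouping key, no two rows can be equal, so DISTINCT has nothing left to remove.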

Answer 2 (score: 0)

Yes, it can change the result when you use DISTINCT inside an aggregate function:

SELECT name, COUNT(col) AS result
FROM table
GROUP BY name

vs

SELECT name, COUNT(DISTINCT col) AS result
FROM table
GROUP BY name

In other cases like:

SELECT DISTINCT name
FROM table
GROUP BY name

DISTINCT is almost always redundant.

EDIT:

Corner case (when the GROUP BY and SELECT column lists do not match):

CREATE TABLE #tab(col1 INT, col2 INT);

INSERT INTO #tab
VALUES (1,1), (1,1), (2,1), (2,2)

SELECT DISTINCT col2
FROM #tab
GROUP BY col1, col2

SELECT col2
FROM #tab
GROUP BY col1, col2;

Output (with DISTINCT on the left, without DISTINCT on the right):

╔══════╗                 ╔══════╗
║ col2 ║                 ║ col2 ║
╠══════╣      vs         ╠══════╣    
║    1 ║                 ║    1 ║           
║    2 ║                 ║    1 ║           
╚══════╝                 ║    2 ║
                         ╚══════╝