Is there a case where adding DISTINCT would change the results of a SELECT query that uses a GROUP BY clause?
GROUP BY and DISTINCT produce similar execution plans.
From my understanding, queries that use a GROUP BY clause can only select columns that appear in the GROUP BY or are wrapped in aggregate functions.
The aggregate functions appear to be deterministic, and the combinations of GROUP BY values would be unique, so my assumption is that DISTINCT would be redundant.
EDIT 1: I mean adding the DISTINCT keyword directly after SELECT, not elsewhere in the query as in @lad2025's example: SELECT name, COUNT(DISTINCT col) ... GROUP BY name.
Answer 0 (score: 3)
You are under no obligation to SELECT all the GROUP BY columns, and in that case DISTINCT would change the results.
SELECT COUNT(*)
FROM sys.objects
GROUP BY schema_id, name
--- or
SELECT DISTINCT COUNT(*)
FROM sys.objects
GROUP BY schema_id, name
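To see the difference concretely, here is a minimal sketch that runs both queries against SQLite through Python's sqlite3 module. The `objects` table and its three rows are a hypothetical stand-in for `sys.objects` (which exists only in SQL Server), chosen so that every (schema_id, name) pair is unique.

```python
import sqlite3

# Hypothetical stand-in for sys.objects: each (schema_id, name) pair is unique.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (schema_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "a")],
)

# One row per group; every group holds exactly one row, so each count is 1.
plain = conn.execute(
    "SELECT COUNT(*) FROM objects GROUP BY schema_id, name"
).fetchall()

# DISTINCT is applied after grouping, so the identical counts collapse.
distinct = conn.execute(
    "SELECT DISTINCT COUNT(*) FROM objects GROUP BY schema_id, name"
).fetchall()

print(len(plain), plain)
print(len(distinct), distinct)
```

With this sample data the first query returns three rows and the second returns one, because the grouping columns that made the rows distinguishable are not in the select list.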
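Answer 1 (score: 1)
The groups defined by the expressions and columns that appear in the GROUP BY clause will be unique in the result set. As long as the SELECT list includes all of those same columns, DISTINCT is redundant. As Martin Smith points out, though, you are not required to include them.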
Answer 2 (score: 0)
Yes, it can change the result when you use DISTINCT inside an aggregate function:
SELECT name, COUNT(col) AS result
FROM table
GROUP BY name
vs
SELECT name, COUNT(DISTINCT col) AS result
FROM table
GROUP BY name
In other cases like:
SELECT DISTINCT name
FROM table
GROUP BY name
DISTINCT is almost always redundant.
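Both claims can be checked with a small sketch using SQLite via Python's sqlite3 module. The table name `items` and its rows are hypothetical (the original uses the placeholder name `table`, which is a reserved word), and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: "a" has a repeated col value, "b" does not.
conn.execute("CREATE TABLE items (name TEXT, col INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a", 1), ("a", 1), ("a", 2), ("b", 3)],
)

# COUNT(col) counts every non-NULL value in the group.
count_all = conn.execute(
    "SELECT name, COUNT(col) FROM items GROUP BY name ORDER BY name"
).fetchall()

# COUNT(DISTINCT col) counts each distinct value in the group once.
count_distinct = conn.execute(
    "SELECT name, COUNT(DISTINCT col) FROM items GROUP BY name ORDER BY name"
).fetchall()

# When the select list matches the GROUP BY list, DISTINCT changes nothing.
names_plain = conn.execute(
    "SELECT name FROM items GROUP BY name ORDER BY name"
).fetchall()
names_distinct = conn.execute(
    "SELECT DISTINCT name FROM items GROUP BY name ORDER BY name"
).fetchall()

print(count_all, count_distinct)
print(names_plain == names_distinct)
```

Here the two COUNT variants disagree for "a" (3 versus 2), while the two name queries return the same rows.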
EDIT:
Corner case (when the GROUP BY and SELECT column lists do not match):
CREATE TABLE #tab(col1 INT, col2 INT);
INSERT INTO #tab
VALUES (1,1), (1,1), (2,1), (2,2)
SELECT DISTINCT col2
FROM #tab
GROUP BY col1, col2
SELECT col2
FROM #tab
GROUP BY col1, col2;
Output:
╔══════╗      ╔══════╗
║ col2 ║      ║ col2 ║
╠══════╣  vs  ╠══════╣
║ 1    ║      ║ 1    ║
║ 2    ║      ║ 1    ║
╚══════╝      ║ 2    ║
              ╚══════╝
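The corner case can be reproduced outside SQL Server as well. This is a sketch using SQLite through Python's sqlite3 module, so the temp table `#tab` becomes an ordinary table named `tab` (an assumption for portability), and ORDER BY col2 is added only to fix the row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same data as the #tab example; "tab" replaces the SQL Server temp table.
conn.execute("CREATE TABLE tab (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 1), (2, 2)],
)

# Groups are (1,1), (2,1), (2,2); projecting col2 yields 1, 1, 2.
without_distinct = conn.execute(
    "SELECT col2 FROM tab GROUP BY col1, col2 ORDER BY col2"
).fetchall()

# DISTINCT removes the duplicate 1 that comes from two different col1 groups.
with_distinct = conn.execute(
    "SELECT DISTINCT col2 FROM tab GROUP BY col1, col2 ORDER BY col2"
).fetchall()

print(with_distinct)
print(without_distinct)
```

The duplicate appears because col1 distinguishes the groups but is dropped from the select list, which is exactly the situation where DISTINCT is no longer redundant.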