Using DISTINCT with UNION / UNION ALL

Asked: 2013-11-01 10:13:04

Tags: sql-server-2008

I came across some code that looks slow. It is structured much like this:

SELECT
  res.[X],
  res.[Y],
  SUM(res.[Z]) -- This is SUM so I have to remove duplicates
FROM (
  SELECT DISTINCT a.[X], a.[Y], b.[Z] FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
  UNION
  SELECT DISTINCT a.[X], a.[Y], c.[Z] FROM [A] a JOIN [C] c ON a.[ID] = c.[ID]
  UNION
  SELECT DISTINCT a.[X], a.[Y], d.[Z] FROM [A] a JOIN [D] d ON a.[ID] = d.[ID]
  UNION ALL -- This set won't have duplicates, hence the UNION ALL in this case
  SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
) res
GROUP BY res.[X], res.[Y]

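The real joins are more complex, and there are 12 of these UNION / UNION ALL branches, but you get the picture. Each result set typically contains between 1 and 15 million rows.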

I'm wondering how others would write this query. I've read several threads warning against queries like this:

SELECT DISTINCT * FROM [A]
UNION
SELECT DISTINCT * FROM [B]

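because the duplicate removal is effectively performed three times (in this small example): once for each DISTINCT and once more for the UNION. So I gave that a shot and removed the DISTINCTs. The result was actually slower. I don't understand how removing the extra filtering could make the query run slower.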
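For reference, this is roughly what the version without the DISTINCTs looks like on the simplified query above (a sketch only; the real query has more branches and more complex joins). It should return the same rows as the original inner query, because UNION removes duplicates from the combined result by itself:

```sql
-- Inner query with the per-branch DISTINCTs removed.
SELECT a.[X], a.[Y], b.[Z] FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
UNION      -- UNION already removes duplicate rows from the combined result
SELECT a.[X], a.[Y], c.[Z] FROM [A] a JOIN [C] c ON a.[ID] = c.[ID]
UNION
SELECT a.[X], a.[Y], d.[Z] FROM [A] a JOIN [D] d ON a.[ID] = d.[ID]
UNION ALL  -- appended without deduplication, exactly as before
SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
```

So the only thing that changes between the two versions is where the deduplication work happens, not the result.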

Does anyone have any ideas? I'm digging into the query plan, but it is too large to post, so I'm just looking for suggestions. Thanks!

1 Answer:

Answer 0: (score: 0)

You could try using a row number and selecting the minimum for each X, Y and Z:

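select
    MIN(MyFilter),
    X,
    Y,
    Z
from (
    SELECT
        a.[X],
        a.[Y],
        b.[Z],
        -- number every joined row; GROUP BY below collapses duplicate (X, Y, Z) rows
        ROW_NUMBER() over (order by a.[X], a.[Y], b.[Z]) as MyFilter
    FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
) t -- SQL Server requires an alias on a derived table
group by X, Y, Z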
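
Another variant that may be worth comparing is to deduplicate once over the combined branches instead of once per branch. The following is only a sketch against the simplified tables [A], [B], [C], [D] and [N] from the question; it assumes every deduplicated branch joins to [A] on [ID] in the same way, which may not hold for the real, more complex joins. The alias [SumZ] is made up for illustration.

```sql
SELECT
  res.[X],
  res.[Y],
  SUM(res.[Z]) AS [SumZ]  -- [SumZ] is an illustrative alias, not from the question
FROM (
  -- Join [A] once against all branches that need deduplication,
  -- then remove duplicate (X, Y, Z) rows in a single pass.
  SELECT DISTINCT a.[X], a.[Y], u.[Z]
  FROM [A] a
  JOIN (
    SELECT b.[ID], b.[Z] FROM [B] b
    UNION ALL
    SELECT c.[ID], c.[Z] FROM [C] c
    UNION ALL
    SELECT d.[ID], d.[Z] FROM [D] d
  ) u ON a.[ID] = u.[ID]
  UNION ALL  -- [N] is stated to have no duplicates, so it is appended as-is
  SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
) res
GROUP BY res.[X], res.[Y];
```

On the simplified schema this returns the same groups and sums as the original query, since DISTINCT applies only to the first SELECT and the [N] rows are still appended unfiltered. Whether it is actually faster depends on the indexes and the plan the optimizer picks, so compare the execution plans before adopting it.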