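How can I improve this SQL query?

Asked: 2011-08-25 23:19:52

Tags: sql tsql aggregate sql-server-2008-r2

I ran into an interesting SQL problem today, and although I came up with a solution that works, I doubt it is the best or most efficient answer. I defer to the experts here: help me learn something and improve my query! The RDBMS is SQL Server 2008 R2, and the query is part of an SSRS report that will run against roughly 100,000 rows.

Essentially I have a list of IDs, each of which can have multiple values associated with it; the values are Yes, No, or some other string. For ID x, if any of its values is Yes, x should be Yes; if all of them are No, it should be No; and if they contain any value other than Yes and No, that value should be displayed. I want exactly one row returned per ID, with no duplicates.

Simplified version and test case: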

DECLARE @tempTable table ( ID int, Val varchar(1) )

INSERT INTO @tempTable ( ID, Val ) VALUES ( 10, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 13, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 15, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 16, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 17, 'F')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 18, 'P')


SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val)
FROM @tempTable t
LEFT JOIN
(
    SELECT ID, Val
    FROM @tempTable
    WHERE Val = 'Y'
) t2 ON t.ID = t2.ID
LEFT JOIN
(
    SELECT 
    ID, Val FROM @tempTable
    WHERE Val = 'N'
) t3 ON t.ID = t3.ID
LEFT JOIN
(
    SELECT ID, Val
    FROM @tempTable
    WHERE Val <> 'Y' AND Val <> 'N'
) t4 ON t.ID = t4.ID

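Thanks in advance.

4 answers:

Answer 0 (score: 4)

Let's answer a simpler question first: for each ID, get the Val that comes last alphabetically. This works if Y and N are the only values, and the query is much simpler: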

SELECT t.ID, MAX(t.Val) FROM t GROUP BY t.ID;

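So reduce your case to the simple one. Use an enum (if your database supports it), or factor the value codes out into a separate table with a sort-order column (in this case you could have 1 for Y, 2 for N, and 999 for all other possible values, and you want the minimum). Then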

SELECT q.ID, c.Val FROM
     (SELECT t.ID, MIN(codes.collation) AS mx
      FROM t JOIN codes ON t.Val = codes.Val GROUP BY t.ID) AS q
JOIN codes c ON q.mx = c.collation;

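Here codes has two columns, Val and collation. Note that the final join back on collation returns a single row per ID only if each collation value is unique in codes.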
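The lookup table is described above but never defined. A minimal sketch of what it might look like, assuming a table variable named @codes with a column named SortOrder (both names are illustrative and not from the answer), run against a cut-down copy of the question's test data. Because the "other" values are open-ended, this sketch uses a LEFT JOIN with a default rank of 999 instead of listing each one, and breaks ties among "other" values with MIN.

```sql
-- Cut-down copy of the question's test data.
DECLARE @tempTable table ( ID int, Val varchar(1) );
INSERT INTO @tempTable ( ID, Val )
VALUES ( 11, 'N' ), ( 11, 'N' ), ( 14, 'Y' ), ( 14, 'N' ), ( 17, 'F' ), ( 18, 'P' );

-- Hypothetical lookup table: the lowest SortOrder wins.
DECLARE @codes table ( Val varchar(1) PRIMARY KEY, SortOrder int NOT NULL );
INSERT INTO @codes ( Val, SortOrder ) VALUES ( 'Y', 1 ), ( 'N', 2 );

WITH ranked AS (
    -- Values missing from @codes fall back to rank 999.
    SELECT t.ID, t.Val, COALESCE(c.SortOrder, 999) AS SortOrder
    FROM @tempTable t
    LEFT JOIN @codes c ON c.Val = t.Val
),
best AS (
    -- Best (lowest) rank present for each ID.
    SELECT ID, MIN(SortOrder) AS BestOrder
    FROM ranked
    GROUP BY ID
)
-- Keep only the rows at the best rank, collapsed to one row per ID.
SELECT r.ID, MIN(r.Val) AS Val
FROM ranked r
JOIN best b ON b.ID = r.ID AND b.BestOrder = r.SortOrder
GROUP BY r.ID
ORDER BY r.ID;
```

On this data the query should return N for 11, Y for 14, F for 17, and P for 18.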

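You can also do this with a CTE-style query, as long as the values are ordered the way you need. This approach has a single join to a small lookup table and should be much faster than three self-joins.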

WITH q AS (
    SELECT t.ID, t.Val,
           ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY codes.collation) AS r
    FROM t JOIN codes ON t.Val = codes.Val
)
SELECT q.ID, q.Val FROM q WHERE q.r = 1;
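If a separate lookup table is more than you need, the same ranking can be expressed inline. The sketch below is a variation on the ROW_NUMBER idea and not this answer's own query: it swaps the codes join for a CASE expression in the ORDER BY, and it uses a cut-down copy of the question's test data.

```sql
-- Cut-down copy of the question's test data.
DECLARE @tempTable table ( ID int, Val varchar(1) );
INSERT INTO @tempTable ( ID, Val )
VALUES ( 12, 'Y' ), ( 12, 'Y' ), ( 13, 'N' ), ( 14, 'N' ), ( 14, 'Y' ), ( 18, 'P' );

WITH q AS (
    SELECT ID, Val,
           ROW_NUMBER() OVER (
               PARTITION BY ID
               -- 'Y' sorts first, then 'N', then anything else alphabetically.
               ORDER BY CASE Val WHEN 'Y' THEN 1 WHEN 'N' THEN 2 ELSE 999 END, Val
           ) AS r
    FROM @tempTable
)
SELECT ID, Val
FROM q
WHERE r = 1
ORDER BY ID;
```

On this data the query should return Y for 12, N for 13, Y for 14, and P for 18.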

Answer 1 (score: 3)

For readability, I would change it to:

SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val)
FROM @tempTable t
LEFT JOIN @tempTable t2 ON t.ID = t2.ID and t2.Val = 'Y'
LEFT JOIN @tempTable t3 ON t.ID = t3.ID and  t3.Val = 'N'
LEFT JOIN @tempTable t4 ON t.ID = t4.ID and t4.Val <> 'Y' AND t4.Val <> 'N'

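This gives the same results as your example.

I also looked at the execution plans for both, and they look exactly the same, so I doubt you would see any performance difference.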
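One way to check a claim like this on your own data is to run both versions with I/O and timing statistics switched on, then compare the figures in the Messages tab alongside the actual execution plans. A minimal sketch using a cut-down copy of the test data; on a handful of rows the numbers will be trivially small, so it would be worth repeating against something closer to the real 100,000 rows.

```sql
-- Cut-down copy of the question's test data.
DECLARE @tempTable table ( ID int, Val varchar(1) );
INSERT INTO @tempTable ( ID, Val )
VALUES ( 10, 'Y' ), ( 11, 'N' ), ( 11, 'N' ), ( 14, 'Y' ), ( 14, 'N' ), ( 17, 'F' );

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Version 1: derived tables, as in the question.
SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val) AS Val
FROM @tempTable t
LEFT JOIN ( SELECT ID, Val FROM @tempTable WHERE Val = 'Y' ) t2 ON t.ID = t2.ID
LEFT JOIN ( SELECT ID, Val FROM @tempTable WHERE Val = 'N' ) t3 ON t.ID = t3.ID
LEFT JOIN ( SELECT ID, Val FROM @tempTable WHERE Val <> 'Y' AND Val <> 'N' ) t4 ON t.ID = t4.ID;

-- Version 2: filters moved into the ON clauses, as in this answer.
SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val) AS Val
FROM @tempTable t
LEFT JOIN @tempTable t2 ON t.ID = t2.ID AND t2.Val = 'Y'
LEFT JOIN @tempTable t3 ON t.ID = t3.ID AND t3.Val = 'N'
LEFT JOIN @tempTable t4 ON t.ID = t4.ID AND t4.Val <> 'Y' AND t4.Val <> 'N';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```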

Answer 2 (score: 3)

Try this:

;WITH a AS ( 
SELECT
  ID,
  SUM(CASE Val WHEN 'Y' THEN 1 ELSE 0 END) AS y,
  SUM(CASE Val WHEN 'N' THEN 0 ELSE 1 END) AS n,
  MIN(CASE WHEN Val IN ('Y','N') THEN NULL ELSE Val END) AS first_other
FROM @tempTable
GROUP BY ID
) 
SELECT
  ID,
  CASE WHEN y > 0 THEN 'Y' WHEN n = 0 THEN 'N' ELSE first_other END AS Val
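FROM a

  • If there are any 'Y' values, the sum y will be greater than 0
  • If all values are 'N', the sum n will be zero
  • If needed, first_other gives the smallest value that is neither 'Y' nor 'N'
  • This way the result is determined in a single pass over the table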
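To see how the three aggregates drive the final CASE, it can help to select them directly. A small sketch on a cut-down data set that also includes an ID mixing 'N' with another value (ID 19 is an assumption added for illustration and is not in the original test case):

```sql
-- Cut-down test data; ID 19 is a hypothetical row mixing 'N' and 'F'.
DECLARE @tempTable table ( ID int, Val varchar(1) );
INSERT INTO @tempTable ( ID, Val )
VALUES ( 11, 'N' ), ( 11, 'N' ), ( 14, 'Y' ), ( 14, 'N' ), ( 18, 'P' ), ( 19, 'N' ), ( 19, 'F' );

SELECT
  ID,
  SUM(CASE Val WHEN 'Y' THEN 1 ELSE 0 END) AS y,            -- number of 'Y' rows
  SUM(CASE Val WHEN 'N' THEN 0 ELSE 1 END) AS n,            -- number of rows that are not 'N'
  MIN(CASE WHEN Val IN ('Y','N') THEN NULL ELSE Val END) AS first_other
FROM @tempTable
GROUP BY ID
ORDER BY ID;
```

For ID 19 this should give y = 0, n = 1, and first_other = 'F', so the final CASE yields 'F', whereas the COALESCE query in the question would return 'N' for the same rows. Which of the two is right depends on how the "any other value" rule is meant to interact with 'N'.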

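Answer 3 (score: 2)

Here is how I read your spec:

  1. If any value for an ID is Y, then Y
  2. If all values for an ID are N, then N
  3. Otherwise, show the value (other than Y or N)
  4. Eliminate the rows ruled out by (1):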

    delete from @tempTable
    where not Val='Y' and ID in (
        select distinct ID
        from @tempTable
        where Val='Y'
    )
    

    Select distinct to eliminate the multiple N's per (2):

    select distinct * from @tempTable
    

    Group the various "other" values to get a single row per ID:

    SELECT A.Id, AllVals = 
        SubString(
            (SELECT ', ' + B.Val 
             FROM C as B 
             WHERE A.Id = B.Id 
             FOR XML PATH ( '' ) ), 3, 1000) 
    FROM C as A 
    GROUP BY Id
    

    The whole runnable query:

    declare @tempTable table (ID int, Val char(1))
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 10, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 13, 'N') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'N') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 15, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 16, 'Y') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 17, 'F') 
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 18, 'P')
    INSERT INTO @tempTable ( ID, Val ) VALUES ( 18, 'F')
    delete from @tempTable
    where not Val='Y' and ID in (
        select distinct ID
        from @tempTable
        where Val='Y'
    );
    WITH C as (select distinct * from @tempTable)
    SELECT A.Id, AllVals = 
        SubString(
            (SELECT ', ' + B.Val 
             FROM C as B 
             WHERE A.Id = B.Id 
             FOR XML PATH ( '' ) ), 3, 1000) 
    FROM C as A 
    GROUP BY Id
    

    Output:

    Id  AllVals
    10  Y
    11  N
    12  Y
    13  N
    14  Y
    15  Y
    16  Y
    17  F
    18  F, P