SQL Server: combining multiple data sets without duplicated data

Time: 2012-10-02 14:00:26

Tags: sql sql-server outer-join

Given three tables Ta, Tb, Tc:

Ta(ID, Field1)
Tb(ID, Field2)
Tc(ID, Field3)

With the following sample data:

Ta
ID Field1
---------
1  A
1  B

Tb
ID Field2
---------
1  C
1  D
2  E

Tc
ID Field3
---------
1  F
2  G
2  H

Question: how can I join this data so that it returns:

ID Field1 Field2 Field3
-----------------------
1  A      C      F
1  B      D      NULL
2  NULL   E      G
2  NULL   NULL   H

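I thought I could achieve this with outer joins, but apparently not. The order in which values are paired up does not matter, as long as I get all the information back without duplicated rows.

Just to clarify: I really don't mind which combination I get, as long as the result set returns all the data in the minimum number of rows. Here is a more realistic example of what I am trying to do:

Take a person, call him John. He has two phone numbers and three email addresses: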

PID  Email
---------
John john@test.com
John john@mail.com
John john@john.com

PID  Tel
--------
John 011
John 022

I want to get back:

PID  Email         Tel
----------------------
John john@test.com 011
John john@mail.com 022
John john@john.com NULL

2 answers:

Answer 0: (score: 3)

You can get close with something like this:

-- Number the rows within each id, then line the three tables up on (id, seqnum).
select coalesce(ta.id, tb.id, tc.id) as id, ta.field1, tb.field2, tc.field3
from (select ta.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from ta
     ) ta full outer join
     (select tb.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from tb
     ) tb
     on ta.id = tb.id and
        ta.seqnum = tb.seqnum full outer join
     (select tc.*, row_number() over (partition by id order by (select NULL)) as seqnum
      from tc
     ) tc
     on coalesce(ta.id, tb.id) = tc.id and
        coalesce(ta.seqnum, tb.seqnum) = tc.seqnum
-- Each (id, seqnum) pair already yields exactly one row, so no GROUP BY is needed.
order by coalesce(ta.id, tb.id, tc.id),
         coalesce(ta.seqnum, tb.seqnum, tc.seqnum)

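As I said in my comment, the order of rows in a table is not guaranteed, so the values may not be paired up in the order you expect. With the sample data, you can use: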

over (partition by id order by field<n>)

if the field itself defines the ordering you want.

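Answer 1: (score: 3)

Here is an alternative that uses CTEs and UNION ALL, with MIN to collapse away the NULLs. It does not guarantee which values end up on the same row, but you said you don't care as long as everything for each ID is present.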


WITH TaRanked AS
(
  SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field1) as Rnk, ID, Field1
  FROM Ta
),
TbRanked AS
(
  SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field2) as Rnk, ID, Field2
  FROM Tb
),
TcRanked AS
(
  SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Field3) as Rnk, ID, Field3
  FROM Tc
),
TUnion AS
(
    SELECT Rnk, ID, Field1, NULL AS Field2, NULL AS Field3 
        FROM TaRanked 
    UNION ALL
    SELECT Rnk, ID, NULL, Field2, NULL 
        FROM TbRanked 
    UNION ALL
    SELECT Rnk, ID, NULL, NULL, Field3 
        FROM TcRanked 
)
SELECT ID, MIN(Field1) AS Field1, MIN(Field2) AS Field2, MIN(Field3) AS Field3
  FROM TUnion
  GROUP BY ID, Rnk
  ORDER BY ID, Rnk

The result is:

1   A       C       F
1   B       D       (null)
2   (null)  E       G
2   (null)  (null)  H
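
The same row-numbering idea can be applied to the two-table John example from the question. The following is only a minimal sketch: the temp table names #PersonEmail and #PersonTel and the column sizes are assumptions made for illustration, since the question shows the columns but not the table names. It numbers the rows per PID in each table and then pairs them up with a single FULL OUTER JOIN on (PID, Rnk).

```sql
-- Hypothetical temp tables; the question does not name the real ones.
CREATE TABLE #PersonEmail (PID varchar(20), Email varchar(50));
CREATE TABLE #PersonTel   (PID varchar(20), Tel   varchar(20));

INSERT INTO #PersonEmail (PID, Email)
VALUES ('John', 'john@test.com'),
       ('John', 'john@mail.com'),
       ('John', 'john@john.com');

INSERT INTO #PersonTel (PID, Tel)
VALUES ('John', '011'),
       ('John', '022');

WITH EmailRanked AS
(
  -- Number each person's emails 1..n (ordered by Email so the numbering is repeatable)
  SELECT ROW_NUMBER() OVER (PARTITION BY PID ORDER BY Email) AS Rnk, PID, Email
  FROM #PersonEmail
),
TelRanked AS
(
  -- Number each person's phone numbers 1..m
  SELECT ROW_NUMBER() OVER (PARTITION BY PID ORDER BY Tel) AS Rnk, PID, Tel
  FROM #PersonTel
)
-- Pair the k-th email with the k-th phone number; the longer list keeps its
-- surplus rows, with NULL in the other column.
SELECT COALESCE(e.PID, t.PID) AS PID, e.Email, t.Tel
FROM EmailRanked e
FULL OUTER JOIN TelRanked t
  ON e.PID = t.PID
 AND e.Rnk = t.Rnk
ORDER BY COALESCE(e.PID, t.PID), COALESCE(e.Rnk, t.Rnk);

DROP TABLE #PersonEmail;
DROP TABLE #PersonTel;
```

With the sample data this should return three rows for John, one of them with NULL in Tel. Which email lands next to which phone number depends on the ORDER BY inside ROW_NUMBER, which matches the question's statement that any pairing is acceptable.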