在同一列上完全连接多个表

时间:2014-02-04 22:46:42

标签: sql sql-server sql-server-2008 tsql join

我有四张桌子:

╔═══╦════╗ ╔═══╦════╗ ╔═══╦════╗ ╔═══╦════╗
║ K ║ V1 ║ ║ K ║ V2 ║ ║ K ║ V3 ║ ║ K ║ V4 ║
╠═══╬════╣ ╠═══╬════╣ ╠═══╬════╣ ╠═══╬════╣
║ A ║  1 ║ ║ B ║  2 ║ ║ B ║  3 ║ ║ C ║  6 ║
║ C ║  4 ║ ║ D ║  7 ║ ║ C ║  5 ║ ║ D ║  8 ║
╚═══╩════╝ ╚═══╩════╝ ╚═══╩════╝ ╚═══╩════╝

我需要通过列K加入它们并选择V1,V2,V3和V4:

╔═══╦══════╦══════╦══════╦══════╗
║ K ║  V1  ║  V2  ║  V3  ║  V4  ║
╠═══╬══════╬══════╬══════╬══════╣
║ A ║ 1    ║ NULL ║ NULL ║ NULL ║
║ B ║ NULL ║ 2    ║ 3    ║ NULL ║
║ C ║ 4    ║ NULL ║ 5    ║ 6    ║
║ D ║ NULL ║ 7    ║ NULL ║ 8    ║
╚═══╩══════╩══════╩══════╩══════╝

解决这个问题的一种方法是:

DECLARE @T1 TABLE (K VARCHAR(1), V1 INT);
DECLARE @T2 TABLE (K VARCHAR(1), V2 INT);
DECLARE @T3 TABLE (K VARCHAR(1), V3 INT);
DECLARE @T4 TABLE (K VARCHAR(1), V4 INT);

INSERT INTO @T1 VALUES ('A', 1), ('C', 4);
INSERT INTO @T2 VALUES ('B', 2), ('D', 7);
INSERT INTO @T3 VALUES ('B', 3), ('C', 5);
INSERT INTO @T4 VALUES ('C', 6), ('D', 8);

SELECT COALESCE(t1.K, t2.K, t3.K, t4.K) [K], V1, V2, V3, V4
FROM @T1 t1
FULL JOIN @T2 t2 ON t2.K = t1.K
FULL JOIN @T3 t3 ON t3.K = t1.K OR t3.K = t2.K
FULL JOIN @T4 t4 ON t4.K = t1.K OR t4.K = t2.K OR t4.K = t3.K
ORDER BY 1;

但这只适用于这个简单的例子,如果我有更多Ks(连接条件)和Vs,连接条件最终会变得更加复杂。

如何改进此查询?

1 个答案:

答案 0 :(得分:25)

有几种方法,但我认为以下是最受欢迎的方式:

select allk.k, t1.c1, t2.v2, t3.v3, t4.v4
from (select k from t1 union 
      select k from t2 union
      select k from t3 union
      select k from t4
     ) allk left outer join
     t1
     on allk.k = t1.k left outer join
     t2
     on allk.k = t2.k left outer join
     t3
     on allk.k = t3.k left outer join
     t4
     on allk.k = t4.k;

如果您有一个包含所有k值的单独表格,则可以使用该表格代替allk子查询。

撰写full outer join查询的更简单方法是使用coalesce()

SELECT COALESCE(t1.K, t2.K, t3.K, t4.K) [K], V1, V2, V3, V4
FROM @T1 t1 FULL JOIN
     @T2 t2
     ON t2.K = t1.K FULL JOIN
     @T3 t3
     ON t3.K = coalesce(t1.K, t2.K) FULL JOIN
     @T4 t4
     ON t4.K = coalesce(t1.K, t2.K, t3.k);

第三种方法是union all / group by方法:

select k, max(v1) as v1, max(v2) as v2, max(v3) as v3, max(v4) as v4
from (select k, v1, NULL as v2, NULL as v3, NULL as v4
      from t1 union all
      select k, NULL as v1, v2, NULL as v3, NULL as v4
      from t2 union all
      select k, NULL as v1, NULL as v2, v3, NULL as v4
      from t3 union all
      select k, NULL as v1, NULL as v2, NULL as v3, v4
      from t4
     ) t
group by k;