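Merging two revision-tracking tables while filling in values

Time: 2012-10-19 22:22:42

Tags: sql postgresql

I have two history tables that track changes to database values, using a revision ID to identify each change, e.g.:

Table 1: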

 rev |  A   |  B 
=================
 1   |  100 | 'A'
 4   |  150 | 'A'
 7   |  100 | 'Z'

Table 2:

 rev |  C   |  D 
==================
 1   |  200 | True
 5   |    0 | True
 8   |    0 | False

The goal is to merge the two tables into:

 rev |  A   |  B  |  C  |  D 
===============================
 1   |  100 | 'A' | 200 | True
 4   |  150 | 'A' | 200 | True
 5   |  150 | 'A' |   0 | True
 7   |  100 | 'Z' |   0 | True
 8   |  100 | 'Z' |   0 | False

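The idea is that, for a given revision, I take the values recorded at that revision or, if there are none, at the highest revision below it.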
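As a minimal sketch of that lookup rule (not part of the original question; `value_as_of` and the list-of-pairs layout are hypothetical names chosen for illustration), a binary search over one table's sorted revisions finds the row at a revision or the closest one below it:

```python
from bisect import bisect_right


def value_as_of(history, rev):
    """Return the values recorded at `rev`, or at the highest revision below it.

    `history` is a list of (rev, values) pairs sorted by rev.
    Returns None when no revision at or below `rev` exists.
    """
    revs = [r for r, _ in history]
    i = bisect_right(revs, rev)  # number of entries with revision <= rev
    return history[i - 1][1] if i else None


# Table 1 from the question: rev -> (A, B)
table1 = [(1, (100, "A")), (4, (150, "A")), (7, (100, "Z"))]

print(value_as_of(table1, 5))  # (150, 'A'): revision 4 is the highest one <= 5
```

Applying this lookup to both tables for every revision that occurs in either of them gives the merged table shown above.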

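The SQL query that comes to mind is something like cross joining the two tables with the constraint rev1 < rev2, then using a subquery to select the rows where rev1 = max(rev1) for each given rev2; combining this query with its counterpart, in which rev2 and rev1 are exchanged; and finally filtering out the duplicates where rev1 = rev2.

The questions are:

  • Is there a name for this type of join?
  • Is there an idiom for performing this type of join in SQL, or would it be better to do it programmatically (which would definitely be simpler, and possibly more efficient)?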

3 answers:

Answer 0 (score: 2)

SQL Fiddle

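-- Full-join the two histories on rev, then fill each missing column
-- from the previous row.
-- Caveat: lag(x, 1) looks back exactly one row, so a gap is filled only
-- when the immediately preceding row has a value for that column.
select
    coalesce(t1.rev, t2.rev) rev,
    coalesce(a, lag(a, 1) over(order by coalesce(t2.rev, t1.rev))) a,
    coalesce(b, lag(b, 1) over(order by coalesce(t2.rev, t1.rev))) b,
    coalesce(c, lag(c, 1) over(order by coalesce(t1.rev, t2.rev))) c,
    coalesce(d, lag(d, 1) over(order by coalesce(t1.rev, t2.rev))) d
from
    t1
    full join
    t2 on t1.rev = t2.rev
order by rev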
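The query above uses lag(x, 1), which looks exactly one row back. That is enough for the sample data, where neither table misses two consecutive revisions. As a hedged illustration of the general fill step (this is not part of the answer; `forward_fill` and the sample rows below are hypothetical), the following sketch carries the last known value forward across any number of consecutive gaps:

```python
def forward_fill(rows):
    """Replace each None with the most recent non-None value in the same column."""
    last = [None] * len(rows[0]) if rows else []
    filled = []
    for row in rows:
        # Keep the new value where present, otherwise reuse the remembered one.
        last = [new if new is not None else old for new, old in zip(row, last)]
        filled.append(tuple(last))
    return filled


# Columns: rev, a, c. Revisions 5 and 6 both come from the second table only,
# so column a is missing on two consecutive rows.
joined = [(4, 150, None), (5, None, 0), (6, None, 10), (7, 100, None)]

print(forward_fill(joined))
```

In this made-up input, a one-row look-back would leave column a empty at revision 6, whereas the running fill keeps 150 until revision 7 replaces it.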

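Answer 1 (score: 1)

This can be achieved with subqueries. Note that ISNULL and TOP 1 are SQL Server syntax; the PostgreSQL equivalents are COALESCE and LIMIT 1.
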
SELECT ISNULL(Table1.rev,Table2.rev) AS rev
,ISNULL(A,(SELECT TOP 1 A FROM Table1 AS T1 WHERE ISNULL(Table1.rev,Table2.rev) > T1.rev AND A IS NOT NULL ORDER BY rev DESC)) AS A
,ISNULL(B,(SELECT TOP 1 B FROM Table1 AS T1 WHERE ISNULL(Table1.rev,Table2.rev) > T1.rev AND B IS NOT NULL ORDER BY rev DESC)) AS B
,ISNULL(C,(SELECT TOP 1 C FROM Table2 AS T2 WHERE ISNULL(Table1.rev,Table2.rev) > T2.rev AND C IS NOT NULL ORDER BY rev DESC)) AS C
,ISNULL(D,(SELECT TOP 1 D FROM Table2 AS T2 WHERE ISNULL(Table1.rev,Table2.rev) > T2.rev AND D IS NOT NULL ORDER BY rev DESC)) AS D
FROM Table1
FULL OUTER JOIN Table2
ON Table1.rev = Table2.rev
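For readers who want to try the correlated-subquery idea without SQL Server, here is a rough sketch that runs a portable variant through Python's built-in sqlite3 module. It is an assumption-laden illustration rather than the answerer's query: it takes the union of all revisions instead of a full outer join, compares with <= so the row at the revision itself is picked up by the same subquery, and stores D as 0/1 because SQLite has no separate boolean type. The SQL uses only COALESCE-free standard constructs (UNION, ORDER BY, LIMIT) and should also be accepted by PostgreSQL, though that has not been verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (rev INTEGER PRIMARY KEY, A INTEGER, B TEXT);
    CREATE TABLE Table2 (rev INTEGER PRIMARY KEY, C INTEGER, D INTEGER);
    INSERT INTO Table1 VALUES (1, 100, 'A'), (4, 150, 'A'), (7, 100, 'Z');
    INSERT INTO Table2 VALUES (1, 200, 1), (5, 0, 1), (8, 0, 0);
""")

# For every revision present in either table, take each column from the
# latest row of its table whose rev is at or below that revision.
query = """
    SELECT r.rev,
           (SELECT A FROM Table1 t WHERE t.rev <= r.rev ORDER BY t.rev DESC LIMIT 1) AS A,
           (SELECT B FROM Table1 t WHERE t.rev <= r.rev ORDER BY t.rev DESC LIMIT 1) AS B,
           (SELECT C FROM Table2 t WHERE t.rev <= r.rev ORDER BY t.rev DESC LIMIT 1) AS C,
           (SELECT D FROM Table2 t WHERE t.rev <= r.rev ORDER BY t.rev DESC LIMIT 1) AS D
    FROM (SELECT rev FROM Table1 UNION SELECT rev FROM Table2) AS r
    ORDER BY r.rev
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this yields one row per revision 1, 4, 5, 7 and 8, matching the target table in the question.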

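Answer 2 (score: 0)

There is no specific join type that handles this kind of query. You have to do it either with a complex query or programmatically. Below is sample PL/pgSQL code that solves the problem using the example data.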

CREATE OR REPLACE FUNCTION getRev(OUT rev INT, OUT A INT, OUT B CHAR, OUT C INT, OUT D BOOL) RETURNS SETOF record STABLE AS
$BODY$
DECLARE
    c1 SCROLL CURSOR FOR SELECT * FROM Table1 ORDER BY rev;
    c2 SCROLL CURSOR FOR SELECT * FROM Table2 ORDER BY rev;
    r1    Table1%ROWTYPE;
    r1c   Table1%ROWTYPE;
    r2    Table2%ROWTYPE;
    r2c   Table2%ROWTYPE;
BEGIN
  OPEN c1;
  OPEN c2;
  FETCH c1 INTO r1;
  FETCH c2 INTO r2;
  r1c := r1;
  r2c := r2;
  WHILE r1 IS NOT NULL AND r2 IS NOT NULL
  LOOP
    CASE 
    WHEN r1.rev = r2.rev THEN 
      rev := r1.rev;
      A := r1.a;
      B := r1.b;
      C := r2.c;
      D := r2.d;
      FETCH c1 INTO r1c;
      FETCH c2 INTO r2c;
      CASE 
        WHEN r1c.rev = r2c.rev THEN
          r1 := r1c;
          r2 := r2c;
        WHEN r1c.rev < r2c.rev THEN
          r1 := r1c;
          FETCH PRIOR FROM c2 INTO r2c;
        ELSE
          r2 := r2c;
          FETCH PRIOR FROM c1 INTO r1c;
      END CASE;
    WHEN r1.rev < r2.rev THEN
      WHILE r1c IS NOT NULL AND r1c.rev < r2.rev LOOP
         r1 := r1c;
         FETCH c1 INTO r1c;
      END LOOP;
      rev := r2.rev;
      A := r1.a;
      B := r1.b;
      C := r2.c;
      D := r2.d;
      r1 := r1c;
    ELSE 
      WHILE r2c IS NOT NULL AND r2c.rev < r1.rev LOOP
         r2 := r2c;
         FETCH c2 INTO r2c;
      END LOOP;
      rev := r1.rev;
      A := r1.a;
      B := r1.b;
      C := r2.c;
      D := r2.d;
      r2 := r2c;
    END CASE;
    RETURN NEXT;
  END LOOP;
  CLOSE c1;
  CLOSE c2;
  RETURN;
END
$BODY$
LANGUAGE 'plpgsql';

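This should run in O(length(Table1) + length(Table2)).

Note the tricky part in "CASE WHEN r1.rev = r2.rev": we have to choose which table to keep scanning on the next iteration. The right one is the table whose next row after the cursor has the smaller rev value, so that every revision present in either table is visited. You could of course get better performance by writing this in C or C++.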
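The same linear merge can be sketched outside the database. The following Python version is an illustration under stated assumptions, not a translation of the function above: `merge_histories` is a hypothetical name, each input is assumed to be a list of (rev, values) pairs already sorted by rev, and the loop keeps going until both inputs are exhausted.

```python
def merge_histories(t1, t2):
    """Merge two revision histories, each a list of (rev, values) sorted by rev.

    Yields (rev, values1, values2), carrying the last seen values of each
    table forward; a table's values are None until its first revision.
    """
    i = j = 0
    cur1 = cur2 = None
    while i < len(t1) or j < len(t2):
        # The next revision to emit is the smallest one not yet consumed.
        pending = []
        if i < len(t1):
            pending.append(t1[i][0])
        if j < len(t2):
            pending.append(t2[j][0])
        rev = min(pending)
        # Advance every table that has a row at exactly this revision.
        if i < len(t1) and t1[i][0] == rev:
            cur1 = t1[i][1]
            i += 1
        if j < len(t2) and t2[j][0] == rev:
            cur2 = t2[j][1]
            j += 1
        yield rev, cur1, cur2


table1 = [(1, (100, "A")), (4, (150, "A")), (7, (100, "Z"))]
table2 = [(1, (200, True)), (5, (0, True)), (8, (0, False))]

for row in merge_histories(table1, table2):
    print(row)
```

Each row of either input is consumed exactly once, so the running time is linear in the combined length of the two tables, in line with the estimate above.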