I have two history tables that track changes to database values, using a revision ID to track each individual change. For example:
Table 1:
rev | A | B
=================
1 | 100 | 'A'
4 | 150 | 'A'
7 | 100 | 'Z'
Table 2:
rev | C | D
==================
1 | 200 | True
5 | 0 | True
8 | 0 | False
The goal is to merge the two tables into:
rev | A | B | C | D
===============================
1 | 100 | 'A' | 200 | True
4 | 150 | 'A' | 200 | True
5 | 150 | 'A' | 0 | True
7 | 100 | 'Z' | 0 | True
8 | 100 | 'Z' | 0 | False
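The idea is that, for a given revision, I take the values that belong to that revision or, if there are none, to the highest revision below it.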
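To make the rule concrete, here is a minimal Python sketch of the lookup for a single table. The function name `row_as_of` and the tuple layout are assumptions made for illustration; only the sample values come from Table 1 above.

```python
from bisect import bisect_right

# Sample history from Table 1, sorted by rev: (rev, A, B)
TABLE1 = [(1, 100, "A"), (4, 150, "A"), (7, 100, "Z")]


def row_as_of(history, rev):
    """Return the row with the highest revision <= rev, or None if there is none."""
    revs = [row[0] for row in history]
    i = bisect_right(revs, rev)  # number of rows whose revision is <= rev
    return history[i - 1] if i else None


print(row_as_of(TABLE1, 5))  # (4, 150, 'A')
```

Revision 5 does not exist in Table 1, so the lookup falls back to revision 4, which is what the merged row for revision 5 shows.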
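The SQL query that comes to mind would be something like cross joining the two tables with the constraint rev1 < rev2, then using a subquery to select the rows where rev1 = max(rev1) for each given rev2; combining this query with its counterpart that swaps rev2 and rev1; and finally filtering out the duplicates where rev1 = rev2.
The question is: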
Answer 0 (score: 2)
select
    coalesce(t1.rev, t2.rev) rev,
    coalesce(a, lag(a, 1) over(order by coalesce(t2.rev, t1.rev))) a,
    coalesce(b, lag(b, 1) over(order by coalesce(t2.rev, t1.rev))) b,
    coalesce(c, lag(c, 1) over(order by coalesce(t1.rev, t2.rev))) c,
    coalesce(d, lag(d, 1) over(order by coalesce(t1.rev, t2.rev))) d
from
    t1
    full join
    t2 on t1.rev = t2.rev
order by rev
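One caveat: `lag(x, 1)` reads only the immediately preceding row of the join. The sample data alternates between the two tables, so every gap is one row long and the query produces the expected result. If two consecutive revisions came from the same table, the second one would stay NULL in the other table's columns. The Python sketch below shows the carry-forward behaviour the question asks for; the function name `forward_fill` and the input rows are hypothetical and are not part of the answer.

```python
def forward_fill(rows):
    """Replace each None with the most recent non-None value in the same column."""
    last = [None] * len(rows[0]) if rows else []
    filled = []
    for row in rows:
        # Keep the new value when present, otherwise carry the previous one forward.
        last = [new if new is not None else old for new, old in zip(row, last)]
        filled.append(tuple(last))
    return filled


# Hypothetical full-join result ordered by rev: (rev, a, b, c, d).
# Revisions 2 and 3 both come from t2 only, so a and b are missing twice in a row.
joined = [
    (1, 100, "A", 200, True),
    (2, None, None, 150, True),
    (3, None, None, 0, True),
]

for row in forward_fill(joined):
    print(row)
```

The check is `is not None` rather than truthiness, so legitimate values such as `0` and `False` are kept instead of being overwritten.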
Answer 1 (score: 1)
This can be done with subqueries:
SELECT ISNULL(Table1.rev,Table2.rev) AS rev
,ISNULL(A,(SELECT TOP 1 A FROM Table1 AS T1 WHERE ISNULL(Table1.rev,Table2.rev) > T1.rev AND A IS NOT NULL ORDER BY rev DESC)) AS A
,ISNULL(B,(SELECT TOP 1 B FROM Table1 AS T1 WHERE ISNULL(Table1.rev,Table2.rev) > T1.rev AND B IS NOT NULL ORDER BY rev DESC)) AS B
,ISNULL(C,(SELECT TOP 1 C FROM Table2 AS T2 WHERE ISNULL(Table1.rev,Table2.rev) > T2.rev AND C IS NOT NULL ORDER BY rev DESC)) AS C
,ISNULL(D,(SELECT TOP 1 D FROM Table2 AS T2 WHERE ISNULL(Table1.rev,Table2.rev) > T2.rev AND D IS NOT NULL ORDER BY rev DESC)) AS D
FROM Table1
FULL OUTER JOIN Table2
ON Table1.rev = Table2.rev
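Answer 2 (score: 0)
There is no specific join type that handles this kind of query. You have to do it either with a complex query or programmatically. Below is a PL/pgSQL code example that solves the problem using the sample data.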
CREATE OR REPLACE FUNCTION getRev(OUT rev INT, OUT A INT, OUT B CHAR, OUT C INT, OUT D BOOL) RETURNS SETOF record STABLE AS
$BODY$
DECLARE
    c1 SCROLL CURSOR FOR SELECT * FROM Table1 ORDER BY rev;
    c2 SCROLL CURSOR FOR SELECT * FROM Table2 ORDER BY rev;
    r1 Table1%ROWTYPE;
    r1c Table1%ROWTYPE;
    r2 Table2%ROWTYPE;
    r2c Table2%ROWTYPE;
BEGIN
    OPEN c1;
    OPEN c2;
    FETCH c1 INTO r1;
    FETCH c2 INTO r2;
    r1c := r1;
    r2c := r2;
    WHILE r1 IS NOT NULL AND r2 IS NOT NULL
    LOOP
        CASE
            WHEN r1.rev = r2.rev THEN
                rev := r1.rev;
                A := r1.a;
                B := r1.b;
                C := r2.c;
                D := r2.d;
                FETCH c1 INTO r1c;
                FETCH c2 INTO r2c;
                CASE
                    WHEN r1c.rev = r2c.rev THEN
                        r1 := r1c;
                        r2 := r2c;
                    WHEN r1c.rev < r2c.rev THEN
                        r1 := r1c;
                        FETCH PRIOR FROM c2 INTO r2c;
                    ELSE
                        r2 := r2c;
                        FETCH PRIOR FROM c1 INTO r1c;
                END CASE;
            WHEN r1.rev < r2.rev THEN
                WHILE r1c IS NOT NULL AND r1c.rev < r2.rev LOOP
                    r1 := r1c;
                    FETCH c1 INTO r1c;
                END LOOP;
                rev := r2.rev;
                A := r1.a;
                B := r1.b;
                C := r2.c;
                D := r2.d;
                r1 := r1c;
            ELSE
                WHILE r2c IS NOT NULL AND r2c.rev < r1.rev LOOP
                    r2 := r2c;
                    FETCH c2 INTO r2c;
                END LOOP;
                rev := r1.rev;
                A := r1.a;
                B := r1.b;
                C := r2.c;
                D := r2.d;
                r2 := r2c;
        END CASE;
        RETURN NEXT;
    END LOOP;
    CLOSE c1;
    CLOSE c2;
    RETURN;
END
$BODY$
LANGUAGE 'plpgsql';
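This should run in O(length(Table1) + length(Table2)).
Note the tricky part in "CASE WHEN r1.rev = r2.rev": we have to choose which table to keep scanning on the next iteration. The right one is the table whose next row after the cursor has the smallest rev value, so that every revision present in either table is visited. You could of course get better performance by writing the code in C or C++.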
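For comparison, the same single-pass idea can be sketched in Python with two indexes instead of cursors. This is a simplified illustration under stated assumptions, not a translation of the function above: the name `merge_histories`, the tuple layout, and the use of `None` for values that have no earlier revision are all choices made here.

```python
def merge_histories(t1, t2):
    """Merge two revision-sorted histories in one pass.

    Each input row is (rev, value1, value2). Each output row is
    (rev, A, B, C, D), carrying the latest values of each table forward.
    """
    merged = []
    i = j = 0
    cur1 = cur2 = (None, None)  # latest values seen so far in each table
    while i < len(t1) or j < len(t2):
        # An exhausted table never supplies the next revision.
        rev1 = t1[i][0] if i < len(t1) else float("inf")
        rev2 = t2[j][0] if j < len(t2) else float("inf")
        rev = min(rev1, rev2)
        # Advance whichever table (possibly both) holds this revision.
        if rev1 == rev:
            cur1 = t1[i][1:]
            i += 1
        if rev2 == rev:
            cur2 = t2[j][1:]
            j += 1
        merged.append((rev, *cur1, *cur2))
    return merged


table1 = [(1, 100, "A"), (4, 150, "A"), (7, 100, "Z")]
table2 = [(1, 200, True), (5, 0, True), (8, 0, False)]

for row in merge_histories(table1, table2):
    print(row)
```

Each loop iteration consumes at least one input row, so the work is linear in the combined length of the two tables. Always advancing the side with the smaller next revision is what guarantees that no revision is skipped.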