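I have a number of tables that follow this fairly common pattern: A <-->> B.
I want to find matching pairs of rows in table A where certain columns have equal values and whose referencing rows in B also have equal values in certain columns. In other words, a pair (R, S) in A matches iff, for given columns {a1, a2, ..., an} in A and {b1, b2, ..., bn} in B: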
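(I'm not very familiar with relational algebra, so my definition above may not follow any convention.)
The approach I came up with is:
However, the query I wrote (below) for steps 2 and 3, which finds the matching rows in B, is quite complicated. Is there a better solution?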
-- Tables similar to those that I have.
CREATE TABLE a (
    id INTEGER PRIMARY KEY,
    data TEXT
);
CREATE TABLE b (
    id INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES a (id),
    data TEXT
);
SELECT DISTINCT dup.lhs_parent_id, dup.rhs_parent_id
FROM (
    SELECT DISTINCT
        MIN(lhs.a_id, rhs.a_id) AS lhs_parent_id, -- Normalize.
        MAX(lhs.a_id, rhs.a_id) AS rhs_parent_id,
        COUNT(*) AS count
    FROM b lhs
    INNER JOIN b rhs USING (data)
    WHERE NOT (lhs.id = rhs.id OR lhs.a_id = rhs.a_id) -- Remove self-matching rows and duplicate values with the same parent.
    GROUP BY lhs.a_id, rhs.a_id
) dup
INNER JOIN ( -- Check that lhs has the same number of rows.
    SELECT
        a_id AS parent_id,
        COUNT(*) AS count
    FROM b
    GROUP BY a_id
) lhs_ct ON (
    dup.lhs_parent_id = lhs_ct.parent_id AND
    dup.count = lhs_ct.count
)
INNER JOIN ( -- Check that rhs has the same number of rows.
    SELECT
        a_id AS parent_id,
        COUNT(*) AS count
    FROM b
    GROUP BY a_id
) rhs_ct ON (
    dup.rhs_parent_id = rhs_ct.parent_id AND
    dup.count = rhs_ct.count
);
-- Test data.
-- Expected query result is three rows with values (1, 2), (1, 3) and (2, 3) for a_id,
-- since the first three rows (with values 'row 1', 'row 2' and 'row 3')
-- have referencing rows, each of which has a matching pair. The fourth row
-- ('row 4') only has one referencing row with the value 'foo', so it doesn't have a
-- pair for the referenced rows with the value 'bar'.
INSERT INTO a (id, data) VALUES
    (1, 'row 1'),
    (2, 'row 2'),
    (3, 'row 3'),
    (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'),
    (2, 1, 'bar'),
    (3, 2, 'foo'),
    (4, 2, 'bar'),
    (5, 3, 'foo'),
    (6, 3, 'bar'),
    (7, 4, 'foo');
I'm using SQLite.
Answer 0 (score: 1)
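To find matching and differing rows, it is easier to use the INTERSECT and MINUS operations (SQLite spells MINUS as EXCEPT) and then join...
But when only one field is actually being compared, a JOIN solution looks better: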
SELECT b1.a_id, b2.a_id
FROM (
    SELECT data, a_id, COUNT(id) AS a_count
    FROM b
    GROUP BY data, a_id
) b1
INNER JOIN (
    SELECT data, a_id, COUNT(id) AS a_count
    FROM b
    GROUP BY data, a_id
) b2 ON b1.data = b2.data AND b1.a_count = b2.a_count AND b1.a_id <> b2.a_id;
As I understand it, you need to find pairs of distinct a_id values that have the same data and the same count of that data.
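For comparison, the INTERSECT/MINUS idea mentioned at the start of this answer could be sketched roughly as follows. This is only an illustration, not the asker's or answerer's actual query: it assumes the question's schema and test data, drives SQLite through Python's standard sqlite3 module, and uses a hypothetical name (SET_EQUAL_PAIRS) for the query. Two parents are treated as matching when neither one's set of data values has anything left after subtracting the other's with EXCEPT. Because EXCEPT works on sets, repeated values under one parent are not counted.

```python
import sqlite3

# Schema and test data copied from the question.
SCHEMA_AND_DATA = """
CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE b (
    id   INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES a (id),
    data TEXT
);
INSERT INTO a (id, data) VALUES
    (1, 'row 1'), (2, 'row 2'), (3, 'row 3'), (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'), (2, 1, 'bar'),
    (3, 2, 'foo'), (4, 2, 'bar'),
    (5, 3, 'foo'), (6, 3, 'bar'),
    (7, 4, 'foo');
"""

# Hypothetical query: x and y match when x has at least one child row and
# the two sets of child data values are equal (both differences are empty).
# The condition x.id < y.id reports each pair once, smaller id first.
SET_EQUAL_PAIRS = """
SELECT x.id AS lhs_parent_id, y.id AS rhs_parent_id
FROM a AS x
JOIN a AS y ON x.id < y.id
WHERE EXISTS (SELECT 1 FROM b WHERE a_id = x.id)
  AND NOT EXISTS (
        SELECT data FROM b WHERE a_id = x.id
        EXCEPT
        SELECT data FROM b WHERE a_id = y.id)
  AND NOT EXISTS (
        SELECT data FROM b WHERE a_id = y.id
        EXCEPT
        SELECT data FROM b WHERE a_id = x.id)
ORDER BY x.id, y.id;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_AND_DATA)
pairs = conn.execute(SET_EQUAL_PAIRS).fetchall()
conn.close()
print(pairs)
```

The correlated EXCEPT subqueries run once per candidate pair, so this form is likely to be slower than a join on large tables; its appeal is that it states the "same set of values" condition directly.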
My join query returns each possible pair in both directions, which leaves room for optimization using SQLite-specific syntax.
Example result: {1,2}, {1,3}, {2,1}, {2,3}, {3,2}, {3,1}
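One possible way to use that room for optimization is sketched below, again as an assumption-laden illustration rather than the answerer's own code. It keeps the per-value join, but orders each pair with a_id < a_id so it appears once, and requires that the number of matched values equals the number of distinct values on both sides. Without that last check, a join on one data value at a time would also pair a_id 4 with the others through 'foo' on the question's test data. The CTE names (grp, tot) and column names (n_rows, n_values) are hypothetical, and the sketch runs SQLite through Python's standard sqlite3 module.

```python
import sqlite3

# Schema and test data copied from the question (table a is not needed here).
SCHEMA_AND_DATA = """
CREATE TABLE b (
    id   INTEGER PRIMARY KEY,
    a_id INTEGER,
    data TEXT
);
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'), (2, 1, 'bar'),
    (3, 2, 'foo'), (4, 2, 'bar'),
    (5, 3, 'foo'), (6, 3, 'bar'),
    (7, 4, 'foo');
"""

# grp: one row per (parent, value) with how often the value occurs.
# tot: number of distinct values per parent.
# A pair is kept only when every value of both parents found a partner
# with the same occurrence count.
NORMALIZED_PAIRS = """
WITH grp AS (
    SELECT a_id, data, COUNT(*) AS n_rows
    FROM b
    GROUP BY a_id, data
),
tot AS (
    SELECT a_id, COUNT(*) AS n_values
    FROM grp
    GROUP BY a_id
)
SELECT g1.a_id AS lhs_parent_id, g2.a_id AS rhs_parent_id
FROM grp AS g1
JOIN grp AS g2
  ON g1.data = g2.data
 AND g1.n_rows = g2.n_rows
 AND g1.a_id < g2.a_id          -- report each pair once
JOIN tot AS t1 ON t1.a_id = g1.a_id
JOIN tot AS t2 ON t2.a_id = g2.a_id
GROUP BY g1.a_id, g2.a_id, t1.n_values, t2.n_values
HAVING COUNT(*) = t1.n_values
   AND COUNT(*) = t2.n_values
ORDER BY g1.a_id, g2.a_id;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_AND_DATA)
pairs = conn.execute(NORMALIZED_PAIRS).fetchall()
conn.close()
print(pairs)
```

Unlike a pure set comparison, this version also distinguishes parents whose values occur a different number of times, since n_rows takes part in the join condition.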