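Edited for clarity. Apologies for the confusion in the original example.
I have the following table structure representing married couples: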
id | Person | Spouse
______________________
1 | Mary | John
2 | John | Mary
3 | Katy | Bob
4 | Bob | Katy
5 | Mary | John
6 | John | Mary
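In this example, Mary is married to John, Katy is married to Bob, and a different Mary is married to a different John.
How can I retrieve the married couples?
I have gotten close with this: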
SELECT
p.id id1,
q.id id2
FROM
people p
INNER JOIN people q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.id < q.id
ORDER BY p.id
But this returns:
1 | 2 (1st Mary & 1st John)
1 | 6 (1st Mary & 2nd John) *problem*
2 | 5 (1st John & 2nd Mary) *problem*
3 | 4 (Katy & Bob)
5 | 6 (2nd Mary & 2nd John)
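How can I make sure that the first Mary and the first John are married only once (i.e. remove the problem rows above)?
Many thanks.
Here is the SQL to create the example: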
CREATE TABLE people
(`id` int, `person` varchar(7), `spouse` varchar(7))
;
INSERT INTO people
(`id`, `person`, `spouse`)
VALUES
(1, 'Mary', 'John'),
(2, 'John', 'Mary'),
(3, 'Katy', 'Bob'),
(4, 'Bob', 'Katy'),
(5, 'Mary', 'John'),
(6, 'John', 'Mary')
;
SELECT
p.id id1,
q.id id2
FROM
people p
INNER JOIN people q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.id < q.id
ORDER BY p.id
;
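For anyone who wants to reproduce this without a MySQL server, here is a minimal sketch that loads the same rows into an in-memory SQLite database from Python and runs the query above. SQLite and the `sqlite3` module are a substitution on my part, not part of the original setup; the query is unchanged apart from an extra sort key to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL fiddle (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT)")
conn.executemany(
    "INSERT INTO people (id, person, spouse) VALUES (?, ?, ?)",
    [(1, "Mary", "John"), (2, "John", "Mary"), (3, "Katy", "Bob"),
     (4, "Bob", "Katy"), (5, "Mary", "John"), (6, "John", "Mary")],
)

# The self-join from the question: match each row to every mirrored row
# with a higher id. Names alone cannot tell the two Marys apart.
rows = conn.execute(
    """
    SELECT p.id AS id1, q.id AS id2
    FROM people p
    INNER JOIN people q
        ON p.person = q.spouse
       AND q.person = p.spouse
       AND p.id < q.id
    ORDER BY p.id, q.id
    """
).fetchall()

print(rows)  # [(1, 2), (1, 6), (2, 5), (3, 4), (5, 6)]
```

The pairs (1, 6) and (2, 5) are the unwanted cross matches between the two couples.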
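Answer 0 (score: 1)
Nothing in the data structure you show can distinguish the two "Marys", because there is no difference between them. (The original example read: "In this example, Mary is married to John, Katy is married to Bob, and another Mary is married to Richard.")
Both are just the text literal Mary.
If you want to distinguish different people who may share the same name, you need another criterion, and a unique one (for example, the id of each person's database record).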
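As a rough illustration of that suggestion, the sketch below stores the spouse's row id instead of the spouse's name. The `spouse_id` column is hypothetical (it is not in the question's table), and SQLite via Python is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: spouse_id holds the id of the spouse's row,
# so two people named Mary are no longer interchangeable.
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, person TEXT, spouse_id INTEGER)"
)
conn.executemany(
    "INSERT INTO people (id, person, spouse_id) VALUES (?, ?, ?)",
    [(1, "Mary", 2), (2, "John", 1), (3, "Katy", 4),
     (4, "Bob", 3), (5, "Mary", 6), (6, "John", 5)],
)

# Each couple appears once: the join follows ids in both directions,
# and p.id < q.id keeps only one orientation of each pair.
couples = conn.execute(
    """
    SELECT p.id AS id1, q.id AS id2
    FROM people p
    JOIN people q
        ON q.id = p.spouse_id
       AND q.spouse_id = p.id
    WHERE p.id < q.id
    ORDER BY p.id
    """
).fetchall()

print(couples)  # [(1, 2), (3, 4), (5, 6)]
```

With a unique identifier in the relationship, no ranking or counting tricks are needed to keep the couples apart.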
Answer 1 (score: 1)
I would give this a try:
SELECT
    p.id AS id1,
    q.id AS id2
FROM
    people AS p
    JOIN people AS q ON
        p.person = q.spouse AND
        q.person = p.spouse AND
        p.id < q.id
    -- x: position of row p among rows with the same (person, spouse).
    -- The alias is rnk because RANK is a reserved word in MySQL 8.0+.
    JOIN (SELECT
              p.id, COUNT(*) AS rnk
          FROM
              people AS p
              INNER JOIN people AS p2 ON
                  p.person = p2.person AND
                  p.spouse = p2.spouse AND
                  p.id >= p2.id
          GROUP BY p.id
         ) AS x ON
        x.id = p.id
    -- y: the same position, computed for row q.
    JOIN (SELECT
              p.id, COUNT(*) AS rnk
          FROM
              people AS p
              INNER JOIN people AS p2 ON
                  p.person = p2.person AND
                  p.spouse = p2.spouse AND
                  p.id >= p2.id
          GROUP BY p.id
         ) AS y ON
        y.id = q.id AND
        y.rnk = x.rnk ;
And another one:
SELECT
p.id AS id1,
q.id AS id2
FROM
people AS p
JOIN people AS q ON
p.person = q.spouse AND
q.person = p.spouse
JOIN people AS p2 ON
p.person = p2.person AND
p.spouse = p2.spouse AND
p.id >= p2.id
JOIN people AS q2 ON
q.person = q2.person AND
q.spouse = q2.spouse AND
q.id >= q2.id
WHERE
p.id < q.id
GROUP BY
p.id, q.id
HAVING
COUNT(DISTINCT p2.id) = COUNT(DISTINCT q2.id) ;
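Both were tested at SQL-Fiddle.
If only MySQL had window functions (like almost every other DBMS), this would be much simpler. (MySQL has supported them since version 8.0.) Tested at Postgres fiddle: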
WITH cte AS
( SELECT
id, person, spouse,
ROW_NUMBER() OVER( PARTITION BY person, spouse
ORDER BY id )
AS rn
FROM
people
)
SELECT
p.id AS id1,
q.id AS id2
FROM
cte AS p
JOIN cte AS q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.rn = q.rn AND
p.id < q.id ;
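To show that the window-function version and the counting versions rest on the same idea, here is a sketch that computes `rn` with a correlated `COUNT(*)` instead of `ROW_NUMBER()`, so it also runs on engines without window functions. Running it through Python's `sqlite3` module is my assumption for the sake of a self-contained example; the pairing logic is the CTE join from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT)")
conn.executemany(
    "INSERT INTO people (id, person, spouse) VALUES (?, ?, ?)",
    [(1, "Mary", "John"), (2, "John", "Mary"), (3, "Katy", "Bob"),
     (4, "Bob", "Katy"), (5, "Mary", "John"), (6, "John", "Mary")],
)

# rn = how many rows with the same (person, spouse) have an id <= this one,
# i.e. the row's position within its group, ordered by id.
pairs = conn.execute(
    """
    WITH cte AS (
        SELECT p.id, p.person, p.spouse,
               (SELECT COUNT(*)
                FROM people AS p2
                WHERE p2.person = p.person
                  AND p2.spouse = p.spouse
                  AND p2.id <= p.id) AS rn
        FROM people AS p
    )
    SELECT p.id AS id1, q.id AS id2
    FROM cte AS p
    JOIN cte AS q
        ON p.person = q.spouse
       AND q.person = p.spouse
       AND p.rn = q.rn          -- 1st Mary with 1st John, 2nd with 2nd
       AND p.id < q.id
    ORDER BY p.id
    """
).fetchall()

print(pairs)  # [(1, 2), (3, 4), (5, 6)]
```

Note that this pairs rows purely by their order of appearance. It gives the wanted result for the sample data, but it is a heuristic: nothing in the table guarantees that the first Mary really belongs with the first John.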
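Answer 2 (score: 0)
Your database design is wrong.
People such as Mary, John, and so on have no identity.
Some heuristic query might help, but it is not a reliable solution.
So, please improve your data structure.
Answer 3 (score: -1)
Not very elegant, but it works: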
-- Column names follow the question's table (person, spouse).
SELECT p.id AS id1, q.id AS id2
FROM people p
INNER JOIN people q ON
p.person = q.spouse AND
q.person = p.spouse
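It effectively uses the existence of the reversed row as the selector.
Answer 4 (score: -1)
There are many ways to do this, but one of the most important reasons to use a database is that it holds a lot of data, and you will rarely write a query that retrieves all of it. Except in very special circumstances, and homework, the results should be filtered by some criterion. So the most appropriate solution depends on what else you later add to the query.
But here are a couple of examples of how to get unique pairings: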
SELECT a, b, GROUP_CONCAT(id)
FROM (SELECT id
, IF(person>=spouse, person, spouse) AS a
, IF(person>=spouse, spouse, person) AS b
FROM yourtable ) AS pairs
GROUP BY a,b;
SELECT id, person, spouse
FROM yourtable s1
WHERE NOT EXISTS ( SELECT 1
FROM yourtable s2
WHERE s2.id>s1.id
AND s1.person=s2.spouse
AND s1.spouse=s2.person);
(There are some other solutions as well.)
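To see what the first of these queries does with the question's sample rows, here is a small sketch run against SQLite from Python. Two things are assumptions on my part: the table is named `people` as in the question rather than `yourtable`, and `CASE` replaces MySQL's `IF()` because SQLite is used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT)")
conn.executemany(
    "INSERT INTO people (id, person, spouse) VALUES (?, ?, ?)",
    [(1, "Mary", "John"), (2, "John", "Mary"), (3, "Katy", "Bob"),
     (4, "Bob", "Katy"), (5, "Mary", "John"), (6, "John", "Mary")],
)

# Put each pair of names into a canonical order (larger name first),
# then group by that ordered pair and collect the ids.
rows = conn.execute(
    """
    SELECT a, b, GROUP_CONCAT(id)
    FROM (SELECT id,
                 CASE WHEN person >= spouse THEN person ELSE spouse END AS a,
                 CASE WHEN person >= spouse THEN spouse ELSE person END AS b
          FROM people) AS pairs
    GROUP BY a, b
    ORDER BY a, b
    """
).fetchall()

# GROUP_CONCAT does not promise an order, so sort the ids explicitly.
result = {(a, b): sorted(int(i) for i in ids.split(",")) for a, b, ids in rows}
print(result)  # {('Katy', 'Bob'): [3, 4], ('Mary', 'John'): [1, 2, 5, 6]}
```

This yields one row per distinct pair of names, so both Mary/John couples end up in the same group. It finds unique name pairings, but it does not separate two couples who share the same names.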