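SQL: how to select unique one-to-one pairings

Time: 2013-03-16 22:34:46

Tags: mysql sql

Edited for clarity; many apologies for the confusion in the original example.

I have the following table structure representing married couples: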

id | Person | Spouse
______________________
1  | Mary   | John
2  | John   | Mary
3  | Katy   | Bob
4  | Bob    | Katy
5  | Mary   | John
6  | John   | Mary

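In this example, Mary is married to John, Katy is married to Bob, and a different Mary is married to another John.

How can I retrieve the married couples?

I have come close with this: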

SELECT 
  p.id id1,
  q.id id2
FROM 
  people p 
  INNER JOIN people q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND 
    p.id < q.id
ORDER BY p.id

But this returns:

1 | 2 (1st Mary & 1st John)
1 | 6 (1st Mary & 2nd John) *problem*
2 | 5 (1st John & 2nd Mary) *problem*
3 | 4 (Katy & Bob)
5 | 6 (2nd Mary & 2nd John)

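How can I make sure the first Mary and the first John are only married once (i.e. remove the problem rows above)?

Many thanks

Here is the SQL to create the example: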

CREATE TABLE people
    (`id` int, `person` varchar(7), `spouse` varchar(7))
;

INSERT INTO people
    (`id`, `person`, `spouse`)
VALUES
    (1, 'Mary', 'John'),
    (2, 'John', 'Mary'),
    (3, 'Katy', 'Bob'),
    (4, 'Bob', 'Katy'),
    (5, 'Mary', 'John'),
    (6, 'John', 'Mary')
;

SELECT 
  p.id id1,
  q.id id2
FROM 
  people p 
  INNER JOIN people q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND 
    p.id < q.id
ORDER BY p.id
;

5 Answers:

Answer 0: (score: 1)

"In this example, Mary is married to John, Katy is married to Bob, and another Mary is married to Richard."

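Nothing in the data structure you show can distinguish the two "Marys", because there is no difference between them.

Both are just the text literal Mary. If you want to distinguish different people who may share the same name, you need another criterion, and it has to be a unique one. (For example, the id of each person's database record.)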
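A minimal sketch of that suggestion, run through Python's built-in sqlite3 module so it can be tried without a MySQL server. The `spouse_id` column and the particular assignment of which Mary is married to which John are assumptions made for illustration; the original table holds no such information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    id        INTEGER PRIMARY KEY,
    person    TEXT NOT NULL,
    spouse_id INTEGER REFERENCES people(id)  -- hypothetical column, not in the question
);
-- Assumed pairing: row 1 with row 2, row 3 with row 4, row 5 with row 6.
INSERT INTO people (id, person, spouse_id) VALUES
    (1, 'Mary', 2), (2, 'John', 1),
    (3, 'Katy', 4), (4, 'Bob', 3),
    (5, 'Mary', 6), (6, 'John', 5);
""")

# Each couple appears once: the join follows ids instead of names,
# and p.id < q.id drops the mirrored row.
couples = conn.execute("""
    SELECT p.id, q.id
    FROM people AS p
    JOIN people AS q ON q.id = p.spouse_id AND q.spouse_id = p.id
    WHERE p.id < q.id
    ORDER BY p.id
""").fetchall()
print(couples)
```

With the spouse referenced by id, two people sharing a name no longer collide, because the names are never compared.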

Answer 1: (score: 1)

I'll give it a try:

SELECT
  p.id AS id1,
  q.id AS id2
FROM
  people AS p 
  JOIN people AS q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND 
    p.id < q.id
  -- rnk numbers the rows that share the same (person, spouse), ordered by id.
  -- The alias is rnk because RANK is a reserved word in MySQL 8.0 and later.
  JOIN (SELECT 
          p.id, COUNT(*) AS rnk
        FROM 
          people AS p 
          INNER JOIN people AS p2 ON
            p.person = p2.person AND 
            p.spouse = p2.spouse AND 
            p.id >= p2.id
        GROUP BY p.id
       ) AS x ON
    x.id = p.id
  JOIN (SELECT 
          p.id, COUNT(*) AS rnk
        FROM 
          people AS p 
          INNER JOIN people AS p2 ON
            p.person = p2.person AND 
            p.spouse = p2.spouse AND 
            p.id >= p2.id
        GROUP BY p.id
       ) AS y ON
    y.id = q.id AND
    y.rnk = x.rnk ;

And another one:

SELECT
  p.id AS id1,
  q.id AS id2
FROM
  people AS p 
  JOIN people AS q ON
    p.person = q.spouse AND 
    q.person = p.spouse
  JOIN people AS p2 ON
    p.person = p2.person AND 
    p.spouse = p2.spouse AND 
    p.id >= p2.id
  JOIN people AS q2 ON
    q.person = q2.person AND 
    q.spouse = q2.spouse AND 
    q.id >= q2.id
WHERE 
    p.id < q.id
GROUP BY 
    p.id, q.id
HAVING 
    COUNT(DISTINCT p2.id) = COUNT(DISTINCT q2.id) ;

Both were tested on SQL-Fiddle.

If only MySQL had window functions (as almost every other DBMS does), this would be much simpler. Tested in a Postgres fiddle:

WITH cte AS
  ( SELECT
        id, person, spouse, 
        ROW_NUMBER() OVER( PARTITION BY person, spouse 
                           ORDER BY id )
           AS rn
    FROM
        people
  ) 
SELECT
    p.id AS id1,
    q.id AS id2 
FROM
  cte AS p
  JOIN cte AS q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND
    p.rn = q.rn AND
    p.id < q.id ;
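The idea shared by all three queries is to number the rows within each identical (person, spouse) group by id, then pair only rows that carry the same number. Here is a rough, runnable sketch of that idea using Python's built-in sqlite3 module. It computes the number with a correlated subquery rather than ROW_NUMBER(), so it does not depend on window-function support; the name `RANKED` is just a helper introduced here. (MySQL 8.0 and later do support ROW_NUMBER() and CTEs, so the Postgres form should also be usable there.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT);
INSERT INTO people VALUES
    (1, 'Mary', 'John'), (2, 'John', 'Mary'), (3, 'Katy', 'Bob'),
    (4, 'Bob', 'Katy'),  (5, 'Mary', 'John'), (6, 'John', 'Mary');
""")

# rn = position of a row among rows with the same (person, spouse), by id.
RANKED = """
    SELECT a.id, a.person, a.spouse,
           (SELECT COUNT(*)
              FROM people AS b
             WHERE b.person = a.person
               AND b.spouse = a.spouse
               AND b.id <= a.id) AS rn
    FROM people AS a
"""

# Pair the n-th "Mary/John" row with the n-th "John/Mary" row, and so on.
pairs = conn.execute(f"""
    SELECT p.id, q.id
    FROM ({RANKED}) AS p
    JOIN ({RANKED}) AS q
      ON p.person = q.spouse
     AND q.person = p.spouse
     AND p.rn = q.rn
     AND p.id < q.id
    ORDER BY p.id
""").fetchall()
print(pairs)
```

Note that this matches couples purely by the order in which their rows were inserted; it is a heuristic, not something the data itself guarantees.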

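Answer 2: (score: 0)

Your database constraints are wrong.

People like Mary, John and so on have no identity.

Some heuristic query might help, but it is not a reliable solution.

So, please improve your data structure.

Answer 3: (score: -1)

Not very elegant, but it works: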

SELECT p.id, q.id
FROM people p
INNER JOIN people q ON
p.person = q.spouse and 
q.person = p.spouse

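In effect it uses the existence of the reversed row as the selector.

Answer 4: (score: -1)

There are many ways to do this, but one of the most important reasons for using a database is that it holds a lot of data, and you will rarely write a query that retrieves all of it. Except in very special circumstances, and homework, the results should be filtered by some criteria. So the most appropriate solution depends on what else you later add to the query.

But here are a couple of examples of how to get unique pairings: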

SELECT a, b, GROUP_CONCAT(id)
FROM (SELECT id
, IF(person>=spouse, person, spouse) as a
, IF(person>=spouse, spouse, person) as b
FROM yourtable ) AS pairs
GROUP BY a,b;

SELECT id, person, spouse
FROM yourtable s1
WHERE NOT EXISTS ( SELECT 1
    FROM yourtable s2
    WHERE s2.id>s1.id
    AND s1.person=s2.spouse
    AND s1.spouse=s2.person);

(There are other solutions as well.)
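For reference, here is a small sketch of what those two queries return on the question's sample data, again via Python's sqlite3 module. Two substitutions are assumptions made so it runs there: the question's `people` table stands in for `yourtable`, and `CASE` stands in for MySQL's `IF()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT);
INSERT INTO people VALUES
    (1, 'Mary', 'John'), (2, 'John', 'Mary'), (3, 'Katy', 'Bob'),
    (4, 'Bob', 'Katy'),  (5, 'Mary', 'John'), (6, 'John', 'Mary');
""")

# Query 1: one group per unordered pair of names, with the ids in that group.
grouped = conn.execute("""
    SELECT a, b, GROUP_CONCAT(id)
    FROM (SELECT id,
                 CASE WHEN person >= spouse THEN person ELSE spouse END AS a,
                 CASE WHEN person >= spouse THEN spouse ELSE person END AS b
          FROM people) AS pairs
    GROUP BY a, b
""").fetchall()
# GROUP_CONCAT order is not guaranteed, so sort the ids in Python.
by_pair = {(a, b): sorted(int(i) for i in ids.split(","))
           for a, b, ids in grouped}

# Query 2: keep a row only if no later row is its mirror image.
last_rows = [row[0] for row in conn.execute("""
    SELECT s1.id
    FROM people AS s1
    WHERE NOT EXISTS (SELECT 1
                      FROM people AS s2
                      WHERE s2.id > s1.id
                        AND s1.person = s2.spouse
                        AND s1.spouse = s2.person)
    ORDER BY s1.id
""")]
print(by_pair)
print(last_rows)
```

Both queries deduplicate by name, so the two Mary/John couples collapse into a single pairing. That fits "unique name pairs" but not the question's requirement of keeping the two couples apart.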