Select matching pairs from two tables

Time: 2013-04-17 05:08:41

Tags: mysql sql

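I need to select matching pairs from two tables that contain similarly structured data. A "matching pair" here means two rows that reference each other in the "matchid" column.

Example of a matching pair in a single table: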

TABLE
----
id | matchid
1  |   2
2  |   1

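IDs 1 and 2 are a matching pair because each has an entry pointing to the other.

Now the real question: what is the best (fastest) way to select the matching pairs that appear in both of these tables?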

Table ONE (id, matchid)
Table TWO (id, matchid)

Sample data:

ONE                TWO
----               ----
id  | matchid      id  | matchid
1   |   2          2   |   3
2   |   3          3   |   2
3   |   2
4   |   5
5   |   4

The desired result is a single row with IDs 2 and 3.

RESULT
----
id  | id
2   | 3

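This is because 2 and 3 are a matching pair in both table ONE and table TWO. 4 and 5 are a matching pair in ONE but not in TWO, so we don't select them. 1 and 2 are not a matching pair at all, because 2 has no entry pointing back to 1.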
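For anyone who wants to try the queries in this thread, the sample data above can be loaded with a script along these lines. The column types and the primary key on id are assumptions, since only the column names are given here. Table names can be case-sensitive in MySQL depending on the server's lower_case_table_names setting, so the sketch sticks to the spelling ONE and TWO.

```sql
-- Assumed schema: only the column names come from the question.
CREATE TABLE ONE (id INT PRIMARY KEY, matchid INT NOT NULL);
CREATE TABLE TWO (id INT PRIMARY KEY, matchid INT NOT NULL);

-- Sample rows exactly as listed in the tables above.
INSERT INTO ONE (id, matchid) VALUES (1, 2), (2, 3), (3, 2), (4, 5), (5, 4);
INSERT INTO TWO (id, matchid) VALUES (2, 3), (3, 2);
```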

I can get the matching pairs from a single table with:

SELECT a.id, b.id 
    FROM ONE a JOIN ONE b
       ON a.id = b.matchid AND a.matchid = b.id
    WHERE a.id < b.id

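How should I build a query that selects only the matching pairs that appear in both tables?

Should I:

  • Run the query above for each table and combine the results?
  • Run the query above for each table and join the results together?
  • Run the query above, then join table TWO twice, once on 'id' and once on 'matchid'?
  • Run the query above for each table and loop over the results to compare them in PHP?
  • Somehow filter the tables down so that we only need to look at the IDs of matching pairs in table ONE?
  • Do something completely different?

(Since this is a question of efficiency, it's worth noting that matches will be very sparse, perhaps 1 in 1,000 or fewer, and each table will have 100,000+ rows.)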

3 Answers:

Answer 0: (score: 1)

I think I understand what you mean. You want to keep only the records for pairs that exist in both tables.

SELECT  LEAST(a.ID, a.MatchID) ID, GREATEST(a.ID, a.MatchID) MatchID
FROM    One a
        INNER JOIN Two b
            ON a.ID = b.ID AND
                a.matchID = b.matchID
GROUP   BY LEAST(a.ID, a.MatchID), GREATEST(a.ID, a.MatchID)
HAVING  COUNT(*) > 1
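Given the table sizes mentioned in the question (100,000+ rows each), it may be worth checking how the join on (ID, MatchID) is executed. The sketch below is only a suggestion: the index names are hypothetical, and whether the indexes help at all depends on the keys the tables already have, which the question doesn't state. Note also that HAVING COUNT(*) > 1 assumes each (ID, MatchID) row occurs at most once per table.

```sql
-- Hypothetical composite indexes covering both join columns.
CREATE INDEX idx_one_pair ON ONE (id, matchid);
CREATE INDEX idx_two_pair ON TWO (id, matchid);

-- EXPLAIN shows whether the optimizer actually uses them for this query.
EXPLAIN
SELECT  LEAST(a.id, a.matchid) AS ID, GREATEST(a.id, a.matchid) AS MatchID
FROM    ONE a
        INNER JOIN TWO b
            ON a.id = b.id AND
               a.matchid = b.matchid
GROUP   BY LEAST(a.id, a.matchid), GREATEST(a.id, a.matchid)
HAVING  COUNT(*) > 1;
```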

Answer 1: (score: 0)

Try this query:

    -- CONCAT is used because + is numeric addition in MySQL, not concatenation
    select 
    O.id,
    O.matchid
    from 
    ONE O
    where 
    CONCAT(CAST(O.id as CHAR(50)), '~', CAST(O.matchid as CHAR(50)))
    in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)

Edited query:

-- CONCAT is used because + is numeric addition in MySQL, not concatenation
select distinct
Least(O.id,O.matchid) ID,
Greatest(O.id,O.matchid) MatchID
from 
ONE O
where 
CONCAT(CAST(O.id as CHAR(50)), '~', CAST(O.matchid as CHAR(50)))
in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)
and CONCAT(CAST(O.matchid as CHAR(50)), '~', CAST(O.id as CHAR(50)))
in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)

SQL Fiddle
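As a possible alternative to building '~'-separated strings, MySQL can compare the two columns directly with a row constructor. This is a sketch of the same logic as the edited query, not a tested replacement. Like that query, it only checks that both directions of the pair exist in TWO; it does not check that the reverse row also exists in ONE.

```sql
SELECT DISTINCT
       LEAST(O.id, O.matchid)    AS ID,
       GREATEST(O.id, O.matchid) AS MatchID
FROM   ONE O
-- the row from ONE must appear in TWO as-is ...
WHERE  (O.id, O.matchid) IN (SELECT T.id, T.matchid FROM TWO T)
-- ... and TWO must also contain the reversed row
  AND  (O.matchid, O.id) IN (SELECT T.id, T.matchid FROM TWO T);
```

Comparing the integer columns directly avoids the casts and leaves the optimizer free to use an index on those columns if one exists.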

Answer 2: (score: 0)

Naive version, which checks that all four required rows exist:

-- EXPLAIN ANALYZE
WITH both_one AS (
        SELECT o.id, o.matchid
        FROM one o
        WHERE o.id < o.matchid
        AND EXISTS ( SELECT * FROM one x WHERE x.id = o.matchid AND x.matchid = o.id)
        )
, both_two AS (
        SELECT t.id, t.matchid
        FROM two t
        WHERE t.id < t.matchid
        AND EXISTS ( SELECT * FROM two x WHERE x.id = t.matchid AND x.matchid = t.id)
        )
SELECT *
FROM both_one oo
WHERE EXISTS (
        SELECT *
        FROM both_two tt
        WHERE tt.id = oo.id AND tt.matchid = oo.matchid
        );

This one is simpler:

-- EXPLAIN ANALYZE
WITH pair AS (
        SELECT o.id, o.matchid
        FROM one o
        WHERE EXISTS ( SELECT * FROM two x WHERE x.id = o.id AND x.matchid = o.matchid)
        )
SELECT *
FROM pair pp
WHERE EXISTS (
        SELECT *
        FROM pair xx
        WHERE xx.id = pp.matchid AND xx.matchid = pp.id
        )
AND pp.id < pp.matchid
        ;