Question

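I have a table in Oracle with two columns, and I want to query for records containing unique combinations of values, regardless of the order of those values. For example, if I have the following table: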

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2)
);

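I want to query all unique relationships. So if I have a record with PERSON_1 = John and PERSON_2 = Jill, I don't want to see another record with PERSON_1 = Jill and PERSON_2 = John.

Is there an easy way to do this?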

Answer 1

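Does the relationship always exist in both directions? That is, if John and Jill are related, are there always both a {John, Jill} row and a {Jill, John} row? If so, just restrict to Person_1 < Person_2 and take the distinct set.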
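
A minimal sketch of that idea, assuming every pair really is stored in both directions (a pair that exists only as (2,1) would be missed by this query):

```sql
-- Keep only the row of each mirrored pair where the smaller id comes first.
-- Assumption: both (a,b) and (b,a) are always present in the table.
-- With the primary key on (PERSON_1, PERSON_2) the DISTINCT is redundant,
-- but it is harmless and matches the wording of the suggestion.
select distinct person_1, person_2
  from relationships
 where person_1 < person_2;
```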

Answer 2

select distinct
case when PERSON_1>=PERSON_2 then PERSON_1 ELSE PERSON_2 END person_a,
case when PERSON_1>=PERSON_2 then PERSON_2 ELSE PERSON_1 END person_b
FROM RELATIONSHIPS;

Answer 3

Untested:

select least(person_1,person_2)
     , greatest(person_1,person_2)
  from relationships
 group by least(person_1,person_2)
     , greatest(person_1,person_2)

To prevent such double entries in the first place, you can add a unique index based on the same idea (tested!):

SQL> create table relationships
  2  ( person_1 number not null
  3  , person_2 number not null
  4  , relationship number not null
  5  , constraint pk_relationships primary key (person_1, person_2)
  6  )
  7  /

Table created.

SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
  2  /

Index created.

SQL> insert into relationships values (1,2,0)
  2  /

1 row created.

SQL> insert into relationships values (1,3,0)
  2  /

1 row created.

SQL> insert into relationships values (2,1,0)
  2  /
insert into relationships values (2,1,0)
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated

Regards, Rob.

Answer 4

You should create a constraint on the Relationships table so that the numeric person_1 value must be less than the numeric person_2 value.

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP  number not null,
    constraint PK_RELATIONSHIPS
        primary key (PERSON_1, PERSON_2),
    constraint UNIQ_RELATIONSHIPS
        CHECK (PERSON_1 < PERSON_2)
);

This way you can be sure that (2,1) can never be inserted; it must be (1,2). Then your PRIMARY KEY constraint will prevent duplicates.
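
As an illustration, here is a sketch of how inserts might behave against the table above. The ids are made-up sample values, and normalizing with least/greatest on insert is one possible way to satisfy the constraint, not something the table enforces for you:

```sql
-- Normalize the pair order so CHECK (PERSON_1 < PERSON_2) is satisfied
-- no matter which order the application supplies the ids in.
insert into RELATIONSHIPS (PERSON_1, PERSON_2, RELATIONSHIP)
values (least(2, 1), greatest(2, 1), 0);   -- stored as (1, 2)

-- Inserting the raw reversed pair is rejected by the check constraint
-- (Oracle raises ORA-02290: check constraint violated).
insert into RELATIONSHIPS (PERSON_1, PERSON_2, RELATIONSHIP)
values (2, 1, 0);
```

Note that this constraint also rules out rows where both ids are equal.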

PS: I see that Marc Gravell answered faster than I did, with a similar solution.

Answer 5

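There is some ambiguity about whether you want to prevent duplicates from being inserted into the database. You may just want to fetch unique pairs while leaving the duplicates in place.

So here is an alternative solution for the latter case, querying unique pairs even when duplicates exist: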

SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
  OR  r2.person_1 IS NULL;

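So if there is a matching row with the ids reversed, there is a rule for which of the two the query should prefer (the one with the ids in numerical order).

If there is no matching row, then r2 will be NULL (that is how an outer join works), so in that case just use whatever is found in r1.

There is no need for GROUP BY or DISTINCT, because there can only be zero or one matching rows.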
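
To make the rule concrete, here is a small walk-through on made-up sample rows (the ids 1, 2 and 3 are purely illustrative and assume the table from the question):

```sql
-- Hypothetical sample data: the pair {1,2} is stored in both directions,
-- the pair {1,3} only once, as (3,1).
insert into Relationships values (1, 2, 0);
insert into Relationships values (2, 1, 0);
insert into Relationships values (3, 1, 0);

-- How the query above treats each r1 row:
--   (1,2): reverse row (2,1) exists, and 1 < 2          -> kept
--   (2,1): reverse row (1,2) exists, but 2 < 1 is false -> dropped
--   (3,1): no reverse row (1,3), so r2.person_1 IS NULL -> kept
-- Result: (1,2) and (3,1), one row per unordered pair.
```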

Trying this in MySQL, I get the following optimization plan:

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
|  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          | 
|  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index | 
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

That seems like a reasonable use of the index.

Answer 6

I think something like this would do the trick:

select * from RELATIONSHIPS group by PERSON_1, PERSON_2

Answer 7

I think I almost have it; I added concat.

SELECT DISTINCT *
    FROM (SELECT DISTINCT concat(Person_1,Person_2) FROM RELATIONSHIPS
          UNION 
          SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
         ) dt

Answer 8

It's clunky, but it will at least tell you which unique combinations you have, just not in a really convenient form...

select distinct(case when person_1 <= person_2 then person_1||'|'||person_2 else person_2||'|'||person_1 end)
from relationships;

Answer 9

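Probably the simplest solution (one that does not require changing the data structure or creating triggers) is to build a result set of the entries that have no mirrored duplicate, and then add one entry from each duplicated pair to that set.

It would look something like: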

 select * from relationships where rowid not in 
    (select a.rowid from  relationships a,relationships b 
       where a.person_1=b.person_2 and a.person_2=b.person_1)
union all
 select * from relationships where rowid in 
    (select a.rowid from  relationships a,relationships b where 
       a.person_1=b.person_2 and a.person_2=b.person_1 and a.person_1>a.person_2)

But as a rule I never create tables without a single-column primary key.

Answer 10

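You can do this:

with rel as (
select r.*,   -- Oracle needs a table alias to combine * with other columns
       -- number the rows within each unordered pair;
       -- Oracle requires an ORDER BY for row_number()
       row_number() over (partition by least(person_1,person_2),
                                       greatest(person_1,person_2)
                          order by person_1) as rn
  from relationships r
       )
select *
  from rel
 where rn = 1;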

Filter a SQL query by a unique set of column values, regardless of their order