Question

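I have a table with a column of unique IDs and a column holding each person's spouse ID (if they have a spouse). The problem is that every spouse ID also appears in the unique ID column, so when I pull a list and try to treat each couple as a single unit, I often end up double counting the couple.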

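Is there a good, efficient way to take a given list of unique IDs, check whether each person's spouse is also in that same list, and return only one unique ID per couple?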

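The problem is slightly more complicated because sometimes the two spouses are not both in the same list, so it is not as simple as keeping one person whenever someone is married. If the spouse is not in the same list, I want to make sure I keep the one who is. I also want to make sure I keep everyone who has a NULL value in the spouse ID column.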

A subset of the table in question:

Unique_ID      Spouse_ID
    1              2
    2              1
    3             NULL
    4             NULL
    5              10
    6              25
    7             NULL
    8              9
    9              8
   10              5

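In this excerpt, IDs 3, 4, and 7 are all single. IDs 1, 2, 5, 8, 9, and 10 have spouses who appear in the Unique_ID column. ID 6 has a spouse whose ID does not appear in the Unique_ID column. So I want to keep IDs 1 (or 2), 3, 4, 5 (or 10), 6, 7, and 8 (or 9). I hope that makes sense.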

Answer 1

My inclination would be to combine the two lists and remove duplicates:

select distinct id
from ((select id
       from t
      ) union all
      (select spouse_id
       from t
       where spouse_id in (select id from t)
      )
     ) t

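However, your question asks for an efficient approach. Another way to think about this is to add a new column that holds the spouse id if it is in the id list, and NULL otherwise (this uses a left outer join). There are then three cases:

There is no spouse id in the list, so use the id.
The id is less than the spouse id. Use it.
The spouse id is less than the id. Discard this record, because the spouse's record is the one being used.

Here is an explicit way of expressing this: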

select IdToUse
from (select t.*, tspouse.id tsid,
             (case when tspouse.id is null then t.id
                   when t.id < tspouse.id then t.id
                   else NULL
              end) as IdToUse
      from t left outer join
           t tspouse
           on t.spouse_id = tspouse.id
     ) t
where IdToUse is not null;
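
As a quick check, the query above can be run against the sample rows from the question. This is only a sketch using Python's built-in sqlite3 module; the table name t and the columns id and spouse_id follow this answer's naming rather than the asker's real schema, and the helper ids_to_use is made up for illustration.

```python
import sqlite3

# Sample rows from the question, as (id, spouse_id) pairs.
ROWS = [(1, 2), (2, 1), (3, None), (4, None), (5, 10),
        (6, 25), (7, None), (8, 9), (9, 8), (10, 5)]

# Same logic as the query above, with an ORDER BY added for stable output.
QUERY = """
select IdToUse
from (select t.id,
             (case when tspouse.id is null then t.id
                   when t.id < tspouse.id then t.id
                   else NULL
              end) as IdToUse
      from t left outer join
           t tspouse
           on t.spouse_id = tspouse.id
     ) x
where IdToUse is not null
order by IdToUse
"""

def ids_to_use(rows):
    """Load rows into an in-memory table t and return one id per couple."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer primary key, spouse_id integer)")
    conn.executemany("insert into t values (?, ?)", rows)
    result = [r[0] for r in conn.execute(QUERY)]
    conn.close()
    return result

if __name__ == "__main__":
    print(ids_to_use(ROWS))  # [1, 3, 4, 5, 6, 7, 8]
```

ID 6 is kept because the join finds no row for spouse 25, and the lower ID of each couple (1, 5, 8) is kept.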

You can simplify this to:

  select t.*, tspouse.id tsid,
         (case when tspouse.id is null then t.id
               when t.id < tspouse.id then t.id
               else NULL
          end) as IdToUse
  from t left outer join
       t tspouse
       on t.spouse_id = tspouse.id
  where tspouse.id is null or
        t.id < tspouse.id
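
The question also mentions pulling a given list of IDs rather than the whole table. One possible adaptation, sketched below under assumptions, is to join against the pulled list instead of against t itself, so a spouse only counts when they are in the same list. The table picked, the function one_id_per_couple, and the example list are hypothetical names and values chosen for illustration; the code uses Python's built-in sqlite3 module.

```python
import sqlite3

# Sample rows from the question, as (id, spouse_id) pairs.
ROWS = [(1, 2), (2, 1), (3, None), (4, None), (5, 10),
        (6, 25), (7, None), (8, 9), (9, 8), (10, 5)]

# picked (hypothetical) holds the pulled list of IDs. A row is kept when
# its spouse is not in picked, or when its own id is the smaller of the two.
QUERY = """
select t.id
from t
     join picked p on p.id = t.id
     left outer join picked ps on ps.id = t.spouse_id
where ps.id is null or t.id < ps.id
order by t.id
"""

def one_id_per_couple(rows, picked_ids):
    """Return one id per couple, looking only at people in picked_ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer primary key, spouse_id integer)")
    conn.executemany("insert into t values (?, ?)", rows)
    conn.execute("create table picked (id integer primary key)")
    conn.executemany("insert into picked values (?)",
                     [(i,) for i in picked_ids])
    result = [r[0] for r in conn.execute(QUERY)]
    conn.close()
    return result

if __name__ == "__main__":
    # 1 and 2 are a couple in the list; the spouses of 9 and 10 are not.
    print(one_id_per_couple(ROWS, [1, 2, 3, 6, 9, 10]))  # [1, 3, 6, 9, 10]
```

Here 9 and 10 are kept even though they have the larger ID in their couple, because their spouses (8 and 5) were not pulled.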

Answer 2

Two tables is simply bad design.
Merge the tables.

select id 
from table 
where id < spouseID
   or spouseID is null

This keeps only one row from each pair whose IDs appear in both columns.
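
To sanity-check this shorter query, here is a small sketch using Python's built-in sqlite3 module. The table name t and the column spouse_id are assumptions standing in for this answer's table and spouseID, and the row (30, 25) is invented for illustration. On the question's sample the query returns the expected IDs, but it appears to rely on every missing spouse having a larger ID: the invented row is dropped even though that person's spouse is not in the table.

```python
import sqlite3

# Sample rows from the question, as (id, spouse_id) pairs.
ROWS = [(1, 2), (2, 1), (3, None), (4, None), (5, 10),
        (6, 25), (7, None), (8, 9), (9, 8), (10, 5)]

# Made-up row: spouse 25 is absent from the id column and smaller than 30.
EXTRA = [(30, 25)]

QUERY = """
select id
from t
where id < spouse_id
   or spouse_id is null
order by id
"""

def kept_ids(rows):
    """Load rows into an in-memory table t and run the short query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer primary key, spouse_id integer)")
    conn.executemany("insert into t values (?, ?)", rows)
    result = [r[0] for r in conn.execute(QUERY)]
    conn.close()
    return result

if __name__ == "__main__":
    print(kept_ids(ROWS))          # [1, 3, 4, 5, 6, 7, 8]
    print(kept_ids(ROWS + EXTRA))  # [1, 3, 4, 5, 6, 7, 8] (30 is dropped)
```

If spouses missing from the table can have smaller IDs, the left outer join approach handles that case, since it checks whether the spouse row actually exists.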
