SQL procedure: delete duplicates and reassign foreign key references

Time: 2017-01-31 15:00:36

Tags: sql sql-server

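I work for a company that runs ranked competitions.

Unfortunately, their Member table has no unique constraint on email, and some users created a new account, with the same email address, for every race or team they joined.

I want to put a unique constraint on the column to prevent any future duplicates, but...

Question: How can I remove the duplicates with a single query, without losing the data related to them?

I think it involves updating all foreign keys to point at one instance of the user and then deleting the duplicates.

Clarification: In the example below, the rows marked with * refer to the duplicate members with IDs 03, 04, 05 and 06. In this case, the solution would be:

  1. Change the foreign key references to IDs 03 and 05 to 01.
  2. Change the foreign key references to IDs 04 and 06 to 02.
  3. Delete the duplicate members with IDs 03, 04, 05 and 06.

But how is this done in MSSQL?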

    Member table
    ID | Username | Gender | Email
    01 | User1    | Male   | fake@fu.bar
    02 | User2    | Female | alsofake@fu.bar
    *03 | User3    | Male   | fake@fu.bar
    *04 | User4    | Female | alsofake@fu.bar
    *05 | User5    | Male   | fake@fu.bar
    *06 | User6    | Female | alsofake@fu.bar
    
    
    MemberToTeam table
    MemberID_fk | TeamID_fk
    01          | 01
    02          | 01
    *03          | 02
    *04          | 02
    *05          | 03
    *06          | 03
    
    RaceRank table
    RaceID_fk | MemberID_fk | Ranking
    01        | 01          | 12
    01        | 02          | 1
    *02        | 03          | 5
    *02        | 04          | 7
    *03        | 05          | 4
    *03        | 06          | 9
    

Thanks for your help.

3 answers:

Answer 0 (score: 2)

This does it in a single query. Repeat it for the other table (RaceRank).

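    -- Emails that occur on more than one member
    with FAKES as
    (
        select Email
        from Member
        group by Email
        having count(id) > 1
    ),
    -- Number the members sharing each duplicated email; c_id = 1 is the lowest ID
    FAKE_ID as
    (
        select id, email, row_number() over(partition by email order by id) as c_id
        from Member
        where email in (select Email from FAKES)
    ),
    -- Map every duplicate ID (c_id > 1) to the ID that is kept (c_id = 1)
    DEDUP as
    (
        select fi.id, f2.id as val_id
        from FAKE_ID fi
        inner join FAKE_ID f2
            on fi.email = f2.email
        where fi.c_id > 1
          and f2.c_id = 1
    )
    -- Repoint the foreign keys from the duplicates to the kept member
    update mt
    set mt.MemberID_fk = dd.val_id
    from MemberToTeam mt
    inner join DEDUP dd
        on dd.id = mt.MemberID_fk;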

Tested here.
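
Since the same mapping is needed for every referencing table, one possible variation is to materialize it once and reuse it. The following is only a sketch under assumptions: `#MemberMap`, `dup_id` and `keep_id` are hypothetical names not present in the question, and the lowest ID per email is assumed to be the account to keep.

```sql
-- Build the duplicate -> kept-member mapping once.
-- #MemberMap, dup_id and keep_id are illustrative names (assumptions).
select m.ID as dup_id, k.keep_id
into #MemberMap
from Member m
inner join (
    select Email, min(ID) as keep_id   -- assumption: keep the lowest ID per email
    from Member
    group by Email
    having count(*) > 1
) k on k.Email = m.Email
where m.ID <> k.keep_id;

-- Repoint MemberToTeam at the kept member.
update mt
set mt.MemberID_fk = mm.keep_id
from MemberToTeam mt
inner join #MemberMap mm on mm.dup_id = mt.MemberID_fk;

-- Repoint RaceRank the same way.
update rr
set rr.MemberID_fk = mm.keep_id
from RaceRank rr
inner join #MemberMap mm on mm.dup_id = rr.MemberID_fk;
```

If two accounts of the same person had joined the same team or race, these updates would leave duplicate rows in `MemberToTeam` or `RaceRank`, so those tables may need their own cleanup before any key on those pairs will hold.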

Answer 1 (score: 2)

This code will solve the problem:

    -- MemberToTeam
    ;with cte_dupes as
    (
        select ID, Email,
            -- order by ID so that rn = 1 is always the lowest ID for each email
            row_number() over (partition by Email order by ID) rn
        from Member
    )
    update mt
        set MemberID_fk = (select first_row.ID from cte_dupes first_row
                           where first_row.rn = 1 and first_row.Email = m.Email)
    from MemberToTeam mt
    inner join Member m on m.ID = mt.MemberID_fk
    inner join cte_dupes cte on cte.ID = mt.MemberID_fk and cte.rn > 1;


    -- RaceRank
    ;with cte_dupes as
    (
        select ID, Email,
            -- order by ID so that rn = 1 is always the lowest ID for each email
            row_number() over (partition by Email order by ID) rn
        from Member
    )
    update r
        set MemberID_fk = (select first_row.ID from cte_dupes first_row
                           where first_row.rn = 1 and first_row.Email = m.Email)
    from RaceRank r
    inner join Member m on m.ID = r.MemberID_fk
    inner join cte_dupes cte on cte.ID = r.MemberID_fk and cte.rn > 1;
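
The updates only repoint the foreign keys; the duplicate members themselves still have to be deleted before the unique constraint from the question can be added. A minimal sketch of those last steps might look like the following. It assumes the lowest ID per email is the one kept and that all references have already been repointed; the constraint name `UQ_Member_Email` is made up for illustration.

```sql
set xact_abort on;   -- roll the whole transaction back on a run-time error

begin transaction;

-- Delete every member that is not the lowest-ID row for its email.
-- Rows without an email are left alone.
with ranked as
(
    select ID,
           row_number() over (partition by Email order by ID) as rn
    from Member
    where Email is not null
)
delete from ranked
where rn > 1;

-- Prevent future duplicates (UQ_Member_Email is an assumed name).
alter table Member
    add constraint UQ_Member_Email unique (Email);

commit transaction;
```

If foreign key constraints are declared on the referencing tables, the delete should fail rather than orphan rows when a reference was missed. Note also that a SQL Server unique constraint allows only one NULL, so the `alter table` step would fail if several members have no email.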

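Answer 2 (score: 0)

You will probably have to update every other table that links to the Member table through a foreign key.

Among all the records that share the same email address, you can pick the single record you want to keep in the Member table, and then update the linked tables with a query like this: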

    update myreferencetable
    set memberid = [the single instance of the member]
    where memberid in (select id from member
                       where email = [email address with duplicates]);
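
To find out which tables need this treatment, one option is to ask the catalog views which foreign keys point at the Member table. This is a sketch that assumes the table lives in the `dbo` schema and that the relationships are declared as real foreign key constraints; columns that merely hold member IDs by convention will not show up.

```sql
-- List every column that references dbo.Member through a declared foreign key.
select
    object_name(fkc.parent_object_id)                    as referencing_table,
    col_name(fkc.parent_object_id, fkc.parent_column_id) as referencing_column,
    fk.name                                              as constraint_name
from sys.foreign_keys fk
inner join sys.foreign_key_columns fkc
    on fkc.constraint_object_id = fk.object_id
where fk.referenced_object_id = object_id('dbo.Member');   -- assumed schema
```

Each row returned names one table and column that would need the same kind of update before the duplicate members can be deleted.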