合并表中的记录并更新相关表

时间:2018-09-20 20:28:25

标签: sql-server tsql

我有一个主表,其中包含链接到各种其他表的用户。有时由于错误的导入数据,该主表中存在重复项,我想将它们合并。请参阅下表。

表:用户

UserID    Username    FirstName    LastName
1         Main        John         Doe
2         Duplicate   John         Doo

表:记录1

RecordID  RecordName      CreatedUserID   UpdatedUserID
1         Test record 1   1               2
2         Test record 2   2               null
3         Test record 3   2               null

CreatedUserID和UpdatedUserID是Users.UserID的外部列。
因此,当前,如果我要合并用户1和2,可以使用以下SQL语句来完成此操作:

UPDATE Records1 SET UpdatedUserID = 1 WHERE UpdatedUserID = 2
UPDATE Records1 SET CreatedUserID = 1 WHERE CreatedUserID = 2
DELETE FROM Users WHERE UserID = 2

这只是一个示例子集,但实际上,有很多相关记录表,我必须为其添加其他SQL-Update语句。

我知道我可能会在这里碰运气,但是也许有一种方法可以完成上述操作(批量更新所有相关表并删除“重复”记录),而不是更新每个外部字段和每个相关表手动。用户表基本上是链接到所有其他表的基础表,因此为每个表创建单独的语句相当麻烦,因此,如果有快捷方式可用,那就太好了。

2 个答案:

答案 0 :(得分:0)

这有帮助吗??

    Create Table Users(Id int, UserName varchar(10),FirstName varchar(10), LastName Varchar(10))
    Create Table Records1(RecordID int,  RecordName varchar(20), CreatedUserID int,   UpdatedUserID int)


    INSERT INTO Users
    SELECT 1,'Main','John','Doe' Union All
    SELECT 2,'Duplicate','John','Doo' Union All

    SELECT 3,'Main3','ABC','MPN' Union All
    SELECT 4,'Duplicate','ABC','MPT' 


    Insert into Records1
    SELECT 1,'Test record 1',1,2    Union All
    SELECT 2,'Test record 2',2,null Union All
    SELECT 3,'Test record 3',2,null Union All

    SELECT 1,'Test record 1',3,4    Union All
    SELECT 2,'Test record 2',4,null Union All
    SELECT 3,'Test record 3',4,null 

    Select u1.Id as CreatedUserID,U2.id as UpdatedUserID
    Into #tmpUsers 
    from Users u1
    JOIN Users u2 
    --This Conidition Should be changed based on the criteria for identifying Duplicates
    on u1.FirstName=u2.FirstName and U2.UserName='Duplicate'
    Where u1.UserName<>'Duplicate'


    Update r
    Set r.UpdatedUserID=u.CreatedUserID
    From Records1 r
    JOIN #tmpUsers u on r.CreatedUserID=u.CreatedUserID

    Update r
    Set r.CreatedUserID=u.CreatedUserID
    From Records1 r
    JOIN #tmpUsers u on r.CreatedUserID=u.UpdatedUserID

    Delete from Users Where UserName='Duplicate'

    Select * from Users
    Select * from Records1


    Drop Table #tmpUsers

答案 1 :(得分:0)

由于识别重复帐户的过程将是手动的,因此(通常)将有成对的帐户要处理。 (我假设Inspector不能在您的UI中剔除15个用户帐户作为重复项,而是将全部批次提交进行处理。)

如下所示的存储过程可能是一个好的开始:

create procedure MergeUsers
  @RetainedUserId Int, -- UserId   that is being kept.
  @VictimUserId Int -- UserId   that is to be removed.

as
  begin

  -- Validate the input.
  --   Optional, but you may want some reality checks.
  --   (Usernames are probably unique already, eh?)
  declare @UsernameMatch as Int, @FirstNameMatch as Int, @LastNameMatch as Int, @EmailMatch as Int;
  select
    @UsernameMatch = case when R.Username = V.Username then 1 else 0 end,
    @FirstNameMatch = case when R.FirstName = V.FirstName then 1 else 0 end,
    @LastNameMatch = case when R.LastName = V.LastName then 1 else 0 end,
    @EmailMatch = case when R.Email= V.Emailthen 1 else 0 end
    from Users as R inner join
      Users as V on V.UserId = @VictimUserId and R.UserId = @RetainedUserId;
  if @UsernameMatch + @FirstNameMatch + @LastNameMatch + @EmailMatch < 2
    begin
    -- The following message should be enhanced to provide a better clue as to which user
    --   accounts are being processed and what did or didn't match.
    RaIsError( 'MergeUsers: The two user accounts should have something in common.', 25, 42 );
    return;
    end;

  -- Update all of the related tables.
  --   Using a single pass through each table and updating all of the appropriate columns may improve performance.
  --   The   case   expression will only alter the values which reference the victim user account.
  update Records1
    set
      CreatedUserId = case when CreatedUserId  = @VictimId then @RetainedUserId else CreatedUserId end,
      UpdatedUserId = case when UpdatedUserId = @VictimId then @RetainedUserId else UpdatedUserId end
    where CreatedUserId = @VictimUserId or UpdatedUserId = @VictimUserId;

  update Records2
    set ...
    where ...;

  -- Houseclean   Users .
  delete from Users
    where UserId = @VictimUserId;

  end;

注意事项:在练习中向左添加try / catch和SP中的事务以确保合并是全有或全无操作。