SQL Server: deduplicating rows with a status priority

Time: 2018-04-16 16:05:03

Tags: sql-server duplicates

I have a table named Transactions that looks like this:

    name    status
------------------------
    harry   available
    harry   expired
    harry   cancelled
    barry   available
    sally   expired
    sally   available
    jane    available
    jane    used
    nelly   available

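I want to dedupe the table, giving rows whose status is 'expired', 'cancelled' or 'used' priority over rows that are 'available'. So after deduping, the table would look like this: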

    name    status
-------------------------
    harry   expired
    harry   cancelled
    barry   available
    sally   expired
    jane    used
    nelly   available

How would I do this?

Thanks in advance.

Edit:

I tried creating two tables and then joining them like this:

/****** remove available ******/
WITH CTE AS(SELECT [id]
      ,[name]
      ,[status]
      ,RN = ROW_NUMBER()OVER(PARTITION BY [id]
      ,[name])
  FROM [transactions])
  DELETE FROM CTE WHERE RN > 1 AND [status] not in('used','cancelled','expired')

/****** available only  ******/
WITH CTE AS(SELECT [id]
      ,[name]
      ,[status]
      ,RN = ROW_NUMBER()OVER(PARTITION BY [id]
      ,[name])
  FROM [transactions_2])
  DELETE FROM CTE WHERE RN > 1 AND [status] not in('used','cancelled','expired')

Then I would union [transactions] and [transactions_2], but that didn't work...

2 answers:

Answer 0 (score: 1)

With a window function

This is the select, not the delete:

declare @T table (name varchar(10), avail varchar(10));
insert into @T values 
    ('harry', 'available')
  , ('harry', 'expired')
  , ('harry', 'cancelled')
  , ('barry', 'available')
  , ('sally', 'expired')
  , ('sally', 'available')
  , ('jane', 'available')
  , ('jane', 'used')
  , ('nelly', 'available');

select * 
from ( select * 
            , count(*) over (partition by t.name) as cnt
       from @T t
     ) tt
where tt.avail in ('expired', 'cancelled', 'used')
   or tt.cnt = 1 
order by tt.name, tt.avail;

The delete:

delete tt
from ( select * 
            , count(*) over (partition by t.name) as cnt
       from @T t
     ) tt
where tt.avail not in ('expired', 'cancelled', 'used')
  and tt.cnt > 1;

select * 
from @T 
order by name, avail;

Answer 1 (score: 1)

You can try doing it in two steps.

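  1. Delete the "duplicates" that have the same status and name
  2. Delete the "available" records for which a record with another status exists

    -- Step 1: among rows sharing name and status, keep only the one with the highest id
    delete t1 from transactions t1 where exists (select 1 from transactions t2 where t2.name = t1.name and t2.status = t1.status and t2.id > t1.id)
    -- Step 2: drop 'available' rows when the same name has a priority status
    delete t1 from transactions t1 where t1.status = 'available' and exists (select 1 from transactions t2 where t2.name = t1.name and t2.status in ( 'expired', 'cancelled' , 'used') )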
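
To try the two steps without touching the real table, here is a minimal self-contained sketch. It assumes an integer `id` column that uniquely identifies each row (the question's own attempt selects `[id]`, but its definition is not shown), and it uses a table variable named `@transactions` as a stand-in. The target alias is written as `delete t1 from ... t1`, which is the form SQL Server accepts when the table being deleted from needs an alias.

```sql
-- Stand-in for the real table; the identity id column is an assumption.
declare @transactions table (
    id     int identity(1,1) primary key,
    name   varchar(10),
    status varchar(10)
);

insert into @transactions (name, status) values
    ('harry', 'available')
  , ('harry', 'expired')
  , ('harry', 'expired')     -- exact duplicate, added to exercise step 1
  , ('harry', 'cancelled')
  , ('barry', 'available')
  , ('sally', 'expired')
  , ('sally', 'available')
  , ('jane', 'available')
  , ('jane', 'used')
  , ('nelly', 'available');

-- Step 1: for rows with the same name and status, keep the highest id.
delete t1
from @transactions t1
where exists (select 1
              from @transactions t2
              where t2.name = t1.name
                and t2.status = t1.status
                and t2.id > t1.id);

-- Step 2: remove 'available' rows whose name also has a priority status.
delete t1
from @transactions t1
where t1.status = 'available'
  and exists (select 1
              from @transactions t2
              where t2.name = t1.name
                and t2.status in ('expired', 'cancelled', 'used'));

select name, status
from @transactions
order by name, status;
```

Running step 1 first means step 2 only has to reason about distinct (name, status) pairs; the two statements are independent, though, so either order should give the same final rows.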