Delete duplicate rows from SQL Server (based on values in multiple columns)

Time: 2018-01-27 22:53:48

Tags: sql sql-server

+--------------+------------+-------------------+------------+
| id           | DATA_ID    |   SourceID        | DateAdded  |
+--------------+------------+-------------------+------------+
|           1  | 304019032  | 1                 |  2018-01-20|
|           2  | 345583556  | 1                 |  2018-01-21|
|           3  | 299951717  | 3                 |  2018-01-23|
|           4  | 304019032  | 2                 |  2018-01-24|
|           5  | 298519282  | 2                 |  2018-01-24|
|           6  | 299951717  | 3                 |  2018-01-27|
|           7  | 345583556  | 1                 |  2018-01-27|
+--------------+------------+-------------------+------------+

I am trying to delete duplicate rows that share the same DATA_ID and SourceID, keeping only the row with the latest DateAdded in each group.

2 Answers:

Answer 0: (score: 2)

In SQL Server, I like to use row_number() and an updatable CTE:

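-- t stands for your table; rows are numbered newest-first within each (sourceid, data_id) group
with todelete as (
      select t.*,
             row_number() over (partition by sourceid, data_id order by dateadded desc) as seqnum
      from t
     )
-- seqnum = 1 is the latest row of each group; every other row is a duplicate
delete from todelete
    where seqnum > 1;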

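This works even if sourceid / data_id are NULL (which is probably not your case), because partition by puts NULL values into the same group. It also does not assume that dateadded and id increase together (although that may be a reasonable assumption).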
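To see the effect on the sample data from the question, the statement can be tried on a throwaway copy first. This is only a sketch: the temp table name #t and the column types are assumptions, since the question does not show the table definition.

```sql
-- Hypothetical temp table mirroring the sample rows from the question.
create table #t (
    id        int  primary key,
    DATA_ID   int  not null,
    SourceID  int  not null,
    DateAdded date not null
);

insert into #t (id, DATA_ID, SourceID, DateAdded) values
    (1, 304019032, 1, '2018-01-20'),
    (2, 345583556, 1, '2018-01-21'),
    (3, 299951717, 3, '2018-01-23'),
    (4, 304019032, 2, '2018-01-24'),
    (5, 298519282, 2, '2018-01-24'),
    (6, 299951717, 3, '2018-01-27'),
    (7, 345583556, 1, '2018-01-27');

-- Number rows newest-first within each (SourceID, DATA_ID) group
-- and delete everything except the first row of each group.
with todelete as (
    select t.*,
           row_number() over (partition by SourceID, DATA_ID order by DateAdded desc) as seqnum
    from #t as t
)
delete from todelete
where seqnum > 1;

-- ids 2 and 3 are the older duplicates; ids 1, 4, 5, 6, 7 remain.
select id, DATA_ID, SourceID, DateAdded
from #t
order by id;

drop table #t;
```

Replacing the delete with `select * from todelete where seqnum > 1` previews the rows that would be removed without changing anything.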

Answer 1: (score: 0)

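-- Keeps the row with the highest id in each (data_id, sourceID) group.
-- This matches "latest date" only if id increases together with DateAdded.
delete from your_table
where id not in 
(
  select max(id)
  from your_table
  group by data_id, sourceID
);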
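If id is not guaranteed to grow with DateAdded, a variant along the following lines compares the dates directly instead. It is a sketch under two assumptions: the DateAdded column name is taken from the question's table, and data_id and sourceID contain no NULLs (the = comparison would not match them).

```sql
-- Delete a row when a newer row exists in the same (data_id, sourceID) group.
-- Rows with the same DateAdded are tie-broken by id, so exactly one row per group survives.
delete t1
from your_table as t1
where exists (
    select 1
    from your_table as t2
    where t2.data_id  = t1.data_id
      and t2.sourceID = t1.sourceID
      and (   t2.DateAdded > t1.DateAdded
           or (t2.DateAdded = t1.DateAdded and t2.id > t1.id))
);
```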