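I have been fighting with this for a while and have run out of ideas. The data has to stay at row level.
I want to keep only the data that arrived earliest; duplicate rows within that load are valid. Load1 is the batch ID. Not every value has duplicates.
I want to return: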
Code1  Code2  Code3  Load1  LoadTime
a1     a1     a1     1      2013-09-10
a1     a1     a1     1      2013-09-10
a1     a1     a1     1      2013-09-10
a2     a1     a1     2      2013-09-12
a1     a2     a1     3      2013-09-13
a1     a2     a1     3      2013-09-13
Any suggestions?
CREATE TABLE #Test (
    Code1    varchar(10),
    Code2    varchar(10),
    Code3    varchar(10),
    Load1    varchar(10),  -- batch ID
    LoadTime date
);
INSERT INTO #Test (Code1, Code2, Code3, Load1, LoadTime)
VALUES
    ('a1','a1','a1','1','2013-09-10'),  -- Keep
    ('a1','a1','a1','1','2013-09-10'),  -- Keep
    ('a1','a1','a1','1','2013-09-10'),  -- Keep
    ('a1','a1','a1','2','2013-09-11'),  -- Delete
    ('a2','a1','a1','2','2013-09-12'),  -- Keep
    ('a2','a1','a1','3','2013-09-13'),  -- Delete
    ('a1','a2','a1','3','2013-09-13'),  -- Keep
    ('a1','a2','a1','3','2013-09-13'),  -- Keep
    ('a1','a2','a1','4','2013-09-13'),  -- Delete
    ('a1','a2','a1','4','2013-09-13');  -- Delete
Answer 0 (score: 0)
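You may want to look at row_number() or dense_rank().
It is hard to tell from the sample data what the logic for deleting or keeping a row is, but something like: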
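;with cte as (
    select *,
           -- Rank the loads within each code combination; the earliest load gets rn = 1.
           -- Load1 is added as a tie-breaker because loads 3 and 4 in the sample share a LoadTime.
           dense_rank() over (partition by Code1, Code2, Code3 order by LoadTime, Load1) as rn
    from #Test
)
delete t
from #Test t
inner join cte
    on  t.Code1 = cte.Code1
    and t.Code2 = cte.Code2
    and t.Code3 = cte.Code3
    and t.Load1 = cte.Load1
    and t.LoadTime = cte.LoadTime
where cte.rn > 1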
(If your data had a unique ID, the join would be simpler.)
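With a surrogate key, the join collapses to a single column. A minimal sketch, assuming a hypothetical identity column named Id is added to #Test (it is not part of the original table), and assuming Load1 can act as a tie-breaker when two loads share a LoadTime:

```sql
-- Hypothetical: add a surrogate key so each row can be addressed individually.
ALTER TABLE #Test ADD Id int IDENTITY(1,1);
GO  -- batch separator (SSMS/sqlcmd), so the new column is visible to the next statement

;WITH cte AS (
    SELECT Id,
           -- Load1 is varchar(10) in the question, so it is compared as a string here.
           DENSE_RANK() OVER (PARTITION BY Code1, Code2, Code3
                              ORDER BY LoadTime, Load1) AS rn
    FROM #Test
)
DELETE t
FROM #Test AS t
INNER JOIN cte
    ON cte.Id = t.Id      -- one-column join instead of matching every column
WHERE cte.rn > 1;
```

Joining on Id also avoids the problem that an equality join on every column never matches rows where one of those columns is NULL.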
Answer 1 (score: 0)
You can use a SQL Server common table expression (CTE):
with cte as (
    select
        dense_rank() over (partition by Code1, Code2, Code3 order by LoadTime, Load1 asc) as rn
    from Table1
)
delete from cte where rn > 1
sql fiddle demo
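This query is actually quite simple in SQL Server, because SQL Server treats a simple common table expression as an updatable view: you do not have to join the CTE back to the original table, you can just delete from cte.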
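Applied to the #Test table from the question, the same idea might look like the sketch below. It assumes the "earliest" load within each (Code1, Code2, Code3) combination is the one with the lowest LoadTime and, on ties, the lowest Load1. Because Load1 is declared as varchar(10), the sketch casts it to int, which is only safe if batch IDs are always numeric (an assumption, not something the question states).

```sql
-- Preview first: rows with rn > 1 are the ones the DELETE below would remove.
SELECT Code1, Code2, Code3, Load1, LoadTime,
       DENSE_RANK() OVER (PARTITION BY Code1, Code2, Code3
                          ORDER BY LoadTime, CAST(Load1 AS int)) AS rn
FROM #Test
ORDER BY Code1, Code2, Code3, rn;

-- Delete through the CTE; SQL Server applies the delete to #Test itself.
WITH cte AS (
    SELECT DENSE_RANK() OVER (PARTITION BY Code1, Code2, Code3
                              ORDER BY LoadTime, CAST(Load1 AS int)) AS rn
    FROM #Test
)
DELETE FROM cte
WHERE rn > 1;
```

dense_rank() rather than row_number() is what keeps the valid duplicates: identical rows in the same load tie on the ORDER BY columns, so they all receive rn = 1 and survive. On the sample data this should leave the six rows marked Keep.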