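I'm trying to write a query that removes duplicate records from the table below (valid_columns), keeping only the record with the lowest [order] number (col_order).
For example, in the table below I want to delete the duplicate rows region 2, region 3 and job 3, keeping the record with the lowest [order] for each name.
E.g., the input table, valid_columns, looks like this: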
name col_order
-------------
job 1
job 3
status 2
cust 2
county 1
state 1
region 1
region 2
region 3
so 4
Desired output:
name col_order
-------------
job 1
status 2
cust 2
county 1
state 1
region 1
so 4
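I'm trying to fix a bug, but I can't work out the SQL. It currently uses a DELETE statement with a subquery. The query in use right now is:
-- 3) Remove duplicate columns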
DELETE
FROM valid_columns
WHERE NOT ( col_order = ( SELECT TOP 1 col_order
FROM valid_columns firstValid
WHERE name = firstValid.name
AND col_order = firstValid.col_order
ORDER BY col_order ASC ))
However, this leaves only the following rows, which is incorrect:
name col_order
-------------
job 1
county 1
state 1
region 1
Thanks very much.
Answer 0 (score: 1)
-- Test table
declare @T table(Name varchar(10), col_order int)
-- Sample data
insert into @T
select 'job', 1 union all
select 'job', 3 union all
select 'status', 2 union all
select 'cust', 2 union all
select 'county', 1 union all
select 'state', 1 union all
select 'region', 1 union all
select 'region', 2 union all
select 'region', 3 union all
select 'so', 4
-- Delete using CTE and row_number()
;with cte as
(
select row_number() over(partition by Name order by col_order) as rn
from @T
)
delete from cte
where rn > 1
-- Result
select *
from @T
Or use a subquery instead of a CTE:
delete vc
from (select row_number() over(partition by Name order by col_order) as rn
from valid_columns) as vc
where vc.rn > 1
Answer 1 (score: 1)
DELETE FROM t1
FROM valid_columns t1
WHERE col_order >
(SELECT MIN(col_order) from valid_columns t2 WHERE t1.name = t2.name)
Edit: this can be simplified to:
DELETE FROM valid_columns
WHERE col_order >
(SELECT MIN(col_order) from valid_columns t2 WHERE valid_columns.name = t2.name)
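A DELETE statement can take a FROM clause to delete records based on the values of related records in a second table. In this case the FROM isn't actually needed (I sometimes use FROM just to alias the table name, because I don't like the extra typing).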
DELETE FROM TableA
FROM TableA
JOIN TableB On TableA.CriteriaA = TableB.CriteriaA
You could also try this version (it may be faster, if that matters for your case):
DELETE FROM valid_columns
WHERE EXISTS
(SELECT * FROM valid_columns t1
WHERE t1.name = valid_columns.name AND valid_columns.col_order > t1.col_order);
Answer 2 (score: 0)
Try this (you can replace the DELETE with a SELECT to make sure you get the right results before deleting).
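-- T-SQL does not accept an alias directly after DELETE FROM <table>,
-- so the alias is named first and declared in the second FROM.
DELETE t1
FROM [valid_columns] t1
WHERE t1.col_order > (SELECT MIN(t2.col_order) FROM [valid_columns] t2
WHERE t1.name = t2.name)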
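The check suggested above (run a SELECT before the DELETE) could look roughly like the sketch below. It assumes the valid_columns table from the question and only reads data; the column list and ORDER BY are just there to make the preview easy to scan.

```sql
-- Preview only: lists the rows the DELETE would remove; nothing is modified.
SELECT t1.name, t1.col_order
FROM valid_columns AS t1
WHERE t1.col_order > (SELECT MIN(t2.col_order)
                      FROM valid_columns AS t2
                      WHERE t2.name = t1.name)
ORDER BY t1.name, t1.col_order;
```

With the sample data from the question this should list job 3, region 2 and region 3. If that matches what you expect to lose, swap the SELECT for the DELETE.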
Answer 3 (score: 0)
This should do what you need:
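-- Keep the lowest col_order per name: delete every row above that minimum.
-- (Comparing against MAX would keep the highest value instead.)
DELETE a
FROM valid_columns a
WHERE (SELECT MIN(b.col_order)
FROM valid_columns b
WHERE a.name = b.name) < a.col_order;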
I recommend backing up your data before testing.
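One way to do that in SQL Server, as a rough sketch: copy the table first, or trial-run the delete inside a transaction and roll it back. The name valid_columns_backup is hypothetical and is assumed not to exist yet.

```sql
-- Option 1: copy the table before touching it.
-- Assumption: valid_columns_backup is a made-up name that does not exist yet.
SELECT *
INTO valid_columns_backup
FROM valid_columns;

-- Option 2: trial-run the delete and undo it.
BEGIN TRANSACTION;

DELETE FROM valid_columns
WHERE col_order > (SELECT MIN(t2.col_order)
                   FROM valid_columns t2
                   WHERE valid_columns.name = t2.name);

-- Inspect what would be left.
SELECT name, col_order
FROM valid_columns
ORDER BY name;

-- Nothing is kept; use COMMIT TRANSACTION instead once the result looks right.
ROLLBACK TRANSACTION;
```

Note that SELECT ... INTO copies the rows and column definitions but not indexes or constraints, so treat the copy as a data backup only.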
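Answer 4 (score: 0)
Alternatively, you could use a cursor to walk the table and insert the first value encountered for each item into a temp table (make sure the temp table has a unique constraint on the name column).
Edit: added a code snippet for convenience.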
declare @Ti table(name varchar(10), col_order int);
declare @Tf table(name varchar(10) unique not null, col_order int not null);
declare @name varchar(10);
declare @col_order int;
-- Sample data
insert into @Ti
select 'job', 1 union all
select 'job', 3 union all
select 'status', 2 union all
select 'cust', 2 union all
select 'county', 1 union all
select 'state', 1 union all
select 'region', 1 union all
select 'region', 2 union all
select 'region', 3 union all
select 'so', 4
select * from @Ti
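-- Read rows ordered by col_order so the first row seen for each name is its lowest col_order
declare i cursor for
select name, col_order from @Ti order by name, col_order;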
open i;
fetch next from i into @name, @col_order;
while @@FETCH_STATUS = 0
begin
if not exists( select * from @Tf where name = @name )
begin
insert into @Tf(name, col_order)
select @name, @col_order;
end
fetch next from i into @name, @col_order;
end
close i;
deallocate i;
select * from @Tf;
Answer 5 (score: 0)
Delete the records using a binary checksum (this works on any SQL Server version):
CREATE TABLE #t1(ID INT NULL, VALUE VARCHAR(2))
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (2,'bb')
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (1,'aa')
INSERT INTO #t1(ID, VALUE) VALUES (3,'cc')
INSERT INTO #t1(ID, VALUE) VALUES (3,'cc')
GO

-- BINARY_CHECKSUM(): are columns that we want to compare duplicates for
-- if you want to compare the full row then change BINARY_CHECKSUM() -> BINARY_CHECKSUM(*)

-- for SQL Server 2000+ a loop
-- save checksums and rowcounts for duplicates
SELECT BINARY_CHECKSUM(ID, VALUE) AS ChkSum, COUNT(*) AS Cnt
INTO #t2
FROM #t1
GROUP BY BINARY_CHECKSUM(ID, VALUE) HAVING COUNT(*)>1

DECLARE @ChkSum BIGINT, @rc INT
-- get the first checksum and set the rowcount to the count - 1
-- because we want to leave one duplicate
SELECT TOP 1 @ChkSum = ChkSum, @rc = Cnt-1 FROM #t2

WHILE EXISTS (SELECT * FROM #t2)
BEGIN
    -- rowcount is one less than the duplicate rows count
    SET ROWCOUNT @rc
    DELETE FROM #t1 WHERE BINARY_CHECKSUM(ID, VALUE) = @ChkSum
    -- remove the processed duplicate from the checksum table
    DELETE #t2 WHERE ChkSum = @ChkSum
    -- select the next duplicate rows to delete
    SELECT TOP 1 @ChkSum = ChkSum, @rc = Cnt-1 FROM #t2
END
SET ROWCOUNT 0
GO

SELECT * FROM #t1

-- for SQL Server 2005+ a cool CTE
;WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY ChkSum ORDER BY ChkSum) AS RN, *
    FROM
    (
        SELECT BINARY_CHECKSUM(ID, VALUE) AS ChkSum
        FROM #t1
    ) t
)
DELETE FROM Numbered WHERE RN > 1;
GO

SELECT * FROM #t1

DROP TABLE #t1;
DROP TABLE #t2;