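I'm new to SQL Server and I'm trying to delete duplicates from a table, but only under certain conditions. My question is how to apply those conditions in the query.
I need to delete duplicates from the Users table, for example: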
Id  Code  Name  SysName
-----------------------------
1   D1    N1
2   D1
3   D1    N1    N-1
4   E2    N2
5   E2    N2
6   E2    N2
7   X3
8   X3          N-3
9
10
11  Z4    W2    N-4-4
12  Z4    W2    N-44
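In the table above: for code D1, I want to keep Id = 3, where all columns are filled (Code, Name and SysName), and delete Id = 1 and Id = 2.
For code E2, I want to keep any one of the rows and delete the other two duplicates.
For code X3, keep the row with SysName = N-3.
For Id = 9 and Id = 10 (empty code, everything is empty), delete them all.
For code Z4, delete Id = 11 and keep the row with SysName N-44.
Finally, another table has an FK to Users, so I think I first need to get all the Ids from Users, then delete those Ids from the dependent table, and finally delete them from the Users table.
Do you know how to achieve this? I'm not asking for a complete solution; a code skeleton or a similar example/scenario would do, and any advice is welcome.
Edit:
To summarize, I have the Users table: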
Id  Code  Name  SysName
-----------------------------
1   D1    N1
2   D1
3   D1    N1    N-1
4   E2    N2
5   E2    N2
6   E2    N2
7   X3
8   X3          N-3
9
10
11  Z4    W2    N-4-4
12  Z4    W2    N-44
I only want to keep:
Id  Code  Name  SysName
-----------------------------
3   D1    N1    N-1
4   E2    N2
8   X3          N-3
12  Z4    W2    N-44
Answer 0: (score: 2)
Are you looking for something like this?
SELECT Code,
MAX(ISNULL(Name, '')) Name,
MAX(ISNULL(SysName, '')) SysName
FROM T
WHERE Code IS NOT NULL
GROUP BY Code;
Returns:
+------+------+---------+
| Code | Name | SysName |
+------+------+---------+
| D1 | N1 | N-1 |
| E2 | N2 | |
| X3 | | N-3 |
| Z4 | W2 | N-4-4 |
+------+------+---------+
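Answer 1: (score: 1)
The following query lists the Ids to delete, according to these precedence rules:
1- A user whose fields are all NULL/empty is deleted.
2- Users with more invalid fields are deleted first (for example, SysName must not contain two hyphens).
3- Users with more NULL/empty fields are deleted first.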
;WITH
[Ids]
AS
(
SELECT
[U].[Id]
,[Importance] =
CASE
WHEN [X].[NumberOfFilledFields] = 0
THEN -1
ELSE ROW_NUMBER() OVER (PARTITION BY [U].[Code] ORDER BY [X].[NumberOfInvalidFields], [X].[NumberOfFilledFields] DESC)
END
FROM [Users] AS [U]
CROSS APPLY
(
SELECT
[NumberOfFilledFields] =
+ CASE WHEN NULLIF([U].[Code], '') IS NULL THEN 0 ELSE 1 END
+ CASE WHEN NULLIF([U].[Name], '') IS NULL THEN 0 ELSE 1 END
+ CASE WHEN NULLIF([U].[SysName], '') IS NULL THEN 0 ELSE 1 END
,[NumberOfInvalidFields] =
+ CASE WHEN [U].[SysName] LIKE '%-%-%' THEN 1 ELSE 0 END
) AS [X]
)
SELECT
[Id]
FROM [Ids]
WHERE (1 = 1)
AND ([Importance] = -1 OR [Importance] > 1);
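To try the query above, the question's rows can be loaded into a throwaway table. This is only a sketch: the column types are assumptions, and the empty cells of the question are assumed to be NULL (they could equally be empty strings, which the query also handles through NULLIF).

```sql
-- Throwaway copy of the question's sample data.
-- Assumptions: column types, and NULL for every empty cell.
CREATE TABLE Users (
    Id      INT PRIMARY KEY,
    Code    VARCHAR(10) NULL,
    Name    VARCHAR(10) NULL,
    SysName VARCHAR(10) NULL
);

INSERT INTO Users (Id, Code, Name, SysName) VALUES
    (1,  'D1', 'N1', NULL),
    (2,  'D1', NULL, NULL),
    (3,  'D1', 'N1', 'N-1'),
    (4,  'E2', 'N2', NULL),
    (5,  'E2', 'N2', NULL),
    (6,  'E2', 'N2', NULL),
    (7,  'X3', NULL, NULL),
    (8,  'X3', NULL, 'N-3'),
    (9,  NULL, NULL, NULL),
    (10, NULL, NULL, NULL),
    (11, 'Z4', 'W2', 'N-4-4'),
    (12, 'Z4', 'W2', 'N-44');
```

On this data the query should list Ids 1, 2, 7, 9, 10 and 11, plus two of the three E2 rows. Which E2 row survives is not deterministic, because the three rows tie on both ordering columns.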
Answer 2: (score: 1)
DEMO
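(To anyone else answering: feel free to borrow the demo to test your answer, or use it in your answer. No need to duplicate the effort!)
We can use an analytic/window function such as row_number() to number the records within each code, keep every row numbered 1 except those whose code is NULL, do this in a CTE, and then delete from it.
We decide what to keep by looking at which record has the most data; on a tie, the lowest Id wins.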
With cte as (
SELECT id, code, name, sysname,
row_number() over (partition by code order by (case when name is not null then 1 else 0 end + case when sysname is not null then 1 else 0 end) desc, ID) RN
FROM users)
Delete from cte where RN <> 1 or code is null;
Result:
+----+----+------+------+---------+
| | ID | Code | Name | Sysname |
+----+----+------+------+---------+
| 1 | 3 | D1 | N1 | N-1 |
| 2 | 4 | E2 | N2 | NULL |
| 3 | 8 | X3 | NULL | N-3 |
| 4 | 11 | Z4 | W2 | N-4-4 |
+----+----+------+------+---------+
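With this ordering, Z4 keeps Id 11 because both rows are equally filled and the lower Id wins, whereas the question asks to keep Id 12. A minimal sketch of one possible adjustment, assuming (as in Answer 1) that a SysName containing two hyphens should rank below one that does not. It only previews the rows that would be deleted:

```sql
-- Preview only: lists the rows that would be deleted; nothing is modified.
-- Assumption: a SysName with two hyphens (e.g. 'N-4-4') ranks below one without.
WITH cte AS (
    SELECT Id, Code, Name, SysName,
           ROW_NUMBER() OVER (
               PARTITION BY Code
               ORDER BY
                   -- rows without a "two hyphen" SysName come first
                   CASE WHEN SysName LIKE '%-%-%' THEN 1 ELSE 0 END,
                   -- then the rows with the most filled columns
                   CASE WHEN NULLIF(Name, '') IS NOT NULL THEN 1 ELSE 0 END
                 + CASE WHEN NULLIF(SysName, '') IS NOT NULL THEN 1 ELSE 0 END DESC,
                   -- the lowest Id breaks any remaining tie
                   Id
           ) AS RN
    FROM Users
)
SELECT Id, Code, Name, SysName
FROM cte
WHERE RN <> 1 OR NULLIF(Code, '') IS NULL
ORDER BY Id;
```

On the question's sample data this should list Ids 1, 2, 5, 6, 7, 9, 10 and 11, leaving 3, 4, 8 and 12. Replacing the final SELECT with `DELETE FROM cte WHERE ...` would turn the preview into the actual delete.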
One could use the CTE to delete the related FK records that are about to be purged, and then use the CTE again to delete the users.
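That two-step delete might look like the sketch below. The dependent table `dbo.UserOrders` and its `UserId` column are hypothetical placeholders, since the question does not name the real table. Instead of evaluating the CTE twice, the sketch stores the Ids in a temp table first so that both deletes act on exactly the same set.

```sql
BEGIN TRANSACTION;

-- 1) Collect the Ids to remove once, using this answer's ranking.
SELECT Id
INTO #ToDelete
FROM (
    SELECT Id, Code,
           ROW_NUMBER() OVER (
               PARTITION BY Code
               ORDER BY CASE WHEN Name IS NOT NULL THEN 1 ELSE 0 END
                      + CASE WHEN SysName IS NOT NULL THEN 1 ELSE 0 END DESC,
                        Id
           ) AS RN
    FROM Users
) AS ranked
WHERE RN <> 1 OR Code IS NULL;

-- 2) Delete the dependent rows first
--    (dbo.UserOrders and UserId are hypothetical names).
DELETE o
FROM dbo.UserOrders AS o
INNER JOIN #ToDelete AS d ON d.Id = o.UserId;

-- 3) Then delete the duplicate users themselves.
DELETE u
FROM Users AS u
INNER JOIN #ToDelete AS d ON d.Id = u.Id;

COMMIT TRANSACTION;

DROP TABLE #ToDelete;
```

If the dependent rows have to survive, they could instead be updated to point at the Id that is kept for the same code; whether that is appropriate depends on requirements the question does not state.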
Answer 3: (score: 1)
This uses window functions and COALESCE:
DECLARE @t TABLE ([Id] INT, [Code] CHAR(2), [Name] CHAR(2), [SysName] VARCHAR(10))
INSERT INTO @t values
(1 , 'D1', 'N1', Null ), (2 , 'D1', Null, Null ), (3 , 'D1', 'N1', 'N-1' ), (4 , 'E2', 'N2', Null ), (5 , 'E2', 'N2', Null ), (6 , 'E2', 'N2', Null )
, (7 , 'X3', Null, Null ), (8 , 'X3', Null, 'N-3' ) , (9 , Null, Null, Null ), (10, Null, Null, Null ), (11, 'Z4', 'W2', 'N-44'), (12, 'Z4', 'W2', 'N-44' )
;WITH t AS (
SELECT DISTINCT
[code]
, COALESCE([name], max([name]) OVER(PARTITION BY [code])) AS [Name]
, COALESCE([sysname], COALESCE(MAX([sysname]) OVER(PARTITION BY [code], [name]), MAX([sysname]) OVER(PARTITION BY [code]))) AS [SysName]
FROM @t
WHERE [code] IS NOT NULL)
SELECT MIN(t2.id), t.Code, t.Name, t.SysName
from @t t2
INNER JOIN t ON t.code = t2.code AND ISNULL(t.[Name], 'Null') = ISNULL(t2.[Name], 'Null') AND ISNULL(t.[SysName], 'Null') = ISNULL(t2.[SysName], 'Null')
GROUP BY t.Code, t.Name, t.SysName
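Answer 4: (score: -1)
You need to understand the CASE expression; then you can change the conditions accordingly.
You can see sample code below. Just adapt the CASE in the WHERE clause to your requirements.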
;with C as
(
select Dense_rank() over(partition by code order by id) as rn,*
from Users
)
delete from C
where rn =
(case
when (code = 'd1' and name is not null and sysname !='') then 0
when (code = 'E1' and rn = 1) then 0
when (code = 'X3' and sysname!='') then 0
when (code = 'z4' and name is not null and sysname !='') then 0
else rn
end )
Output:
3 D1 N1 N-1
8 X3 N-3
11 Z4 W2 N-4-4
12 Z4 W2 N-44