I'm trying to flag duplicate records, but some of them end up with the wrong group assignment and I can't work out why.
Data:
FirstName | LastName | Company | Group | Status | ID
x | x | x | NULL | NULL | 1
x | x | x | NULL | NULL | 2
I then run this query to find matches on FirstName, LastName and Company, and join the result back to the main table to flag the records:
with d as (
    select ID, FirstName, LastName, Company,
           row_number() over (partition by FirstName, LastName, Company
                              order by FirstName, LastName, Company) as nr
    from [dbo].xx
)
update b
set Status = 'S',
    [Group] = d.ID
from xx as b
inner join d on
    b.FirstName = d.FirstName and
    b.LastName = d.LastName and
    b.Company = d.Company
where d.nr = 1
Then I update the master record with 'P':
update b
set Status = 'P'
from xx as b
where b.ID = b.[Group]
GO
What I expect:
FirstName | LastName | Company | Group | Status | ID
x | x | x | 1 | P | 1
x | x | x | 1 | S | 2
What I get:
FirstName | LastName | Company | Group | Status | ID
x | x | x | 2 | S | 1
x | x | x | 1 | S | 2
I'm working with roughly 1M records, and this only happens to some of them!
Answer 0: (score: 1)
Try this:
;with d as (
    select
        ID,
        FirstName,
        LastName,
        Company,
        row_number() over (
            partition by FirstName, LastName, Company
            order by ID asc -- order by ID so the numbering follows the ID order
        ) as nr
    from [dbo].xx
),
e as (
    select * from d where nr = 1
)
-- e keeps only the nr = 1 rows, which are then joined to all matching records
update b
set Status = case when e.ID = b.ID then 'P' else 'S' end,
    -- the case expression gives the row whose ID matches 'P' and all others 'S'
    [Group] = e.ID
from xx as b
inner join e on
    b.FirstName = e.FirstName and
    b.LastName = e.LastName and
    b.Company = e.Company
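Since the problem only shows up on some of the rows, a quick sanity check after the update may help. The following is a minimal sketch, assuming the same dbo.xx table and the [Group] and Status columns from the question; PrimaryCount is just an illustrative alias. It lists every group that does not have exactly one 'P' row, so an empty result would suggest the flags are consistent.

```sql
-- Sketch: find groups that do not have exactly one primary ('P') record.
-- Assumes dbo.xx with the [Group] and Status columns shown in the question.
SELECT
    [Group],
    SUM(CASE WHEN Status = 'P' THEN 1 ELSE 0 END) AS PrimaryCount
FROM dbo.xx
WHERE [Group] IS NOT NULL          -- rows never touched by the update are skipped
GROUP BY [Group]
HAVING SUM(CASE WHEN Status = 'P' THEN 1 ELSE 0 END) <> 1;
```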
Answer 1: (score: 1)
You could try something like the following:
;WITH RankedData AS
(
SELECT
T.ID,
T.[Group],
T.Status,
T.FirstName,
T.LastName,
T.Company,
GroupRanking = ROW_NUMBER() OVER (PARTITION BY T.FirstName, T.LastName, T.Company ORDER BY T.ID ASC)
FROM
dbo.xx AS T
)
UPDATE T SET
[Group] = N.ID,
Status = CASE WHEN T.GroupRanking = 1 THEN 'P' ELSE 'S' END
FROM
RankedData AS T
INNER JOIN RankedData AS N ON
T.FirstName = N.FirstName AND
T.LastName = N.LastName AND
T.Company = N.Company AND
N.GroupRanking = 1
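Keep in mind that the INNER JOIN only matches rows whose names and company are non-NULL, so bear that in mind if any of those columns contain NULL values.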
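One possible way around the NULL caveat is to skip the self-join and let a window function pick the master ID, because PARTITION BY places NULLs in the same partition. This is only a sketch under the assumption that the table and columns are as in the question; MasterID is a name introduced here for illustration, and whether rows with NULL names should really count as duplicates of each other is a decision for your data.

```sql
-- Sketch: assign group and status without a join, so NULL names/companies
-- are grouped together instead of being dropped by the INNER JOIN.
-- Assumes dbo.xx with ID, FirstName, LastName, Company, [Group], Status.
;WITH Ranked AS
(
    SELECT
        T.ID,
        T.[Group],
        T.Status,
        -- lowest ID in each (FirstName, LastName, Company) partition
        MasterID = MIN(T.ID) OVER (PARTITION BY T.FirstName, T.LastName, T.Company)
    FROM
        dbo.xx AS T
)
UPDATE Ranked SET
    [Group] = MasterID,
    Status = CASE WHEN ID = MasterID THEN 'P' ELSE 'S' END;
```

With the two sample rows from the question, this would be expected to set Group to 1 for both rows, with Status 'P' for ID 1 and 'S' for ID 2.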