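I have duplicate records in a table. I need to identify a single unique identifier for each duplicate so that I can delete it from the table.
The only way I can tell that a record is a duplicate is from the subject and description columns: if at least two records have the same subject and the same description, I need to delete one and keep one.
I am able to get the list of duplicated records, but I cannot get the unique identifier I need in order to delete them.
This is what I have done to identify the duplicate records.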
SELECT
p.accountid, p.subject, p.description, count(*) AS total
FROM
activities AS p
WHERE
(p.StateCode = 1) AND p.createdon >= getdate()-6
GROUP BY
p.accountid, p.subject, p.description
HAVING
count(*) > 1
ORDER BY
p.accountid
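There is a record_id column that holds the unique identifier for each record. But if I add record_id to my select statement, I get no results, because a unique identifier can never be duplicated.
How can I get the record_id using SQL Server?
Note: record_id is not an integer; it looks like "D32B275B-0B2F-4FF6-8089-00000FDA9E8E".
Thanks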
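Answer 0 (score: 4)
A nice feature of SQL Server is that you can use CTEs with update and delete statements.
You are looking for duplicate records and presumably want to keep either the lowest or the highest record_id in each group. You can get both the count and the id to keep with a CTE and window functions: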
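with todelete as (
SELECT p.record_id, p.accountid, p.subject, p.description,
COUNT(*) over (partition by p.accountid, p.subject, p.description) as total,
MIN(p.record_id) over (partition by p.accountid, p.subject, p.description) as IdToKeep
FROM activities AS p
WHERE (p.StateCode = 1) AND p.createdon >= getdate()-6
)
-- record_id must be selected in the CTE so the filter below can reference it.
-- If record_id is a uniqueidentifier column, some SQL Server versions reject MIN on it;
-- casting it to char(36) inside MIN is one possible workaround.
delete from todelete
where total > 1 and record_id <> IdToKeep;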
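The final where clause simply picks out the rows to delete: those in a group with more than one record whose record_id is not the one being kept.
I should add that if you only want the list of rows that would be deleted, you can use a similar query: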
with todelete as (
SELECT p.record_id, p.accountid, p.subject, p.description,
COUNT(*) over (partition by p.accountid, p.subject, p.description) as total,
MIN(p.record_id) over (partition by p.accountid, p.subject, p.description) as IdToKeep
FROM activities AS p
WHERE (p.StateCode = 1) AND p.createdon >= getdate()-6
)
select *
from todelete
where total > 1 and record_id <> IdToKeep;
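The over clause indicates that the function is being used as a window function. The idea is simple: Count(*) over returns the count of all records that share the same values in the partition by clause. It is very similar to an aggregate function, except that you get the value on every row instead of one row per group. This class of functions is very powerful, and I recommend learning more about it.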
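To make the difference between the aggregate and the window form concrete, here is a minimal sketch on made-up data. The table variable @demo, its integer record_id, and the sample values are assumptions for illustration only; they are not the real activities table.

```sql
-- Hypothetical sample data; @demo and its values are illustrative only.
DECLARE @demo TABLE (record_id int, subject varchar(20), description varchar(20));

INSERT INTO @demo (record_id, subject, description) VALUES
    (1, 'call',  'follow up'),
    (2, 'call',  'follow up'),
    (3, 'email', 'intro');

-- Aggregate form: one row per group, so record_id is no longer available.
SELECT subject, description, COUNT(*) AS total
FROM @demo
GROUP BY subject, description;

-- Window form: every row is kept, and each row carries its group's count.
SELECT record_id, subject, description,
       COUNT(*) OVER (PARTITION BY subject, description) AS total
FROM @demo;
```

In the second result, rows 1 and 2 both show a total of 2 while row 3 shows 1, which is why the window form can filter duplicates and still expose record_id.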
Answer 1 (score: 0)
Maybe something like this?
SELECT max(p.record_id), p.accountid, p.subject, p.description, count(*) AS total
FROM activities AS p
WHERE (p.StateCode = 1) AND p.createdon >= getdate()-6
GROUP BY p.accountid, p.subject, p.description
HAVING count(*) > 1
ORDER BY p.accountid
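The query above returns one record_id per duplicated group, which is the row to keep rather than the rows to remove. One way it might be turned into a delete is to join it back to the table, as sketched below. The alias IdToKeep and the derived table d are names introduced here for illustration, and the sketch assumes record_id has a type that MAX accepts (some SQL Server versions reject MAX on uniqueidentifier).

```sql
-- Sketch only: delete every duplicate except the row with the highest record_id.
-- Assumes MAX works on record_id's data type; IdToKeep and d are illustrative names.
DELETE a
FROM activities AS a
JOIN (
    SELECT p.accountid, p.subject, p.description,
           MAX(p.record_id) AS IdToKeep
    FROM activities AS p
    WHERE p.StateCode = 1
      AND p.createdon >= getdate() - 6
    GROUP BY p.accountid, p.subject, p.description
    HAVING COUNT(*) > 1
) AS d
    ON  a.accountid   = d.accountid
    AND a.subject     = d.subject
    AND a.description = d.description
WHERE a.StateCode = 1
  AND a.createdon >= getdate() - 6   -- same filter as the inner query
  AND a.record_id <> d.IdToKeep;     -- keep one row per group
```

Note that groups where accountid, subject, or description is NULL will not match in the join, so they would be left untouched by this version.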
Answer 2 (score: 0)
It seems to me that you need to run an inner query first and then join it back to the full table to get what you want.
SELECT ALL
*
FROM (SELECT p.accountid
FROM activities AS p
WHERE p.statecode = 1 AND p.createdon >= getdate()-6
GROUP BY p.accountid
HAVING count(*) > 1) AS x
JOIN activities AS a ON x.accountid = a.accountid
ORDER BY a.accountid
Answer 3 (score: 0)
Try this:
;with recordsToDelete as (
SELECT
p.record_id
-- ROW_NUMBER needs PARTITION BY and an ORDER BY inside OVER
,Row_Number() OVER(partition by p.subject, p.description order by p.record_id) as rowNum
FROM activities AS p
)
select
*
from recordsToDelete
where rowNum > 1
If the result looks right, you can replace the select with:
delete from recordsToDelete
where rowNum > 1
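As a cautious variant of the same idea, the delete could be run inside a transaction so the number of affected rows can be checked first. This is only a sketch: it assumes the accountid, StateCode, and createdon filters from the question should also apply, and the CTE name ranked is made up here.

```sql
BEGIN TRANSACTION;

-- Number the rows in each duplicate group; row 1 is the one that is kept.
WITH ranked AS (
    SELECT p.record_id,
           ROW_NUMBER() OVER (
               PARTITION BY p.accountid, p.subject, p.description
               ORDER BY p.record_id
           ) AS rowNum
    FROM activities AS p
    WHERE p.StateCode = 1
      AND p.createdon >= DATEADD(day, -6, GETDATE())
)
DELETE FROM ranked
WHERE rowNum > 1;

-- Report how many rows the delete touched.
SELECT @@ROWCOUNT AS deleted_rows;

-- Undo by default; switch to COMMIT TRANSACTION once the count looks right.
ROLLBACK TRANSACTION;
```

Because ROW_NUMBER only needs an ORDER BY rather than MIN or MAX, this form should also work when record_id is a uniqueidentifier.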