Finding duplicate records in a table using SQL Server

Asked: 2012-03-24 07:08:27

Tags: sql sql-server sql-server-2005

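I am validating a table that holds transaction-level data from an e-commerce site, and I need to pinpoint the exact errors.

I would like your help finding duplicate records in a 50-column table on SQL Server.

Suppose my data is: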

OrderNo shoppername amountpayed city Item       
1       Sam         10          A    Iphone
1       Sam         10          A    Iphone--->>Duplication to be detected
1       Sam         5           A    Ipod
2       John        20          B    Macbook
3       John        25          B    Macbookair
4       Jack        5           A    Ipod

Suppose I use the following query:

Select shoppername,count(*) as cnt
from dbo.sales
group by shoppername
having count(*) > 1

It returns:

Sam  2
John 2

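But I don't want to find duplicates across just 1 or 2 columns. I want to find rows that are duplicated across all columns in my data. I want the result to be: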

1       Sam         10          A    Iphone

13 Answers:

Answer 0 (score: 62)

with x as   (select  *,rn = row_number()
            over(PARTITION BY OrderNo,item  order by OrderNo)
            from    #temp1)

select * from x
where rn > 1

You can delete the duplicates by replacing the select statement with:

delete x where rn > 1
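As a rough, runnable illustration of the ROW_NUMBER approach above, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. That substitution is an assumption made only so the example is self-contained, and window functions need SQLite 3.25 or later. The table name sales and the helper names make_sales and delete_duplicates are made up for illustration. SQLite cannot delete through a CTE the way `delete x where rn > 1` does in SQL Server, so the sketch deletes by rowid instead, and it partitions by every column of the question's sample data rather than only OrderNo and item.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "Sam", 10, "A", "Iphone"),
    (1, "Sam", 10, "A", "Iphone"),  # the duplicate to be detected
    (1, "Sam", 5, "A", "Ipod"),
    (2, "John", 20, "B", "Macbook"),
    (3, "John", 25, "B", "Macbookair"),
    (4, "Jack", 5, "A", "Ipod"),
]


def make_sales():
    """Build an in-memory copy of the question's sample table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(OrderNo INT, shoppername TEXT, amountpayed INT, city TEXT, Item TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def delete_duplicates(conn):
    """Keep one row from each group of identical rows and delete the rest."""
    cur = conn.execute(
        """
        DELETE FROM sales
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY OrderNo, shoppername, amountpayed, city, Item
                           ORDER BY OrderNo
                       ) AS rn
                FROM sales
            )
            WHERE rn > 1
        )
        """
    )
    return cur.rowcount  # number of rows removed


conn = make_sales()
removed = delete_duplicates(conn)
remaining = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(removed, remaining)
```

On SQL Server itself the CTE form from the answer is shorter; the rowid detour here exists only because of the SQLite stand-in.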

Answer 1 (score: 40)

SELECT OrderNo, shoppername, amountPayed, city, item, count(*) as cnt
FROM dbo.sales
GROUP BY OrderNo, shoppername, amountPayed, city, item
HAVING COUNT(*) > 1
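To see what grouping by every column returns on the question's sample data, here is a small sketch run against SQLite through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here, and the table name sales and the helper find_full_duplicates are assumptions for illustration.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "Sam", 10, "A", "Iphone"),
    (1, "Sam", 10, "A", "Iphone"),  # the duplicate to be detected
    (1, "Sam", 5, "A", "Ipod"),
    (2, "John", 20, "B", "Macbook"),
    (3, "John", 25, "B", "Macbookair"),
    (4, "Jack", 5, "A", "Ipod"),
]


def find_full_duplicates(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(OrderNo INT, shoppername TEXT, amountpayed INT, city TEXT, Item TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
    # Group by every column so only rows identical in all columns collapse together.
    return conn.execute(
        """
        SELECT OrderNo, shoppername, amountpayed, city, Item, COUNT(*) AS cnt
        FROM sales
        GROUP BY OrderNo, shoppername, amountpayed, city, Item
        HAVING COUNT(*) > 1
        """
    ).fetchall()


print(find_full_duplicates(ROWS))  # [(1, 'Sam', 10, 'A', 'Iphone', 2)]
```

With a 50-column table the column list gets long, but it has to appear in both the SELECT and the GROUP BY for the comparison to cover every column.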

Answer 2 (score: 4)

SQL> SELECT JOB,COUNT(JOB) FROM EMP GROUP BY JOB;

JOB       COUNT(JOB)
--------- ----------
ANALYST            2
CLERK              4
MANAGER            3
PRESIDENT          1
SALESMAN           4

Answer 3 (score: 3)

Just add all the fields to the query, and remember to add them to the GROUP BY as well.

Select shoppername, a, b, amountpayed, item, count(*) as cnt
from dbo.sales
group by shoppername, a, b, amountpayed, item
having count(*) > 1

Answer 4 (score: 3)

To get a list of records that occur more than once, use the following:

select field1,field2,field3, count(*)
  from table_name
  group by field1,field2,field3
  having count(*) > 1

Answer 5 (score: 1)

Try this:

SELECT MAX(shoppername), COUNT(*) AS cnt
FROM dbo.sales
GROUP BY CHECKSUM(*)
HAVING COUNT(*) > 1

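Read up on the CHECKSUM function first, because different rows can produce the same checksum, so rows that are not real duplicates may be reported.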

Answer 6 (score: 0)

with x as (
select shoppername, count(shoppername) as cnt  -- CTE columns need a name
              from sales
            group by shoppername
              having count(shoppername)>1)     -- HAVING must follow GROUP BY
select t.* from x,win_gp_pin1510 t
where x.shoppername=t.shoppername
order by t.shoppername

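Answer 7 (score: 0)

First, I suspect the result shown in the question is inaccurate: there appear to be three 'Sam' rows in the original table. But that is not essential to the question.

Now to the question itself. Given your table, the best way to show duplicate values is to use count(*) with a GROUP BY clause. The query looks like this: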

SELECT OrderNo, shoppername, amountPayed, city, item, count(*) as RepeatTimes
FROM dbo.sales
GROUP BY OrderNo, shoppername, amountPayed, city, item
HAVING COUNT(*) > 1

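The reason is that all the columns in your table, taken together, uniquely identify each record, so a record counts as a duplicate only when the values in every column are identical. You also want to show all fields of the duplicate records, so the GROUP BY must not leave out any column. If it did, you would lose those columns, because you can only select columns that participate in the GROUP BY clause.

Now I would like to give you an example of With...Row_Number()Over(...), which uses a table expression together with the Row_Number function.

Suppose you have a nearly identical table, but with one extra column called Shipping Date, whose value may differ even when all the other columns are the same. Here it is: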

OrderNo shoppername amountpayed city Item       Shipping Date
1       Sam         10          A    Iphone     2016-01-01
1       Sam         10          A    Iphone     2016-02-02
1       Sam         5           A    Ipod       2016-03-03
2       John        20          B    Macbook    2016-04-04
3       John        25          B    Macbookair 2016-05-05
4       Jack        5           A    Ipod       2016-06-06

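Note that row 2 is not a duplicate if you still treat all columns as a unit. But what if you want to treat such rows as duplicates in this case? You should use With...Row_Number()Over(...), and the query would look like this: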

WITH TABLEEXPRESSION AS
(SELECT *,
        ROW_NUMBER() OVER (PARTITION BY OrderNo, shoppername, amountPayed, city, item
                           ORDER BY [Shipping Date]) AS Identifier --if you consider the one with late shipping date as the duplicate
 FROM dbo.sales)
SELECT * FROM TABLEEXPRESSION
WHERE Identifier != 1 --or use '>1'

The query above returns the duplicate together with its shipping date, for example:

OrderNo shoppername amountpayed city Item   Shipping Date Identifier
1       Sam         10          A    Iphone 2016-02-02    2

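Note that this row is different from the one dated 2016-01-01. The 2016-02-02 row is the one picked out because of PARTITION BY OrderNo, shoppername, amountPayed, city, item ORDER BY [Shipping Date]: Shipping Date is not one of the columns considered when deciding whether records are duplicates, so the 2016-02-02 record is still a perfectly good result for your question.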
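The Shipping Date example can be tried end to end with the sketch below, which again uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or later). The column is named ShippingDate without a space, and the table name sales and the helper later_duplicates are hypothetical.

```python
import sqlite3

# Sample rows from this answer; dates are stored as ISO strings so they sort correctly.
ROWS = [
    (1, "Sam", 10, "A", "Iphone", "2016-01-01"),
    (1, "Sam", 10, "A", "Iphone", "2016-02-02"),
    (1, "Sam", 5, "A", "Ipod", "2016-03-03"),
    (2, "John", 20, "B", "Macbook", "2016-04-04"),
    (3, "John", 25, "B", "Macbookair", "2016-05-05"),
    (4, "Jack", 5, "A", "Ipod", "2016-06-06"),
]


def later_duplicates(rows):
    """Return rows that repeat an earlier row in every column except ShippingDate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (OrderNo INT, shoppername TEXT, amountpayed INT, "
        "city TEXT, Item TEXT, ShippingDate TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)", rows)
    # ShippingDate is left out of PARTITION BY and used only to order each group,
    # so the earliest shipment gets Identifier 1 and later ones count as duplicates.
    return conn.execute(
        """
        WITH numbered AS (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY OrderNo, shoppername, amountpayed, city, Item
                       ORDER BY ShippingDate
                   ) AS Identifier
            FROM sales
        )
        SELECT * FROM numbered WHERE Identifier > 1
        """
    ).fetchall()


print(later_duplicates(ROWS))  # [(1, 'Sam', 10, 'A', 'Iphone', '2016-02-02', 2)]
```

Changing the ORDER BY to descending would instead treat the earliest shipment as the duplicate.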

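To sum up: when you only want to show the columns listed in the GROUP BY clause, using count(*) together with GROUP BY is the best choice; otherwise you will miss the columns that do not participate in the GROUP BY.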

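With...Row_Number()Over(...), on the other hand, works in every scenario where you want to find duplicate records, but the query is more complicated to write and somewhat over-engineered compared with the former.

If your goal is to delete duplicate records from the table, you have to use the latter: WITH...ROW_NUMBER()OVER(...)...DELETE FROM...WHERE.

Hope this helps!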

Answer 8 (score: 0)

Try this:

with T1 AS
(
SELECT LASTNAME, COUNT(1) AS 'COUNT' FROM Employees GROUP BY LastName HAVING  COUNT(1) > 1
)
SELECT E.*,T1.[COUNT] FROM Employees E INNER JOIN T1 ON T1.LastName = E.LastName

Answer 9 (score: 0)

You can get the output using either of the following approaches:

 with Ctec AS
 (
select *,Row_number() over(partition by name order by Name)Rnk
 from Table_A
)
select  Name from ctec
where rnk>1

select name from Table_A
 group by name
 having count(*)>1

Answer 10 (score: -2)

select * from dbo.sales group by shoppername having (count(Item) > 1)

Answer 11 (score: -2)

select EventID, count(*) as cnt from dbo.EventInstances group by EventID having count(*) > 1

Answer 12 (score: -2)

Here is the working code:

SELECT abnno, COUNT(abnno)
FROM tbl_Name
GROUP BY abnno
HAVING ( COUNT(abnno) > 1 )