How do I remove duplicates from this SQL Server table?

Time: 2010-11-08 22:13:45

Tags: sql sql-server-2008 duplicates

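I have a table in a SQL Server 2008 R2 database. Every few seconds I import data into this table. At one point the import failed, so it kept importing the same data over and over, creating duplicates. (Basically, if the import read 20 rows, imported 19 and failed on the 20th, those 19 were not in a transaction, so they were inserted anyway.)

Anyway, I'm trying to figure out how to delete all the duplicates except for the first (original) inserted row.

Here is the table schema - note that some of the fields are nullable.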

CREATE TABLE [dbo].[LogEntries](
    [LogEntryId] [int] IDENTITY(1,1) NOT NULL,
    [GameFileId] [int] NOT NULL,
    [CreatedOn] [datetimeoffset](7) NOT NULL,
    [EventTypeId] [tinyint] NOT NULL,
    [Message] [nvarchar](max) NULL,
    [Code] [int] NULL,
    [Violation] [nvarchar](100) NULL,
    [ClientName] [nvarchar](100) NULL,
    [ClientGuid] [nvarchar](50) NULL,
    [ClientGuidReversed] [nvarchar](50) NULL,
    [ClientIpAndPort] [nvarchar](50) NULL,
 CONSTRAINT [PK_LogEntries] PRIMARY KEY CLUSTERED 
(
    [LogEntryId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

Cheers :)
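
For reference, the failure mode described above (19 of 20 rows committed before the 20th failed) is what a transaction around each import batch is meant to prevent. The following is only a minimal sketch: it assumes the batch is first loaded into a staging table, and the name dbo.LogEntriesStaging is hypothetical, since the question does not say how the import is implemented.

```sql
-- Sketch only: dbo.LogEntriesStaging is a hypothetical staging table that
-- holds one import batch; the real import mechanism is not shown in the question.
SET XACT_ABORT ON;  -- a run-time error dooms the whole transaction

BEGIN TRY
    BEGIN TRANSACTION;

    -- Copy the batch; LogEntryId is omitted because it is an IDENTITY column.
    INSERT INTO [dbo].[LogEntries]
        ([GameFileId], [CreatedOn], [EventTypeId], [Message], [Code],
         [Violation], [ClientName], [ClientGuid], [ClientGuidReversed],
         [ClientIpAndPort])
    SELECT
         [GameFileId], [CreatedOn], [EventTypeId], [Message], [Code],
         [Violation], [ClientName], [ClientGuid], [ClientGuidReversed],
         [ClientIpAndPort]
    FROM [dbo].[LogEntriesStaging];

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo everything from this batch so a retry starts from a clean state.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- SQL Server 2008 has no THROW (added in 2012), so re-raise with RAISERROR.
    DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
    RAISERROR('%s', 16, 1, @msg);
END CATCH;
```

A single INSERT ... SELECT is already atomic on its own; the explicit transaction matters mostly when a batch is made up of several statements, such as one INSERT per row.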

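Update: what is a duplicate (in this case)?

Damn, sorry. I forgot to define what a duplicate is. LogEntryId is unique, so please ignore it (it is not imported). All the rest of the data is imported. Here are two rows with identical data.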

6459749 39  2010-11-05 00:00:25.0000000 +11:00  6   Violation (MULTIHACK) #70805    70805   MULTIHACK   angelb  aeda202c22ed41f7301d0673647c55d8    8d55c7463760d1037f14de22c202adea    220.246.157.194:57133
6459766 39  2010-11-05 00:00:25.0000000 +11:00  6   Violation (MULTIHACK) #70805    70805   MULTIHACK   angelb  aeda202c22ed41f7301d0673647c55d8    8d55c7463760d1037f14de22c202adea    220.246.157.194:57133

And compare that with the top 5 rows, sorted descending:
6505931 40  2010-11-08 23:39:16.0000000 +11:00  4   NULL    NULL    NULL    Zaphrolio   69ae1bfea616c244e5c223e51d5ceb8e    e8bec5d15e322c5e442c616aefb1ea96    175.38.209.80:10000
6505930 39  2010-11-08 23:39:04.0000000 +11:00  3   NULL    NULL    NULL    imBakedAsBro    8cf1b3b6a389229fa4adeec07dc087ce    ec780cd70ceeda4af922983a6b3b1fc8    110.175.83.45:10000
6505929 39  2010-11-08 23:39:03.0000000 +11:00  2   NULL    NULL    NULL    imBakedAsBro    NULL    NULL    110.175.83.45:10000
6505928 80  2010-11-08 23:39:04.0000000 +11:00  4   NULL    NULL    NULL    Asmo74  5ccf5ee85a6cf08da563bdcbfe75351d    d15357efbcdb365ad80fc6a58ee5fcc5    61.68.212.231:50273
6505927 80  2010-11-08 23:39:03.0000000 +11:00  4   NULL    NULL    NULL    McJellyfish c48218542918bec900a331a81e0a9d05    50d9a0e18a133a009ceb81924581284c    60.225.3.2:10000
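
The definition in the update (every column except LogEntryId matches) can be checked before anything is deleted. Here is a sketch of a grouping query that lists each set of identical rows; the aliases Copies and FirstLogEntryId are made up for illustration. GROUP BY puts NULLs into the same group, so the nullable columns do not stop two rows from matching.

```sql
-- Sketch: list every group of rows that match on all imported columns.
-- LogEntryId is left out of the grouping because it is generated on insert.
SELECT
      [GameFileId]
    , [CreatedOn]
    , [EventTypeId]
    , [Message]
    , [Code]
    , [Violation]
    , [ClientName]
    , [ClientGuid]
    , [ClientGuidReversed]
    , [ClientIpAndPort]
    , COUNT(*)          AS Copies            -- how many identical rows exist
    , MIN([LogEntryId]) AS FirstLogEntryId   -- the original row to keep
FROM [dbo].[LogEntries]
GROUP BY
      [GameFileId]
    , [CreatedOn]
    , [EventTypeId]
    , [Message]
    , [Code]
    , [Violation]
    , [ClientName]
    , [ClientGuid]
    , [ClientGuidReversed]
    , [ClientIpAndPort]
HAVING COUNT(*) > 1
ORDER BY FirstLogEntryId;
```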

2 answers:

Answer 0: (score: 5)

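-- Number the rows inside each group of identical imported values.
-- Ordering by LogEntryId gives the first inserted row rn = 1.
with cte as (
 select row_number() over (
   partition by  
      [GameFileId]
    , [CreatedOn]
    , [EventTypeId]
    , [Message]
    , [Code]
    , [Violation]
    , [ClientName]
    , [ClientGuid]
    , [ClientGuidReversed]
    , [ClientIpAndPort]
 order by [LogEntryId]) as rn
 from LogEntries)
-- Deleting through the CTE removes the underlying rows; rn = 1 is kept.
delete from cte
 where rn > 1;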
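
Before running the delete, it may be worth previewing which rows it would remove. This is a sketch of the same CTE with the DELETE swapped for a SELECT; it only adds LogEntryId to the output and changes no data.

```sql
-- Sketch: same partitioning as the delete above, but read-only.
WITH cte AS (
    SELECT
        [LogEntryId],
        ROW_NUMBER() OVER (
            PARTITION BY
                  [GameFileId]
                , [CreatedOn]
                , [EventTypeId]
                , [Message]
                , [Code]
                , [Violation]
                , [ClientName]
                , [ClientGuid]
                , [ClientGuidReversed]
                , [ClientIpAndPort]
            ORDER BY [LogEntryId]) AS rn
    FROM [dbo].[LogEntries])
-- Every row returned here is one the delete would remove.
SELECT [LogEntryId], rn
FROM cte
WHERE rn > 1
ORDER BY [LogEntryId];
```

Of the two sample rows in the question, 6459766 can never get rn = 1, because 6459749 is an identical row with a lower LogEntryId.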

Answer 1: (score: 0)

In the absence of any other information, a UNION (not UNION ALL, which keeps duplicates) is usually a good way to dodge dupes

select * from table
union 
select * from table
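
Note that against the LogEntries table from the question, select * also returns LogEntryId, which is unique per row, so no rows would collapse. Below is a sketch of the same UNION idea with an explicit column list, assuming the duplicate definition from the question's update (everything except LogEntryId). It only returns the de-duplicated set; it does not delete anything from the table.

```sql
-- Sketch: UNION removes duplicate rows from the combined result,
-- so listing only the imported columns collapses the repeated imports.
SELECT [GameFileId], [CreatedOn], [EventTypeId], [Message], [Code],
       [Violation], [ClientName], [ClientGuid], [ClientGuidReversed],
       [ClientIpAndPort]
FROM [dbo].[LogEntries]
UNION
SELECT [GameFileId], [CreatedOn], [EventTypeId], [Message], [Code],
       [Violation], [ClientName], [ClientGuid], [ClientGuidReversed],
       [ClientIpAndPort]
FROM [dbo].[LogEntries];
```

A SELECT DISTINCT over the same column list should give the same rows with a single scan of the table.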

Edited to fix the typo pointed out in the comments...