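Please help me write a query to delete duplicates. See the results below; I added a Status column that I filled in manually. Category is what determines whether a record is a duplicate. In this case our main focus is cancellations. If a member's cancellation is followed by a reinstatement, as for member Y0007, it is not considered a duplicate. But if a member has more than one cancellation, those are considered duplicates, because if we count cancellations they would all be counted, which gives an incorrect result. We need to count each member only once. A cancellation can be made by User or User1, and User1 can make multiple cancellations. Can you help me write a query that ensures there are no duplicates, so that such a member has one record instead of two duplicate records?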
CreateYear  MonthDay  Category               Member  Status
2014        July 1    Cancellation by User   Y0007
2014        July 1    Reinstatement by User  Y0007   not duplicate
2014        July 2    Cancellation by User   Y0007
2014        July 2    Reinstatement by User  Y0007
2014        July 1    Cancellation by User   O0031   not duplicate
2014        July 8    Reinstatement by User  O0031
2014        July 1    Cancellation by User   O0135   not duplicate
2014        July 8    Reinstatement by User  O0135
2014        July 3    Cancellation by User   P0422   duplicate
2014        July 4    Cancellation by User2  P0422
2014        July 4    Cancellation by User   E3488   not duplicate
2014        July 8    Reinstatement by User  E3488
Answer 0 (score: 0)
You can write it like this:
DELETE FROM TABLE_NAME
WHERE member NOT IN (SELECT MAX(member) FROM TABLE_NAME
GROUP BY CreateYear, MonthDay, Category, Member, Status);
I hope this solves your problem.
Answer 1 (score: 0)
;WITH TempCte AS (
    SELECT CreateYear, MonthDay, Category, Member,
           MemberCount = ROW_NUMBER() OVER (
               PARTITION BY CreateYear, MonthDay, Category, Member, Status
               ORDER BY CreateYear)
    FROM TableName
)
DELETE TempCte
WHERE MemberCount > 1
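Answer 2 (score: 0)
Strategy
To delete the duplicates, let's first write some code that selects only the duplicates. If you can write the correct select statement, it is easy to turn it into a delete statement. The select statement is a bit tricky, because we need to compare dates, but your date is split across three columns and the month is not stored as a number. However, if we concatenate the three columns and cast the result as a date, we can compare date values and look for cancellations that occur several times in a row.
Query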
Select *
From testtable t
Where category = 'Cancellation'
  and exists (Select 1
              From testtable t2
              Where t2.category = t.category
                and t2.Member = t.Member
                and Cast(t2.CreateMonth + ' ' + cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)) as date) >
                    Cast(t.CreateMonth + ' ' + cast(t.CreateDay as varchar(2)) + ' ' + Cast(t.CreateYear as varchar(4)) as date)
                and not exists (Select 1
                                From testtable t3
                                Where t3.category = 'Reinstatement'
                                  and t3.Member = t.Member
                                  and Cast(t3.CreateMonth + ' ' + cast(t3.CreateDay as varchar(2)) + ' ' + Cast(t3.CreateYear as varchar(4)) as date) >=
                                      Cast(t.CreateMonth + ' ' + cast(t.CreateDay as varchar(2)) + ' ' + Cast(t.CreateYear as varchar(4)) as date)
                                  and Cast(t2.CreateMonth + ' ' + cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)) as date) >=
                                      Cast(t3.CreateMonth + ' ' + cast(t3.CreateDay as varchar(2)) + ' ' + Cast(t3.CreateYear as varchar(4)) as date)
                               )
             )
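Explanation
This query looks a bit intense, but let's talk through it. First, select * From testtable t Where category = 'Cancellation'. We are probably used to all of this except the alias. I have named this specific instance of testtable t. I have basically given it the nickname t so that, rather than referring to the table by name (and getting confused because I use the same table three times), I can refer to it as t.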
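Now, exists. This may or may not be syntax you have seen before. Exists returns true if the subquery inside the parentheses returns at least one row, and false if it returns nothing. I am using this exists to check whether testtable contains a record that is more recent than the cancellation we are looking at and is also a cancellation.
This is where the dates come in. We only want to know whether there is another cancellation whose date is later than the cancellation we are looking at. For member P0422, when we decide whether the July 3 cancellation is a duplicate, we expect to find the cancellation on July 4.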
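In the subselect I am using not exists. The last thing we need to check is whether a reinstatement exists between the two cancellations. If there is one, we want to ignore the cancellation and move on to the next row.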
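To make the exists / not exists logic concrete, here is a minimal sketch that runs the same pattern against the question's sample rows. It is not the answer's T-SQL: it uses Python's built-in sqlite3 module, and several things are assumptions made for illustration. The three date columns are collapsed into one ISO-formatted text column named created, the Category values are shortened to 'Cancellation' and 'Reinstatement' to match the query above (the question's data shows longer values such as "Cancellation by User"), and the month names are assumed to be English.

```python
import sqlite3
from datetime import datetime

# Sample rows from the question: (year, month, day, category, member).
# Category is shortened to match the literals used in the answer's query.
ROWS = [
    (2014, "July", 1, "Cancellation", "Y0007"),
    (2014, "July", 1, "Reinstatement", "Y0007"),
    (2014, "July", 2, "Cancellation", "Y0007"),
    (2014, "July", 2, "Reinstatement", "Y0007"),
    (2014, "July", 1, "Cancellation", "O0031"),
    (2014, "July", 8, "Reinstatement", "O0031"),
    (2014, "July", 1, "Cancellation", "O0135"),
    (2014, "July", 8, "Reinstatement", "O0135"),
    (2014, "July", 3, "Cancellation", "P0422"),
    (2014, "July", 4, "Cancellation", "P0422"),
    (2014, "July", 4, "Cancellation", "E3488"),
    (2014, "July", 8, "Reinstatement", "E3488"),
]


def iso(year, month, day):
    # SQLite has no date type, so store an ISO string that sorts chronologically.
    return datetime.strptime(f"{month} {day} {year}", "%B %d %Y").date().isoformat()


def find_duplicates(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testtable (category TEXT, member TEXT, created TEXT)")
    conn.executemany(
        "INSERT INTO testtable VALUES (?, ?, ?)",
        [(c, m, iso(y, mo, d)) for (y, mo, d, c, m) in rows],
    )
    sql = """
        SELECT t.member, t.created
        FROM testtable t
        WHERE t.category = 'Cancellation'
          AND EXISTS (                      -- a later cancellation for the same member...
              SELECT 1 FROM testtable t2
              WHERE t2.category = t.category
                AND t2.member = t.member
                AND t2.created > t.created
                AND NOT EXISTS (            -- ...with no reinstatement in between
                    SELECT 1 FROM testtable t3
                    WHERE t3.category = 'Reinstatement'
                      AND t3.member = t.member
                      AND t3.created >= t.created
                      AND t2.created >= t3.created))
        ORDER BY t.member, t.created
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


duplicates = find_duplicates(ROWS)
print(duplicates)
```

With this data only the July 3 cancellation for P0422 is flagged. Y0007 also has two cancellations, but a reinstatement falls between them, so neither one is selected.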
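Casting dates in depth
I use syntax like Cast(t2.CreateMonth + ' '+ cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)) as date) many times in this query to work out the date of a record. I do this because the date of the cancellation (or reinstatement) is split across three columns.
Let's look at what happens with the following record.
CreateYear  MonthDay  Category              Member  Status
2014        July 1    Cancellation by User  Y0007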
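First comes the concatenation: t2.CreateMonth + ' '+ cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)). This part grabs the separate pieces of information from each column and sticks them together. The year and day are integers rather than varchars, so I first convert them to varchars. The result for this row is July 1 2014.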
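Then we Cast (... as date). Casting takes a piece of information and converts it to a different data type. So this tells SQL to treat July 1 2014 as a date instead of a string. All of this is done so that we can compare dates. SQL knows how to tell which of two dates is more recent, which is why I convert these values. Instead of doing this, you could compare each piece of the date separately, but either way it is still a lot of work.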
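The same concatenate-then-convert idea can be sketched outside the database. The snippet below is a Python analogue, not the T-SQL cast itself: the helper name to_date is hypothetical, and the "%B" format code assumes English month names under the default locale. It also shows why the conversion matters, since comparing the glued-together strings directly gives the wrong order.

```python
from datetime import date, datetime


def to_date(year: int, month_name: str, day: int) -> date:
    # Glue the pieces into text such as "July 1 2014", then parse it as a date.
    text = f"{month_name} {day} {year}"
    return datetime.strptime(text, "%B %d %Y").date()


first = to_date(2014, "July", 3)
second = to_date(2014, "July", 4)
print(first < second)  # real dates compare chronologically

# Comparing the raw strings compares characters, not calendar order:
# "July 10 2014" sorts before "July 3 2014" because '1' < '3'.
print("July 10 2014" < "July 3 2014")
print(to_date(2014, "July", 10) < to_date(2014, "July", 3))
```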
Let's delete!
Now that we have found all the duplicate rows, we can easily change the query to delete them.
Delete t
From testtable t
Where category = 'Cancellation'
  and exists (Select 1
              From testtable t2
              Where t2.category = t.category
                and t2.Member = t.Member
                and Cast(t2.CreateMonth + ' ' + cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)) as date) >
                    Cast(t.CreateMonth + ' ' + cast(t.CreateDay as varchar(2)) + ' ' + Cast(t.CreateYear as varchar(4)) as date)
                and not exists (Select 1
                                From testtable t3
                                Where t3.category = 'Reinstatement'
                                  and t3.Member = t.Member
                                  and Cast(t3.CreateMonth + ' ' + cast(t3.CreateDay as varchar(2)) + ' ' + Cast(t3.CreateYear as varchar(4)) as date) >=
                                      Cast(t.CreateMonth + ' ' + cast(t.CreateDay as varchar(2)) + ' ' + Cast(t.CreateYear as varchar(4)) as date)
                                  and Cast(t2.CreateMonth + ' ' + cast(t2.CreateDay as varchar(2)) + ' ' + Cast(t2.CreateYear as varchar(4)) as date) >=
                                      Cast(t3.CreateMonth + ' ' + cast(t3.CreateDay as varchar(2)) + ' ' + Cast(t3.CreateYear as varchar(4)) as date)
                               )
             )
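Before running a delete like this against real data, it can help to try it inside a transaction and roll it back. The sketch below illustrates that habit with Python's sqlite3 module. It is an assumption-laden stand-in rather than the T-SQL above: the table uses a single ISO text column named created, the category values are shortened, only a few of the question's rows are loaded, and the delete is phrased with rowid IN (...) because the Delete t From ... form is SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (category TEXT, member TEXT, created TEXT)")
conn.executemany("INSERT INTO testtable VALUES (?, ?, ?)", [
    ("Cancellation", "Y0007", "2014-07-01"),
    ("Reinstatement", "Y0007", "2014-07-01"),
    ("Cancellation", "Y0007", "2014-07-02"),
    ("Cancellation", "P0422", "2014-07-03"),
    ("Cancellation", "P0422", "2014-07-04"),
])
conn.commit()

# Same duplicate test as the select, wrapped so that matching rows are deleted.
DELETE_SQL = """
    DELETE FROM testtable WHERE rowid IN (
        SELECT t.rowid
        FROM testtable t
        WHERE t.category = 'Cancellation'
          AND EXISTS (
              SELECT 1 FROM testtable t2
              WHERE t2.category = t.category
                AND t2.member = t.member
                AND t2.created > t.created
                AND NOT EXISTS (
                    SELECT 1 FROM testtable t3
                    WHERE t3.category = 'Reinstatement'
                      AND t3.member = t.member
                      AND t3.created >= t.created
                      AND t2.created >= t3.created)))
"""

COUNT_SQL = "SELECT COUNT(*) FROM testtable"
before = conn.execute(COUNT_SQL).fetchone()[0]

deleted = conn.execute(DELETE_SQL).rowcount  # sqlite3 opens a transaction implicitly
remaining = conn.execute(
    "SELECT member, created FROM testtable "
    "WHERE category = 'Cancellation' ORDER BY member, created"
).fetchall()

conn.rollback()  # undo the trial run; call conn.commit() instead once the result looks right
after_rollback = conn.execute(COUNT_SQL).fetchone()[0]

print(before, deleted, remaining, after_rollback)
```

In SQL Server the equivalent safety net would be an explicit transaction around the Delete statement, inspected with a select and then committed or rolled back.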
Answer 3 (score: -1)
You can use grouping:
Select CreateYear, MonthDay, Category, Member, Status
From tblname
Group By CreateYear, MonthDay, Category, Member, Status