How to update older "duplicate" records (duplicates except for the date column)

Time: 2015-08-28 18:38:14

Tags: sql-server tsql duplicate-data

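We have a table that links each subscriber to the demographic questions (questionID) they have answered, with a date indicating when the subscriber answered each question. In some cases a subscriber answers the same question again later, so we now have multiple records for the same subscriber and questionID with different answer dates (see the sample data):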

subscriberID questionID  dateAnswered            isDeleted
------------ ----------- ----------------------- ---------
100          559         2015-07-29 13:07:26.153 0
100          560         2015-07-29 13:07:26.153 0
100          561         2015-07-29 13:07:26.153 0
100          562         2015-07-29 13:07:26.153 0
100          575         2015-07-29 13:07:26.153 0
102          559         2015-07-30 15:12:46.143 0
102          564         2015-07-30 15:12:46.143 0
102          588         2015-07-30 15:12:46.143 0
102          559         2015-07-31 16:11:53.323 0
114          575         2015-08-21 11:27:14.253 0
114          588         2015-08-21 11:27:14.253 0
114          560         2015-08-21 11:27:14.253 0
114          588         2015-08-24 05:44:42.030 0
114          562         2015-08-21 11:27:14.253 0
114          575         2015-08-24 05:44:42.030 0

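The application that stores the answers should have marked the older records as "deleted" (by setting isDeleted = 1), but it did not, and I now need to clean up the old records.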

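This seems like it should be simple, but it has me stumped. (a) How do I select the records where the same subscriberID and questionID appear more than once with different answer dates? (b) How do I write an update that sets isDeleted = 1 on every record except the most recent one for each subscriber and question?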

Any help would be greatly appreciated! I suspect a self-join may be in order, but I haven't worked it out yet. What a problem!

2 answers:

Answer 0: (score: 0)

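;WITH X AS
(
    -- Number the rows of each subscriber/question pair, newest answer first (rn = 1).
    SELECT ROW_NUMBER() OVER (PARTITION BY subscriberID, questionID
                              ORDER BY dateAnswered DESC) AS rn
         , *
    FROM TableName
)
-- Updating through the CTE flags every row except the newest one per pair.
UPDATE X
   SET isDeleted = 1
WHERE rn > 1;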
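
To preview which rows that update would touch (part (a) of the question), the same CTE can feed a SELECT instead of an UPDATE. This is a minimal sketch, assuming a table variable `@Answers` (a hypothetical stand-in for `TableName`) loaded with a few of the sample rows from the question:

```sql
-- @Answers is a hypothetical stand-in for TableName, holding sample rows from the question.
DECLARE @Answers TABLE (
    subscriberID INT,
    questionID   INT,
    dateAnswered DATETIME,
    isDeleted    BIT
);

INSERT INTO @Answers (subscriberID, questionID, dateAnswered, isDeleted)
VALUES (102, 559, '2015-07-30T15:12:46.143', 0),
       (102, 564, '2015-07-30T15:12:46.143', 0),
       (102, 559, '2015-07-31T16:11:53.323', 0),
       (114, 575, '2015-08-21T11:27:14.253', 0),
       (114, 575, '2015-08-24T05:44:42.030', 0);

WITH X AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY subscriberID, questionID
                              ORDER BY dateAnswered DESC) AS rn,
           *
    FROM @Answers
)
SELECT subscriberID, questionID, dateAnswered
FROM X
WHERE rn > 1;   -- older answers only; the newest row per pair has rn = 1
```

With these five rows, the query should return the 2015-07-30 answer for subscriber 102 / question 559 and the 2015-08-21 answer for subscriber 114 / question 575. Those are the rows the UPDATE form would mark as deleted.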

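Answer 1: (score: 0)

The select/update below affects every record that is not already marked as deleted, except the latest record for each subscriber and question. Just another approach.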

WITH LastAnswers AS
(
    SELECT    subscriberID, questionID, MAX(dateAnswered) AS LastAnsweredDate
    FROM      TableName
    GROUP BY  subscriberID, questionID
)
UPDATE TableName
   SET TableName.isDeleted = 1
FROM TableName
LEFT JOIN LastAnswers
       ON  TableName.subscriberID = LastAnswers.subscriberID
       AND TableName.questionID   = LastAnswers.questionID
       AND TableName.dateAnswered = LastAnswers.LastAnsweredDate
WHERE  LastAnswers.LastAnsweredDate IS NULL
  AND  TableName.isDeleted = 0
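
After running an update like the one above, it may be worth checking that no subscriber/question pair still has more than one active row. The sketch below is only an illustration: it assumes a hypothetical table variable `@Answers` in place of `TableName`, loads a few of the sample rows from the question, applies the same join logic, and then runs the check.

```sql
-- @Answers is a hypothetical stand-in for TableName, holding sample rows from the question.
DECLARE @Answers TABLE (
    subscriberID INT,
    questionID   INT,
    dateAnswered DATETIME,
    isDeleted    BIT
);

INSERT INTO @Answers (subscriberID, questionID, dateAnswered, isDeleted)
VALUES (102, 559, '2015-07-30T15:12:46.143', 0),
       (102, 564, '2015-07-30T15:12:46.143', 0),
       (102, 559, '2015-07-31T16:11:53.323', 0),
       (114, 575, '2015-08-21T11:27:14.253', 0),
       (114, 575, '2015-08-24T05:44:42.030', 0);

-- Same logic as the update above; the table variable is aliased so its columns can be qualified.
WITH LastAnswers AS
(
    SELECT subscriberID, questionID, MAX(dateAnswered) AS LastAnsweredDate
    FROM @Answers
    GROUP BY subscriberID, questionID
)
UPDATE a
   SET a.isDeleted = 1
FROM @Answers AS a
LEFT JOIN LastAnswers AS la
       ON  a.subscriberID = la.subscriberID
       AND a.questionID   = la.questionID
       AND a.dateAnswered = la.LastAnsweredDate
WHERE la.LastAnsweredDate IS NULL
  AND a.isDeleted = 0;

-- Check: list any subscriber/question pair that still has more than one active row.
SELECT subscriberID, questionID, COUNT(*) AS activeRows
FROM @Answers
WHERE isDeleted = 0
GROUP BY subscriberID, questionID
HAVING COUNT(*) > 1;
```

For these sample rows the check should come back empty. A non-empty result would point to rows that share the same latest dateAnswered, because the join keeps every row that matches the maximum date, whereas the ROW_NUMBER() approach keeps exactly one row per pair.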