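We have a table that links each subscriber to the demographic questions (questionID) they have answered, with a date indicating when the subscriber answered each question. In some cases a subscriber answers the same question again later, so we now have multiple records for the same subscriber and questionID with different answer dates (see the sample data):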
subscriberID questionID  dateAnswered            isDeleted
------------ ----------- ----------------------- ---------
100          559         2015-07-29 13:07:26.153 0
100          560         2015-07-29 13:07:26.153 0
100          561         2015-07-29 13:07:26.153 0
100          562         2015-07-29 13:07:26.153 0
100          575         2015-07-29 13:07:26.153 0
102          559         2015-07-30 15:12:46.143 0
102          564         2015-07-30 15:12:46.143 0
102          588         2015-07-30 15:12:46.143 0
102          559         2015-07-31 16:11:53.323 0
114          575         2015-08-21 11:27:14.253 0
114          588         2015-08-21 11:27:14.253 0
114          560         2015-08-21 11:27:14.253 0
114          588         2015-08-24 05:44:42.030 0
114          562         2015-08-21 11:27:14.253 0
114          575         2015-08-24 05:44:42.030 0
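The application that stores the answers was supposed to mark the old records as "deleted" (set isDeleted = 1), but it did not, and I now need to clean up the old records.
This seems like it should be simple, but it has me stumped. (a) How do I select every record whose subscriberID and questionID are duplicated but whose answer date differs? (b) How do I write an update that sets isDeleted = 1 on every record except the most recent one for each subscriber and question?
Any help would be greatly appreciated! I suspect a self-join may be in order, but I haven't worked it out yet. What a problem!
Answer 0: (score: 0)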
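;WITH X AS
(
    -- Number the rows of each subscriber/question pair, newest answer first
    SELECT ROW_NUMBER() OVER (PARTITION BY subscriberID, questionID
                              ORDER BY dateAnswered DESC) AS rn
         , *
    FROM TableName
)
-- rn = 1 is the latest answer; every older row is flagged as deleted
UPDATE X
SET isDeleted = 1
WHERE rn > 1;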
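Part (a) of the question, listing the affected rows, can be handled by feeding the same CTE into a SELECT before running the update. This is only a sketch: it assumes the table is called TableName as in the answer above, and the CTE name Ranked is made up for illustration.

```sql
;WITH Ranked AS
(
    -- Same ranking as the update: newest answer per pair gets rn = 1
    SELECT subscriberID, questionID, dateAnswered, isDeleted,
           ROW_NUMBER() OVER (PARTITION BY subscriberID, questionID
                              ORDER BY dateAnswered DESC) AS rn
    FROM TableName
)
-- Preview of the rows the update would flag (everything but the newest)
SELECT subscriberID, questionID, dateAnswered, isDeleted
FROM Ranked
WHERE rn > 1
ORDER BY subscriberID, questionID, dateAnswered;
```

Against the sample data this should list three rows: 102/559 from 2015-07-30, and 114/575 and 114/588 from 2015-08-21. If two rows of the same pair share an identical dateAnswered, ROW_NUMBER() picks one of them as rn = 1 arbitrarily, so a tie-breaking column in the ORDER BY may be worth adding if the table has one.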
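Answer 1: (score: 0)
The select/update below affects every record not yet marked as deleted, except the latest record for each subscriber and question. Just another approach.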
WITH LastAnswers AS
(
    -- Latest answer date for each subscriber/question pair
    SELECT subscriberID, questionID, MAX(dateAnswered) AS LastAnsweredDate
    FROM TableName
    GROUP BY subscriberID, questionID
)
UPDATE TableName
SET TableName.isDeleted = 1
FROM TableName
LEFT JOIN LastAnswers
    ON TableName.subscriberID = LastAnswers.subscriberID
    AND TableName.questionID = LastAnswers.questionID
    AND TableName.dateAnswered = LastAnswers.LastAnsweredDate
-- No match on the latest date means the row is an older answer
WHERE LastAnswers.LastAnsweredDate IS NULL AND TableName.isDeleted = 0;
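After running either update, a quick sanity check is to look for subscriber/question pairs that still have more than one live row. The query below is a sketch under the same assumption about the table name; the column alias liveRows is made up for illustration.

```sql
-- Pairs that still have more than one non-deleted answer
SELECT subscriberID, questionID, COUNT(*) AS liveRows
FROM TableName
WHERE isDeleted = 0
GROUP BY subscriberID, questionID
HAVING COUNT(*) > 1;
```

An empty result means the cleanup worked. With the MAX(dateAnswered) approach, rows that tie on the latest date are all kept, so such pairs would still show up here.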