T-SQL: removing "duplicate/uninteresting" rows

Time: 2014-01-23 07:49:44

Tags: sql-server tsql sql-server-2012

I have the following data set (sample shown):

ID         Status  Code   Type       ModDate
1234       1       1      AB         1995-04-01
1234       1       1      CD         1998-08-31
1234       1       1      AB         2003-08-31
1234       1       NULL   AB         2008-11-08
1234       1       2      AB         2013-11-09
1234       1       1      EF         2013-11-18
...

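Because this data has to be viewed on some kind of timeline, I want to read only the following rows from the database, since only changes in Type are of interest: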

ID         Status  Code   Type       ModDate
1234       1       1      AB         1995-04-01
1234       1       1      CD         1998-08-31
1234       1       1      AB         2003-08-31
1234       1       1      EF         2013-11-18
...

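How can I do this? I tried partitioning the data and assigning row numbers, but because the rows end up grouped by Type, this is giving me a headache.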

SELECT
    ID, Status, Code, Type, ModDate,
    MIN(ModDate) OVER (PARTITION BY ID, Type) MinModDate,
    MAX(ModDate) OVER (PARTITION BY ID, Type) MaxModDate,
    ROW_NUMBER() OVER (PARTITION BY ID, Type ORDER BY ModDate) RowNumber
FROM Data

Output:

ID         Status  Code   Type     ModDate       MinModDate    MaxModDate   RowNumber
1234       1       1      AB       1995-04-01    1995-04-01    2013-11-09   1
1234       1       1      CD       1998-08-31    1998-08-31    1998-08-31   1
1234       1       1      AB       2003-08-31    1995-04-01    2013-11-09   2
1234       1       NULL   AB       2008-11-08    1995-04-01    2013-11-09   3
1234       1       2      AB       2013-11-09    1995-04-01    2013-11-09   4
1234       1       1      EF       2013-11-18    2013-11-18    2013-11-18   1
...

Expected output:

ID         Status  Code   Type     ModDate       MinModDate    MaxModDate   RowNumber
1234       1       1      AB       1995-04-01    1995-04-01    2013-11-09   1
1234       1       1      CD       1998-08-31    1998-08-31    1998-08-31   1
1234       1       1      AB       2003-08-31    1995-04-01    2013-11-09   1
1234       1       NULL   AB       2008-11-08    1995-04-01    2013-11-09   2
1234       1       2      AB       2013-11-09    1995-04-01    2013-11-09   3
1234       1       1      EF       2013-11-18    2013-11-18    2013-11-18   1
...

Can this be done easily without using a cursor?

3 answers:

Answer 0 (score: 1)

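Partitioning the data is the right idea; you only need to partition by Type, since that is the only change of interest. You also need to add ROW_NUMBER() so that you can filter for the rows you want. Here is an updated query.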

;WITH cte AS
(
    SELECT  ID, [Status], Code, [Type], ModDate
            ,rn = ROW_NUMBER() OVER (PARTITION BY ModDate ORDER BY ModDate)
    FROM    #data
)
SELECT  ID, [Status], Code, [Type], ModDate
FROM    cte
WHERE   rn = 1
ORDER   BY ModDate, [Type]
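
A ROW_NUMBER() over a fixed partition cannot restart when Type returns to an earlier value (AB, CD, then AB again), which is what the expected output in the question asks for. One common way to get that restart is the "gaps and islands" trick of subtracting two row numbers. The following is a minimal sketch under assumptions, not the answerer's query: the table variable @data, its column types, and the names runs, numbered and grp are all hypothetical, and the rows are the six sample rows from the question.

```sql
-- Sample rows from the question; the column types are assumed.
DECLARE @data TABLE (ID int, [Status] int, Code int NULL, [Type] char(2), ModDate date);
INSERT INTO @data VALUES
    (1234, 1, 1,    'AB', '1995-04-01'),
    (1234, 1, 1,    'CD', '1998-08-31'),
    (1234, 1, 1,    'AB', '2003-08-31'),
    (1234, 1, NULL, 'AB', '2008-11-08'),
    (1234, 1, 2,    'AB', '2013-11-09'),
    (1234, 1, 1,    'EF', '2013-11-18');

WITH runs AS
(
    SELECT ID, [Status], Code, [Type], ModDate,
           -- Within one ID, the difference between the overall row number and
           -- the per-Type row number stays constant across a consecutive run
           -- of the same Type, so (ID, Type, grp) identifies each run.
           grp = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ModDate)
               - ROW_NUMBER() OVER (PARTITION BY ID, [Type] ORDER BY ModDate)
    FROM @data
),
numbered AS
(
    SELECT ID, [Status], Code, [Type], ModDate,
           -- Row number that restarts at 1 at the start of every run.
           RowNumber = ROW_NUMBER() OVER (PARTITION BY ID, [Type], grp ORDER BY ModDate)
    FROM runs
)
SELECT ID, [Status], Code, [Type], ModDate
FROM numbered
WHERE RowNumber = 1          -- keep only the first row of each run
ORDER BY ID, ModDate;
```

On the sample rows, RowNumber comes out as 1, 1, 1, 2, 3, 1 in ModDate order, and the final filter keeps the rows dated 1995-04-01, 1998-08-31, 2003-08-31 and 2013-11-18.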

Answer 1 (score: 1)

Since you are on SQL Server 2012, this should work:

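SELECT ID, Status, Code, Type, ModDate
FROM
(
    SELECT
        ID, Status, Code, Type, ModDate,
        -- Type of the previous row for the same ID in ModDate order (NULL on the first row)
        LAG(Type, 1) OVER (PARTITION BY ID ORDER BY ModDate) AS PrevType
    FROM Data
) t
-- Keep a row only when its Type differs from the previous row's Type
WHERE Type <> ISNULL(PrevType, '')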
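
When the table holds more than one ID, it matters whether LAG looks across ID boundaries. The sketch below illustrates this with a hypothetical second ID (5678, not in the question) whose first row has the same Type as the last row of 1234. The table variable @data and its column types are assumptions, and Type is assumed to be non-NULL.

```sql
-- Two rows from the question plus a hypothetical second ID.
DECLARE @data TABLE (ID int, [Type] char(2), ModDate date);
INSERT INTO @data VALUES
    (1234, 'AB', '2013-11-09'),
    (1234, 'EF', '2013-11-18'),
    (5678, 'EF', '2014-01-01'),  -- hypothetical: same Type as the previous ID's last row
    (5678, 'EF', '2014-01-02');

SELECT ID, [Type], ModDate
FROM
(
    SELECT ID, [Type], ModDate,
           -- PARTITION BY ID makes LAG return NULL on the first row of every ID,
           -- so that row is never compared against another ID's last row.
           PrevType = LAG([Type]) OVER (PARTITION BY ID ORDER BY ModDate)
    FROM @data
) t
WHERE PrevType IS NULL OR [Type] <> PrevType
ORDER BY ID, ModDate;
```

With the partition in place, the first row of 5678 is kept. With a single window ordered by ID, ModDate, that row would see EF as its previous Type and be filtered out.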

Answer 2 (score: 0)

If I understand correctly, you just need to wrap your original SQL:

SELECT ID, Status, Code, Type, ModDate FROM 
(
SELECT
    ID, Status, Code, Type, ModDate,
    MIN(ModDate) OVER (PARTITION BY ID, Type) MinModDate,
    MAX(ModDate) OVER (PARTITION BY ID, Type) MaxModDate,
    ROW_NUMBER() OVER (PARTITION BY ID, Type ORDER BY ModDate) RowNumber
FROM Data
) t
WHERE ModDate=MinModDate