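Remove rows from a table based on column values using a self-join

Asked: 2018-06-19 19:31:12

Tags: sql sql-server

I have a requirement where my data looks like the table below. I need to find the ids in the table whose last pid status is not "removed".

Note:
1. To get the last pid status, use the "date" and "hour" columns.
2. If the last "status" value of a pid for an "id" is "removed", do not include that row in the result.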

id  |   key    |    date     |  hour    | pid  | status
--------------------------------------------------------
id1 |   one    |    20180618 |  2       |  p1  | added
id1 |   one    |    20180618 |  3       |  p1  | removed
id1 |   one    |    20180618 |  4       |  p1  | added
id1 |   one    |    20180618 |  4       |  p2  | added

id1 |   one    |    20180619 |  2       |  p1  | removed
id1 |   one    |    20180619 |  4       |  p1  | added
id1 |   one    |    20180619 |  4       |  p2  | removed
id1 |   one    |    20180619 |  5       |  p3  | added

id2 |   one    |    20180619 |  5       |  p1  | added
id2 |   one    |    20180619 |  5       |  p2  | added
id2 |   one    |    20180619 |  6       |  p1  | removed

Expected output:

id  |   key    |    date     |  hour    | pid  | status
--------------------------------------------------------
id1 |   one    |    20180619 |  4       |  p1  | added
id1 |   one    |    20180619 |  5       |  p3  | added
id2 |   one    |    20180619 |  5       |  p2  | added

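I don't want to delete data from the source table. I want to query the source table using a self-join to produce the result above.

3 Answers:

Answer 0: (score: 1)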

The last_value window function lets you do this without a join.
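A minimal sketch of the last_value approach, not necessarily the answerer's exact query. It assumes the data sits in a table named YourTable (a placeholder name), that [date] sorts chronologically (a date type or a yyyymmdd string), and that no two rows for the same id and pid share the same date and hour.

```sql
-- Sketch only: YourTable is a placeholder name, and [date] is assumed to sort chronologically.
with LastStatusCTE as
(
    select
        T.id, T.[key], T.[date], T.[hour], T.pid, T.[status],
        -- The explicit frame matters: the default frame ends at the current row,
        -- so without it last_value would simply return the current row's own value.
        last_status = last_value(T.[status]) over (
            partition by T.id, T.pid order by T.[date], T.[hour]
            rows between unbounded preceding and unbounded following),
        last_date = last_value(T.[date]) over (
            partition by T.id, T.pid order by T.[date], T.[hour]
            rows between unbounded preceding and unbounded following),
        last_hour = last_value(T.[hour]) over (
            partition by T.id, T.pid order by T.[date], T.[hour]
            rows between unbounded preceding and unbounded following)
    from
        YourTable T
)
select
    L.id, L.[key], L.[date], L.[hour], L.pid, L.[status]
from
    LastStatusCTE L
where
    L.last_status <> 'removed' and   -- the latest status for this (id, pid) is not a removal
    L.[date] = L.last_date and       -- keep only the latest row itself
    L.[hour] = L.last_hour;
```

With the sample data from the question, this should keep the p1 and p3 rows for id1 and the p2 row for id2. If two rows for the same id and pid can share a date and hour, the ordering is ambiguous and a tie-breaker column would be needed.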

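Answer 1: (score: 1)

Use the row_number() function to identify the latest record for each combination of id and pid. It is then easy to select only the records with the desired status, for example: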

declare @SampleData table (id varchar(32), [key] varchar(32), [date] date, [hour] int, pid varchar(32), [status] varchar(32));
insert @SampleData values
    ('id1', 'one', '20180618', 2, 'p1', 'added'),
    ('id1', 'one', '20180618', 3, 'p1', 'removed'),
    ('id1', 'one', '20180618', 4, 'p1', 'added'),
    ('id1', 'one', '20180618', 4, 'p2', 'added'),
    ('id1', 'one', '20180619', 2, 'p1', 'removed'),
    ('id1', 'one', '20180619', 4, 'p1', 'added'),
    ('id1', 'one', '20180619', 4, 'p2', 'removed'),
    ('id1', 'one', '20180619', 5, 'p3', 'added'),
    ('id2', 'one', '20180619', 5, 'p1', 'added'),
    ('id2', 'one', '20180619', 5, 'p2', 'added'),
    ('id2', 'one', '20180619', 6, 'p1', 'removed');

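-- Number the rows within each (id, pid) group, newest first, so sequence = 1 is the latest status.
with OrderedDataCTE as
(
    select
        S.id, S.[key], S.[date], S.[hour], S.pid, S.[status],
        [sequence] = row_number() over (partition by S.id, S.pid order by S.[date] desc, S.[hour] desc)
    from
        @SampleData S
)
-- Keep only the latest row per (id, pid), and only when that row is not a removal.
select
    O.id, O.[key], O.[date], O.[hour], O.pid, O.[status]
from
    OrderedDataCTE O
where
    O.[sequence] = 1 and
    O.[status] != 'removed';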

Answer 2: (score: 1)

Since you asked for a solution that uses a self-join, here is one:

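-- Anti-join: r matches any 'removed' row for the same id and pid at or after t's timestamp.
SELECT t.*
FROM YourTable t
LEFT JOIN YourTable r
ON ( r.id = t.id AND r.pid = t.pid AND r.[status] = 'removed'
     AND dateadd(hour,r.hour,cast(r.date AS datetime)) >= dateadd(hour,t.hour,cast(t.date as datetime))
)
-- Keep only the rows of t that have no such later (or simultaneous) removal.
WHERE r.[status] IS NULL
ORDER BY t.id, t.pid, t.date, t.hour;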

However, I prefer the NOT EXISTS version:

SELECT *
FROM YourTable t
WHERE NOT EXISTS
(
    SELECT 1 
    FROM YourTable r
    WHERE r.id = t.id AND r.pid = t.pid AND r.[status] = 'removed'
     AND dateadd(hour,r.hour,cast(r.date AS datetime)) >= dateadd(hour,t.hour,cast(t.date as datetime))
)
ORDER BY t.id, t.pid, t.date, t.hour;

Both return:

id  key date       hour pid status
--- --- ---------- ---- --- ------
id1 one 2018-06-19    4 p1  added
id1 one 2018-06-19    5 p3  added
id2 one 2018-06-19    5 p2  added