Select specific rows based on a simple rule

Time: 2018-10-19 14:44:24

Tags: sql-server sql-server-2012

I have the following sample data:

Id  UserId  EventDate   EventId EventTime               Fg  RN
1   1       2018-10-15  6       2018-10-15 12:10:10.000 0   8
2   1       2018-10-15  6       2018-10-15 12:10:11.000 0   7
3   1       2018-10-15  6       2018-10-15 12:10:12.000 1   6
4   1       2018-10-15  6       2018-10-15 12:10:13.000 1   5
5   1       2018-10-15  6       2018-10-15 12:10:15.000 0   4
6   1       2018-10-15  6       2018-10-15 12:10:17.000 0   3
7   1       2018-10-15  6       2018-10-15 12:10:20.000 1   2
8   1       2018-10-15  6       2018-10-15 12:10:25.000 1   1
9   1       2018-10-16  8       2018-10-16 12:12:33.000 0   3
10  1       2018-10-16  8       2018-10-16 12:12:43.000 0   2
11  1       2018-10-16  8       2018-10-16 12:12:47.000 1   1
12  1       2018-10-17  9       2018-10-17 12:15:10.000 0   4
13  1       2018-10-17  9       2018-10-17 12:15:15.000 0   3
14  1       2018-10-17  9       2018-10-17 12:15:18.000 1   2
15  1       2018-10-17  9       2018-10-17 12:15:25.000 1   1

I want to select the following rows:

Id  UserId  EventDate   EventId EventTime               Fg  RN
7   1       2018-10-15  6       2018-10-15 12:10:20.000 1   2
11  1       2018-10-16  8       2018-10-16 12:12:47.000 1   1
14  1       2018-10-17  9       2018-10-17 12:15:18.000 1   2

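Each of these rows is the row that comes right after the first row with Fg = 0 when scanning backwards from the end. The end is marked by RN = 1.

To clarify further, the RN column is the sequence number of events per UserId per day, counted from the end.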
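For readers who want to reproduce RN, here is a minimal sketch of how such a column could be derived. It assumes the ordering is by EventTime, which matches the sample data but is not stated explicitly; the alias RN_check is hypothetical and used only for comparison.

```sql
-- Sketch only: assumes RN counts events per UserId per day, latest first.
SELECT Id, UserId, EventDate, EventTime, RN,
       ROW_NUMBER() OVER (PARTITION BY UserId, EventDate
                          ORDER BY EventTime DESC) AS RN_check  -- hypothetical alias
FROM dbo.test
ORDER BY Id;
```

On the sample rows, RN_check should equal RN for every row.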

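The solution I am currently using is a recursive CTE that walks backwards, but it is rather slow for the amount of data I have.

What are the alternatives?

Here is the DDL: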

CREATE TABLE dbo.test (Id INT IDENTITY(1,1), UserId INT, EventDate DATE, EventId INT, EventTime DATETIME NOT NULL, Fg BIT, RN BIGINT)
GO

INSERT INTO dbo.test (UserId, EventDate, EventTime, Fg, RN, EventId)
VALUES
     (1, '20181015','20181015 12:10:10', 0, 8, 6)
    ,(1, '20181015','20181015 12:10:11', 0, 7, 6)
    ,(1, '20181015','20181015 12:10:12', 1, 6, 6)
    ,(1, '20181015','20181015 12:10:13', 1, 5, 6)

    ,(1, '20181015','20181015 12:10:15', 0, 4, 6)
    ,(1, '20181015','20181015 12:10:17', 0, 3, 6)
    ,(1, '20181015','20181015 12:10:20', 1, 2, 6)
    ,(1, '20181015','20181015 12:10:25', 1, 1, 6)

    ,(1, '20181016','20181016 12:12:33', 0, 3, 8)
    ,(1, '20181016','20181016 12:12:43', 0, 2, 8)
    ,(1, '20181016','20181016 12:12:47', 1, 1, 8)

    ,(1, '20181017','20181017 12:15:10', 0, 4, 9)
    ,(1, '20181017','20181017 12:15:15', 0, 3, 9)
    ,(1, '20181017','20181017 12:15:18', 1, 2, 9)
    ,(1, '20181017','20181017 12:15:25', 1, 1, 9)
GO

1 Answer:

Answer 0: (score: 3)

You can use window functions here to help identify the right records:

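SELECT *
FROM
    (
        -- Number the rows within each (EventId, lagcheck) group, latest row first (RN = 1 is the end)
        SELECT *, ROW_NUMBER() OVER (PARTITION BY EventId, lagcheck ORDER BY RN) AS lagCheckRow
        FROM
            (
                -- Fetch the Fg of the chronologically previous row (RN DESC = oldest first)
                SELECT test.*, LAG(Fg) OVER (PARTITION BY EventId ORDER BY RN DESC) AS lagcheck
                FROM test
            ) t1
    ) t2
-- Keep the latest row whose previous row has Fg = 0
WHERE lagCheckRow = 1 AND lagcheck = 0;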


+----+--------+---------------------+---------+---------------------+------+----+----------+-------------+
| Id | UserId |      EventDate      | EventId |      EventTime      |  Fg  | RN | lagcheck | lagCheckRow |
+----+--------+---------------------+---------+---------------------+------+----+----------+-------------+
|  7 |      1 | 15.10.2018 00:00:00 |       6 | 15.10.2018 12:10:20 | True |  2 | 0        |           1 |
| 11 |      1 | 16.10.2018 00:00:00 |       8 | 16.10.2018 12:12:47 | True |  1 | 0        |           1 |
| 14 |      1 | 17.10.2018 00:00:00 |       9 | 17.10.2018 12:15:18 | True |  2 | 0        |           1 |
+----+--------+---------------------+---------+---------------------+------+----+----------+-------------+

Rextester.com example

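The innermost query uses LAG() to fetch the Fg of the previous record. We need a 0 there (filtered for in the outermost query). Then we apply ROW_NUMBER() to number the rows within each LAG() result, ordered by RN. We keep the row where that number is 1.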
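To make the LAG() step concrete, the sketch below runs only the innermost query for EventId = 6. The WHERE filter and the trimmed column list are added purely for illustration and are not part of the answer's query.

```sql
-- Illustration only: inner LAG() step for a single EventId from the sample data.
SELECT Id, Fg, RN,
       LAG(Fg) OVER (PARTITION BY EventId ORDER BY RN DESC) AS lagcheck
FROM dbo.test
WHERE EventId = 6
ORDER BY RN DESC;

-- Id  Fg  RN  lagcheck
-- 1   0   8   NULL
-- 2   0   7   0
-- 3   1   6   0
-- 4   1   5   1
-- 5   0   4   1
-- 6   0   3   0
-- 7   1   2   0
-- 8   1   1   1
```

Among the rows with lagcheck = 0 (Ids 2, 3, 6 and 7), the one with the smallest RN is Id 7, which is the row the outer query keeps for this EventId.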

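This should be considerably faster than a recursive CTE, since it only takes roughly four passes over the data (the initial result set, the first window function, the second window function, and the final WHERE predicate).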

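Quick note: if this table holds more than one UserId (which it surely does) and you want to run this for more than one of them, add UserId to each PARTITION BY to make sure you don't get wonky answers.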
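A minimal sketch of that adjustment follows, assuming UserId is the only extra grouping column needed. The table aliases t, t1 and t2 and the explicit column list are illustrative choices, not part of the original query.

```sql
-- Sketch: same approach, with UserId added to both PARTITION BY clauses.
SELECT Id, UserId, EventDate, EventId, EventTime, Fg, RN
FROM
    (
        SELECT t1.*,
               ROW_NUMBER() OVER (PARTITION BY UserId, EventId, lagcheck
                                  ORDER BY RN) AS lagCheckRow
        FROM
            (
                SELECT t.*,
                       LAG(Fg) OVER (PARTITION BY UserId, EventId
                                     ORDER BY RN DESC) AS lagcheck
                FROM dbo.test AS t
            ) AS t1
    ) AS t2
WHERE lagCheckRow = 1 AND lagcheck = 0;
```

With the single-user sample data this should return the same three rows (Ids 7, 11 and 14). If an EventId can span several days for one user, EventDate may need to join the partition as well, since RN restarts each day.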