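I have a table that contains date-based user actions. The table serves as a timeline of events. The following example shows how two people changed job roles over time: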
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');
SELECT * FROM @tbl ORDER BY UserID, ActionDate DESC;
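This displays the following, with the most recent events first:
I need to display the table in reverse date order, but with every event removed that immediately follows an event with the same [UserID / ActionID] combination. For example, if a person was promoted and then promoted again right afterwards, the second promotion would not be included in the results, because it is considered a duplicate of the preceding action.
So the required output is:
After some research, I tried using ROW_NUMBER() to identify the duplicates: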
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY UserID, ActionID ORDER BY ActionDate ASC) AS RowNum
FROM
@tbl
ORDER BY
UserID, ActionDate DESC;
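...but it doesn't quite work, because the numbering does not reset after each different action. I may not be thinking about this the right way, but I'm looking for inspiration, because searching returns countless questions where people simply want to remove duplicates from a list.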
Answer 0 (score: 4)
I would use LEAD to eliminate the unwanted rows.
USE tempdb;
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');
;WITH src AS
(
SELECT *
, l = LEAD(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate DESC)
FROM @tbl t
)
SELECT src.UserID
, src.ActionID
, src.ActionDesc
, src.ActionDate
FROM src
WHERE src.l <> src.ActionID
OR src.l IS NULL
ORDER BY src.UserID, src.ActionDate DESC;
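The WHERE clause in the above query eliminates rows from the output where the preceding event for the same user has the same ActionID as the current row. The OR src.l IS NULL predicate ensures we still see each user's earliest row, which has no preceding event to compare against.
Results: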
╔════════╦══════════╦════════════╦═════════════════════════╗
║ UserID ║ ActionID ║ ActionDesc ║ ActionDate              ║
╠════════╬══════════╬════════════╬═════════════════════════╣
║ 1      ║ 200      ║ Promoted   ║ 2005-03-01 00:00:00.000 ║
║ 1      ║ 300      ║ Moved      ║ 2004-03-01 00:00:00.000 ║
║ 1      ║ 200      ║ Promoted   ║ 2000-01-01 00:00:00.000 ║
║ 2      ║ 200      ║ Promoted   ║ 2008-01-01 00:00:00.000 ║
║ 2      ║ 300      ║ Moved      ║ 2007-01-01 00:00:00.000 ║
║ 2      ║ 200      ║ Promoted   ║ 2006-01-01 00:00:00.000 ║
╚════════╩══════════╩════════════╩═════════════════════════╝
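To see why those six rows survive, it may help to look at the value LEAD produces for every row before any filtering. The following is only an illustrative sketch on the same sample data; the column names PrevActionID and Kept are made up for this example and do not appear in the query above.

```sql
-- Sketch: expose the intermediate LEAD value and the keep/drop decision.
-- PrevActionID and Kept are hypothetical names used only for illustration.
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01'),
    (2, 200, 'Promoted', '2008-01-01');

WITH src AS
(
    SELECT t.UserID
        , t.ActionID
        , t.ActionDate
        -- With a descending sort, the "next" row is the chronologically earlier event.
        , PrevActionID = LEAD(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate DESC)
    FROM @tbl t
)
SELECT src.UserID
    , src.ActionID
    , src.ActionDate
    , src.PrevActionID
    -- 0 = would be filtered out as a repeat; 1 = would be kept.
    -- A NULL PrevActionID (earliest row per user) falls through to ELSE and is kept.
    , Kept = CASE WHEN src.PrevActionID = src.ActionID THEN 0 ELSE 1 END
FROM src
ORDER BY src.UserID, src.ActionDate DESC;
```

For user 1, the 2006-03-01 promotion gets PrevActionID = 200 and is marked 0, while the 2005-03-01 promotion gets PrevActionID = 300 (the move) and is marked 1. The 2000-01-01 row has a NULL PrevActionID and is also marked 1.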
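For tables with a large number of rows, you want to keep the number of aggregates used in the query to a minimum; LEAD provides this functionality with only a single aggregate. The execution plan for my version: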
Answer 1 (score: 2)
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'), --<<--- here ActionID is 300
(2, 200, 'Promoted', '2008-01-01');
select UserID, ActionID, ActionDesc, min(ActionDate) as dt
from (
select t.*
, row_number() over(partition by UserID, ActionID order by ActionDate)
- row_number() over(partition by UserID order by ActionDate) as grp_id
from @tbl t
) v
group by grp_id, UserID, ActionID, ActionDesc
order by UserID, min(ActionDate) desc;
This gives you the result you want, but only if the ActionID for Moved is 300; otherwise, you should partition by ActionDesc instead of ActionID.
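The grouping trick in this query relies on the difference between two row numbers staying constant within a run of consecutive identical actions. The sketch below shows both row numbers and their difference side by side, using the same sample data; the column names rn_action and rn_user are assumptions introduced for illustration only.

```sql
-- Sketch: show the two ROW_NUMBER values and their difference (grp_id) per row.
-- rn_action and rn_user are hypothetical names used only for illustration.
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01'),
    (2, 200, 'Promoted', '2008-01-01');

select t.UserID
    , t.ActionID
    , t.ActionDate
    -- position of the row among the same user's rows with the same action
    , row_number() over(partition by t.UserID, t.ActionID order by t.ActionDate) as rn_action
    -- position of the row among all of the same user's rows
    , row_number() over(partition by t.UserID order by t.ActionDate) as rn_user
    -- the difference only changes when a different action interrupts the run
    , row_number() over(partition by t.UserID, t.ActionID order by t.ActionDate)
    - row_number() over(partition by t.UserID order by t.ActionDate) as grp_id
from @tbl t
order by t.UserID, t.ActionDate;
```

For user 1, the first three promotions all get grp_id 0, the move gets -3, and the last two promotions both get -1. Grouping by UserID, ActionID and grp_id therefore yields one row per run, and min(ActionDate) picks the first event of each run.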
Answer 2 (score: 2)
SELECT * FROM
(SELECT *, ROW_NUMBER() over (partition by Q2.userid, Q2.ActionId, rn2 order by Q2.actiondate) rn3 FROM
(select *, Q1.rn - ROW_NUMBER() over (partition by Q1.userid, Q1.actionid order by Q1.actiondate) rn2 from
(SELECT *,ROW_NUMBER() over (order by userid, actiondate) rn from @tbl) Q1
) Q2
)
Q3 Where q3.rn3 = 1 ORDER BY Q3.UserID,Q3.ActionDate
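The first (innermost) query assigns a row number to every row, ordered by userid and actiondate. I then compute a second row number the same way, but partitioned by userid and actionid. Subtracting the second from the first gives a number that is constant within each consecutive run of the same user and action. By applying one more ROW_NUMBER, partitioned by userid, actionid and that difference and ordered by date, I can select the earliest date in each run as row 1.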
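The three steps described above can be easier to follow when each one is a named CTE. This is a sketch of the same logic rather than a different technique; it reuses the names Q1, Q2, Q3, rn, rn2 and rn3 from the query above. Sorting the final result by ActionDate DESC is an assumption made here so that the output follows the newest-first order the question asks for.

```sql
-- Sketch: the nested query restated as CTEs, one per step.
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01'),
    (2, 200, 'Promoted', '2008-01-01');

WITH Q1 AS
(
    -- Step 1: one running number over all rows, by user and then date.
    SELECT t.*, ROW_NUMBER() OVER (ORDER BY t.UserID, t.ActionDate) AS rn
    FROM @tbl t
),
Q2 AS
(
    -- Step 2: subtract the per-user, per-action number; the result (rn2)
    -- is constant within each consecutive run of the same action.
    SELECT Q1.*
        , Q1.rn - ROW_NUMBER() OVER (PARTITION BY Q1.UserID, Q1.ActionID ORDER BY Q1.ActionDate) AS rn2
    FROM Q1
),
Q3 AS
(
    -- Step 3: number the rows inside each run, earliest first.
    SELECT Q2.*
        , ROW_NUMBER() OVER (PARTITION BY Q2.UserID, Q2.ActionID, Q2.rn2 ORDER BY Q2.ActionDate) AS rn3
    FROM Q2
)
SELECT Q3.UserID, Q3.ActionID, Q3.ActionDesc, Q3.ActionDate
FROM Q3
WHERE Q3.rn3 = 1          -- keep only the first event of each run
ORDER BY Q3.UserID, Q3.ActionDate DESC;  -- assumption: newest first, as in the question
```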