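OK. If this has already been covered, let me apologize up front. I looked, but none of the existing solutions addressed the specifics of my problem.
I have a table of more than 160 million rows that tracks employee/server status over time. I want to create a subset of this data that removes the duplicates occurring along the way but keeps the sequence of changes as they happened. For most employees, this would reduce roughly 700 rows (and growing) to 1.
Here is a simplified example of what I'm after: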
Given:
RowID Employee Server Timestamp
----- -------- ------ ---------
5 E000001 Serv-B May01
4 E000001 Serv-A Apr01
3 E000001 Serv-B Mar01
2 E000001 Serv-A Feb01
1 E000001 Serv-A Jan01
Doing a "Min(Timestamp) Group By Employee, Server" would yield:
Employee Server Timestamp
-------- ------ ---------
E000001 Serv-B Mar01
E000001 Serv-A Jan01
What I need is:
Employee Server Timestamp
-------- ------ ---------
E000001 Serv-B May01
E000001 Serv-A Apr01
E000001 Serv-B Mar01
E000001 Serv-A Jan01
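The table and the process that feeds it do not belong to our group, so I can't influence a solution there, and I'd rather not be stuck keeping a copy of the whole thing. Given the size of the table, a cursor/RBAR approach is not realistic. If backed into a corner, I could write an application to do this, but I'm wondering whether any of the gods on SQoLympus have some wisdom for doing it within a stored procedure. Thanks in advance!
Edit: This is SQL Server 2008. Sorry for not mentioning it.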
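For reference, the application-side fallback mentioned above could look roughly like the sketch below. It is only an illustration of the intended semantics, not part of the original question: the `StatusRow` type and the `first_of_each_run` function are hypothetical names, timestamps are kept as plain text, and the rows are assumed to arrive already sorted by employee and then by timestamp.

```python
from typing import Iterable, Iterator, NamedTuple


class StatusRow(NamedTuple):
    # Hypothetical row shape mirroring the example table in the question.
    row_id: int
    employee: str
    server: str
    timestamp: str  # kept as text here; a real table would hold a datetime


def first_of_each_run(rows: Iterable[StatusRow]) -> Iterator[StatusRow]:
    """Yield the first row of each consecutive run of the same server per employee.

    Assumes rows are sorted by (employee, timestamp) ascending.
    """
    previous_key = None
    for row in rows:
        key = (row.employee, row.server)
        if key != previous_key:
            # The employee or the server changed, so this row starts a new run.
            yield row
        previous_key = key


rows = [
    StatusRow(1, "E000001", "Serv-A", "Jan01"),
    StatusRow(2, "E000001", "Serv-A", "Feb01"),
    StatusRow(3, "E000001", "Serv-B", "Mar01"),
    StatusRow(4, "E000001", "Serv-A", "Apr01"),
    StatusRow(5, "E000001", "Serv-B", "May01"),
]
kept = list(first_of_each_run(rows))
kept_ids = [r.row_id for r in kept]
```

On the sample data this keeps rows 1, 3, 4 and 5 (Jan01, Mar01, Apr01, May01), which is the desired result. Because it only remembers the previous row, it streams in a single pass, but it still has to pull every row out of the database, which is the cost the question hopes to avoid.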
Answer 0 (score: 1)
If this is SQL Server (and assuming I have understood your requirement correctly):
/*Set up test table*/
DECLARE @T TABLE (
RowID INT,
Employee CHAR(7),
[Server] CHAR(6),
[timestamp] DATETIME );
INSERT INTO @T
SELECT 5,'E000001','Serv-B', '20010501' UNION ALL
SELECT 4,'E000001','Serv-A', '20010401' UNION ALL
SELECT 3,'E000001','Serv-B', '20010301' UNION ALL
SELECT 2,'E000001','Serv-A', '20010201' UNION ALL
SELECT 1,'E000001','Serv-A', '20010101';
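/* Gaps-and-islands: the difference between the two row numbers is constant
   within each consecutive run of the same server for an employee */
WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY RowID) -
                ROW_NUMBER() OVER (PARTITION BY Employee, [Server]
                                   ORDER BY RowID) AS Grp,
                *
         FROM   @T),
     cte2
     AS (SELECT *,
                /* Grp can repeat across servers, so [Server] is part of the key */
                ROW_NUMBER() OVER (PARTITION BY Employee, [Server], Grp
                                   ORDER BY RowID) AS Rn
         FROM   cte)
/* Edit: Actually - You want a SELECT not a DELETE I think?
DELETE FROM cte2 WHERE Rn > 1*/
SELECT RowID, Employee, [Server], [timestamp]
FROM   cte2
WHERE  Rn = 1
ORDER  BY RowID DESC;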
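To see why the row-number difference identifies runs, here is a small Python sketch that mimics the two ROW_NUMBER() calculations by hand. It is an illustration only: the function names and the three-row A, B, A sequence are made up for this example. It also shows that the difference on its own is not unique across servers, so each run is keyed on employee, server and Grp together.

```python
from collections import defaultdict

# Rows are (RowID, Employee, Server). The first list is the question's sample;
# the second is a hypothetical A, B, A sequence added for illustration.
question_rows = [
    (1, "E000001", "Serv-A"),
    (2, "E000001", "Serv-A"),
    (3, "E000001", "Serv-B"),
    (4, "E000001", "Serv-A"),
    (5, "E000001", "Serv-B"),
]
aba_rows = [(1, "E000001", "Serv-A"), (2, "E000001", "Serv-B"), (3, "E000001", "Serv-A")]


def run_groups(rows):
    """Map each RowID to (employee, server, grp), where grp is the difference
    between the per-employee and the per-employee-and-server row numbers."""
    per_employee = defaultdict(int)
    per_employee_server = defaultdict(int)
    groups = {}
    for row_id, employee, server in sorted(rows):  # ascending RowID
        per_employee[employee] += 1
        per_employee_server[(employee, server)] += 1
        grp = per_employee[employee] - per_employee_server[(employee, server)]
        groups[row_id] = (employee, server, grp)
    return groups


def first_row_per_run(rows):
    """Return the lowest RowID of every (employee, server, grp) run."""
    first = {}
    for row_id, key in sorted(run_groups(rows).items()):
        first.setdefault(key, row_id)
    return sorted(first.values())


def first_row_ignoring_server(rows):
    """Same idea, but keyed on (employee, grp) only, to show the collision."""
    first = {}
    for row_id, (employee, _server, grp) in sorted(run_groups(rows).items()):
        first.setdefault((employee, grp), row_id)
    return sorted(first.values())
```

For the question's sample, `first_row_per_run(question_rows)` keeps rows 1, 3, 4 and 5. In the A, B, A sequence, rows 2 and 3 both get a Grp of 1 even though they are on different servers, so `first_row_ignoring_server(aba_rows)` loses row 3 while `first_row_per_run(aba_rows)` keeps all three.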
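Answer 1 (score: 0)
You didn't say which database you are using, but if this is Oracle, you can use the lag or lead analytic functions to refer to the previous or next row.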
-- your_table is a placeholder for the real table name
select employee, server, timestamp
from
  (select employee, server, timestamp,
          -- previous server for the same employee, in time order
          lag(server) over (partition by employee order by timestamp) prev_server
   from your_table
  )
-- prev_server is null on each employee's first row, which must be kept
where prev_server is null or server <> prev_server
order by employee, timestamp