I have a table that tracks machine maintenance performed at arbitrary points in time. Here is a simplified table structure:
Maintenance Table
----------------------------------------
ID - integer
DateCompleted - date
MachineName - varchar
Here is some sample table data:
ID DateCompleted MachineName
----------------------------------------
1 1/6/2011 'Machine 1'
2 1/13/2011 'Machine 2'
3 1/14/2011 'Machine 1'
4 2/2/2011 'Machine 3'
5 2/26/2011 'Machine 1'
6 3/9/2011 'Machine 2'
7 4/20/2011 'Machine 3'
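What I want is a query that returns, for each task, the date of the previous maintenance task on the same machine. The result set would look like this: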
ID MachineName CurDate PrevDate
----------------------------------------
1 'Machine 1' 1/6/2011 NULL
2 'Machine 2' 1/13/2011 NULL
3 'Machine 1' 1/14/2011 1/6/2011
4 'Machine 3' 2/2/2011 NULL
5 'Machine 1' 2/26/2011 1/14/2011
6 'Machine 2' 3/9/2011 1/13/2011
7 'Machine 3' 4/20/2011 2/2/2011
What is the best way to write this kind of query? The only idea I have so far is this:
SELECT ID, MachineName, DateCompleted AS CurDate,
(
SELECT TOP 1 DateCompleted FROM Maintenance m2
WHERE m1.MachineName = m2.MachineName
AND m1.DateCompleted > m2.DateCompleted
ORDER BY DateCompleted DESC
) AS PrevDate
FROM Maintenance m1
ORDER BY ID
Any ideas, suggestions, or corrections are very welcome.
Answer 0 (score: 1)
How about this:
SELECT
m.ID, m.MachineName, m.DateCompleted AS CurDate, MAX(m_past.DateCompleted) AS PrevDate
FROM Maintenance m
LEFT JOIN Maintenance m_past
ON m.MachineName = m_past.MachineName
-- Keep the date filter in ON: in WHERE it would drop tasks that have no earlier task
AND m_past.DateCompleted < m.DateCompleted
GROUP BY m.ID, m.MachineName, m.DateCompleted
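One thing to watch with the self-join approach is where the date filter goes. A filter on the outer-joined table placed in WHERE removes the rows that have no earlier task, while the same filter in ON keeps them with a NULL PrevDate. Here is a minimal sketch that shows the difference. It assumes SQLite (through Python's sqlite3) as a stand-in for the real DBMS, and it stores the sample dates as ISO strings so that they compare correctly; neither is part of the original question.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's DBMS; dates are ISO strings.
SAMPLE = [
    (1, "2011-01-06", "Machine 1"),
    (2, "2011-01-13", "Machine 2"),
    (3, "2011-01-14", "Machine 1"),
    (4, "2011-02-02", "Machine 3"),
    (5, "2011-02-26", "Machine 1"),
    (6, "2011-03-09", "Machine 2"),
    (7, "2011-04-20", "Machine 3"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Maintenance (ID INTEGER, DateCompleted TEXT, MachineName TEXT)"
)
conn.executemany("INSERT INTO Maintenance VALUES (?, ?, ?)", SAMPLE)

# Shared head and tail of the query; only the placement of the date filter differs.
HEAD = """
SELECT m.ID, m.MachineName, m.DateCompleted AS CurDate,
       MAX(m_past.DateCompleted) AS PrevDate
FROM Maintenance m
LEFT JOIN Maintenance m_past
       ON m.MachineName = m_past.MachineName
"""
TAIL = " GROUP BY m.ID, m.MachineName, m.DateCompleted ORDER BY m.ID"
DATE_FILTER = "m_past.DateCompleted < m.DateCompleted"

filter_in_on = conn.execute(HEAD + " AND " + DATE_FILTER + TAIL).fetchall()
filter_in_where = conn.execute(HEAD + " WHERE " + DATE_FILTER + TAIL).fetchall()

print("filter in ON:   ", [row[0] for row in filter_in_on])     # all seven IDs
print("filter in WHERE:", [row[0] for row in filter_in_where])  # only IDs with an earlier task
```

With the sample data, the ON version returns all seven tasks, and the WHERE version returns only the four that have a predecessor.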
Answer 1 (score: 1)
Try this:
SELECT A.Id, A.MachineName, A.DateCompleted [CurDate], B.DateCompleted PrevDate
FROM Maintenance A
OUTER APPLY (SELECT TOP 1 *
FROM Maintenance
WHERE MachineName = A.MachineName AND DateCompleted < A.DateCompleted
ORDER BY DateCompleted DESC) B
Answer 2 (score: 1)
Whether TOP n works depends on your DBMS; MAX() works across platforms. Index DateCompleted and MachineName, since both are used in the WHERE clause.
select m1.id, m1.machinename, m1.datecompleted as curdate,
( select max(datecompleted)
from maintenance
where machinename = m1.machinename
and datecompleted < m1.datecompleted ) as prevdate
from maintenance m1
order by machinename, curdate
If your DBMS supports window functions, you can use
select m1.id, m1.machinename, m1.datecompleted as curdate,
max(datecompleted) over (partition by machinename
order by m1.datecompleted
rows between unbounded preceding
and 1 preceding) as prevdate
from maintenance m1
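I wouldn't guess which one will be faster. I would load a table with the sample data you expect and test them, then reload it with 10 times the data and test again.
While testing, you will want to learn how to generate and read an execution plan.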
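As a rough illustration of that kind of test, here is a small harness that times the MAX() subquery before and after adding an index and then prints the plan. Everything specific in it is an assumption made for the sketch: SQLite via Python's sqlite3 instead of your DBMS, the synthetic data shape (50 machines with 40 tasks each), the hypothetical index name `ix_machine_date`, and the helper names `load` and `timed`. On your own system you would use its native plan tool and your real data volumes.

```python
import sqlite3
import time
from datetime import date, timedelta

# Assumption: SQLite stands in for the real DBMS; timings here are only indicative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Maintenance ("
    "ID INTEGER PRIMARY KEY, DateCompleted TEXT, MachineName TEXT)"
)

def load(conn, machines, tasks_per_machine):
    """Insert synthetic rows: one task per machine per week, on distinct days."""
    base = date(2011, 1, 1)
    rows = [
        ((base + timedelta(days=7 * j + m % 7)).isoformat(), f"Machine {m}")
        for m in range(machines)
        for j in range(tasks_per_machine)
    ]
    conn.executemany(
        "INSERT INTO Maintenance (DateCompleted, MachineName) VALUES (?, ?)", rows
    )

QUERY = """
SELECT m1.ID, m1.MachineName, m1.DateCompleted AS CurDate,
       (SELECT MAX(DateCompleted)
          FROM Maintenance
         WHERE MachineName = m1.MachineName
           AND DateCompleted < m1.DateCompleted) AS PrevDate
FROM Maintenance m1
"""

def timed(conn):
    """Run QUERY once and return (elapsed seconds, fetched rows)."""
    start = time.perf_counter()
    rows = conn.execute(QUERY).fetchall()
    return time.perf_counter() - start, rows

load(conn, machines=50, tasks_per_machine=40)
before, result_before = timed(conn)

# Hypothetical index name; the column order matches the subquery's predicates.
conn.execute("CREATE INDEX ix_machine_date ON Maintenance (MachineName, DateCompleted)")
after, result_after = timed(conn)

plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
print(f"without index: {before:.4f}s, with index: {after:.4f}s")
for step in plan:
    print(step)
```

To follow the 10x advice, call `load` again with larger arguments on a fresh table and compare how each candidate query scales.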
Answer 3 (score: 1)
Solution:
declare @tmp table (Id int, DateCompleted datetime, MachineName varchar(100))
insert into @tmp
select 1,'1/6/2011','Machine 1'
union select 2,'1/13/2011', 'Machine 2'
union select 3,'1/14/2011', 'Machine 1'
union select 4,'2/2/2011', 'Machine 3'
union select 5,'2/26/2011', 'Machine 1'
union select 6,'3/9/2011', 'Machine 2'
union select 7,'4/20/2011', 'Machine 3'
select t.Id, t.DateCompleted, t.MachineName, max(t2.DateCompleted) PrevDate
from @tmp t
left join @tmp t2
on t.MachineName = t2.MachineName
and t.DateCompleted > t2.DateCompleted
group by t.Id, t.DateCompleted, t.MachineName
Answer 4 (score: 1)
You said you would welcome "any" solution.
Here is an ANSI SQL solution:
SELECT ID,
DateCompleted,
MachineName,
lag(DateCompleted) over (partition by MachineName order by DateCompleted) as PrevDate
FROM Maintenance
ORDER BY id;
Works on PostgreSQL, Oracle, DB2, and Teradata.
SQL Server does not support the lag() function yet, but the upcoming "Denali" release (2012) will have it.
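If you want to try the lag() query without one of those servers at hand, the sketch below runs it against the sample data from the question. It assumes SQLite 3.25 or newer (the first version with window functions) through Python's sqlite3 module, which the answer above does not mention, and it stores the dates as ISO strings so that ORDER BY sorts them chronologically.

```python
import sqlite3

# Assumption: SQLite 3.25+ stands in for the DBMS; dates are ISO strings.
SAMPLE = [
    (1, "2011-01-06", "Machine 1"),
    (2, "2011-01-13", "Machine 2"),
    (3, "2011-01-14", "Machine 1"),
    (4, "2011-02-02", "Machine 3"),
    (5, "2011-02-26", "Machine 1"),
    (6, "2011-03-09", "Machine 2"),
    (7, "2011-04-20", "Machine 3"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Maintenance (ID INTEGER, DateCompleted TEXT, MachineName TEXT)"
)
conn.executemany("INSERT INTO Maintenance VALUES (?, ?, ?)", SAMPLE)

# LAG returns the value from the previous row of the same partition, or NULL
# when the current row is the first one for its machine.
result = conn.execute(
    """
    SELECT ID, MachineName, DateCompleted AS CurDate,
           LAG(DateCompleted) OVER (PARTITION BY MachineName
                                    ORDER BY DateCompleted) AS PrevDate
    FROM Maintenance
    ORDER BY ID
    """
).fetchall()

for row in result:
    print(row)
```

The rows come back in the same shape as the result set requested in the question, with None where a machine has no earlier task.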
Answer 5 (score: 1)
Starting with SQL Server 2012, you can write the query you want with a windowed aggregate. Just use the following code:
select
ID,
MachineName,
DateCompleted AS CurDate,
min(DateCompleted)
over (partition by MachineName order by DateCompleted
rows between 1 preceding and 1 preceding) as PrevDate
from Maintenance
order by Id
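Answer 6 (score: 0)
Your query looks reasonable to me and is easy to understand. Ignoring the possible cost of the final sort, I think the complexity is essentially O(n log n), assuming a suitable index exists. For each entry in the table, the query engine has to find the entry with the previous date, which should be O(log n) with the right index.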
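One way to possibly improve performance, at the cost of code complexity, is to write a stored procedure that produces the result. I think the unordered result can be produced in O(n). The procedure could walk two cursors over the table ordered by MachineName and then DateCompleted. As it walks the two cursors, it can build the result set in O(n). However, the result then needs to be sorted by ID, which is O(n log n). So I think the theoretical complexity is the same as the query, but the procedure might have less overhead and run faster. Still, I definitely would not recommend this solution, because it would be ugly and hard to maintain.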
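To make the single-pass idea concrete, here is a sketch in Python rather than a stored procedure. It is a simplification of what I described: instead of two cursors it keeps one cursor and remembers the previous row. The function name `previous_dates` and the tuple layout are made up for the illustration, and it assumes that no machine has two tasks on the same date (with ties, it would report the equal date rather than a strictly earlier one).

```python
from datetime import date

def previous_dates(rows):
    """Return (id, machine, cur_date, prev_date) tuples ordered by id.

    rows: iterable of (id, date_completed, machine_name) tuples.
    Assumption: a machine never has two tasks on the same date.
    """
    # Order by machine, then date. An index on (MachineName, DateCompleted)
    # could supply this order directly; here we pay for an explicit sort.
    ordered = sorted(rows, key=lambda r: (r[2], r[1]))

    result = []
    prev_machine, prev_date = None, None
    for row_id, completed, machine in ordered:  # the O(n) pass
        # The previous row only counts if it belongs to the same machine.
        prev = prev_date if machine == prev_machine else None
        result.append((row_id, machine, completed, prev))
        prev_machine, prev_date = machine, completed

    result.sort(key=lambda r: r[0])  # the final O(n log n) sort by ID
    return result

sample = [
    (1, date(2011, 1, 6), "Machine 1"),
    (2, date(2011, 1, 13), "Machine 2"),
    (3, date(2011, 1, 14), "Machine 1"),
    (4, date(2011, 2, 2), "Machine 3"),
    (5, date(2011, 2, 26), "Machine 1"),
    (6, date(2011, 3, 9), "Machine 2"),
    (7, date(2011, 4, 20), "Machine 3"),
]

output = previous_dates(sample)
for row in output:
    print(row)
```

The loop in the middle is the linear part; the two sorts are where the O(n log n) comes from, which is why I do not expect a theoretical gain over the plain query.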