How to efficiently select the nearest values less than and greater than a given value?

Asked: 2012-01-20 07:48:47

Tags: sql sql-server optimization query-optimization sql-server-2008-r2

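I have two tables, one of values and one of locations, and I am trying to interpolate the location. The tables have been simplified to the following: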

CREATE TABLE value(
    Timestamp DATETIME2,
    Value float NOT NULL,
    PRIMARY KEY(Timestamp)
);

CREATE TABLE location(
    Timestamp DATETIME2,
    Position INT NOT NULL,
    PRIMARY KEY(Timestamp)
); 

INSERT INTO value VALUES 
    ('2011/12/1 16:55:01', 1),
    ('2011/12/1 16:55:02', 5),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:08', 6);

INSERT INTO location VALUES 
    ('2011/12/1 16:55:00', 0),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:10', 5)

The expected result would be:

TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation
2011-12-01 16:55:01,  1, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:02,  5, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10
2011-12-01 16:55:08,  6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10,  5

(Please keep in mind that this is simplified sample data, meant only to convey the query I am trying to run.)

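To do the interpolation, I need to find the time and position immediately before and immediately after a given value's time. I am currently doing this with a query like the following: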
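For context, here is a minimal sketch of what the interpolation step could look like once the four bracketing columns are known. It assumes plain linear interpolation, which the question does not actually state. The derived table `R` and the output column `InterpolatedPosition` are hypothetical names used only for illustration, and one row of the expected result is hard-coded so the statement can run on its own.

```sql
-- Assumption: linear interpolation between the lower and upper location.
-- R is a hypothetical stand-in for one row of the expected result (16:55:08).
SELECT
    R.Timestamp,
    CASE
        -- Exact match: lower and upper are the same row, so avoid dividing by zero.
        WHEN R.UpperTime = R.LowerTime THEN CAST(R.LowerLocation AS float)
        ELSE R.LowerLocation
             + (R.UpperLocation - R.LowerLocation)
             * CAST(DATEDIFF(millisecond, R.LowerTime, R.Timestamp) AS float)
             / DATEDIFF(millisecond, R.LowerTime, R.UpperTime)
    END AS InterpolatedPosition
FROM (VALUES
    (CAST('2011-12-01 16:55:08' AS DATETIME2),
     CAST('2011-12-01 16:55:05' AS DATETIME2), 10,
     CAST('2011-12-01 16:55:10' AS DATETIME2), 5)
) AS R(Timestamp, LowerTime, LowerLocation, UpperTime, UpperLocation);
```

For this row the value lies 3 seconds into a 5-second gap between positions 10 and 5, so the expression works out to 7. Note that `DATEDIFF(millisecond, ...)` returns an `int` and overflows for gaps longer than roughly 24 days, so a coarser unit may be needed for sparse data.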

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
 FROM 
    dbo.value V 

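This works, but it obviously does a lot of work. I suspect there is a simplification of the query that I am missing, but I have been playing with it all morning and have not come up with anything concrete. Hopefully someone here has a better idea.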

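I am currently exploring whether there is a way to work out LowerTime and UpperTime first and then use them to determine the positions. Something like: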

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation
 FROM 
    dbo.value V 

But that doesn't work.
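The attempt above fails because, in SQL Server, a column alias defined in the SELECT list (here `LowerTime` and `UpperTime`) cannot be referenced by another expression in the same SELECT list. One possible way to get the same effect, sketched here as a suggestion rather than as a tested answer, is `OUTER APPLY`, which lets a single correlated lookup return both the timestamp and the position. The aliases `Lo`, `Hi`, and `L` are names chosen for this illustration.

```sql
-- Sketch: fetch time and position together, one lookup per direction.
SELECT
    V.Timestamp,
    V.Value,
    Lo.Timestamp AS LowerTime,
    Lo.Position  AS LowerLocation,
    Hi.Timestamp AS UpperTime,
    Hi.Position  AS UpperLocation
FROM dbo.value AS V
OUTER APPLY (
    -- Nearest location at or before the value's timestamp.
    SELECT TOP 1 L.Timestamp, L.Position
    FROM dbo.location AS L
    WHERE L.Timestamp <= V.Timestamp
    ORDER BY L.Timestamp DESC
) AS Lo
OUTER APPLY (
    -- Nearest location at or after the value's timestamp.
    SELECT TOP 1 L.Timestamp, L.Position
    FROM dbo.location AS L
    WHERE L.Timestamp >= V.Timestamp
    ORDER BY L.Timestamp ASC
) AS Hi;
```

This halves the number of correlated lookups from four to two, and each one can seek on the primary key of `location`. `OUTER APPLY` (rather than `CROSS APPLY`) keeps value rows that have no location on one side, returning NULLs for them. Whether it is actually faster on real data would have to be measured.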

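EDIT1: Updated the query as suggested. However, there is no noticeable change in execution time.

EDIT2: Added my thoughts on the approach I am currently trying.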

2 answers:

Answer 0 (score: 11)

For simplicity, you can at least use the MAX() and MIN() functions to query the timestamp field, instead of TOP 1 and ORDER BY.

The full query would be:

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
 FROM 
    dbo.value V 

Answer 1 (score: 0)

This might work (although I think the join looks ugly):

;with OrderedLocations as (
    select
        v.Timestamp,
        v.Value,
        l.Timestamp as tsl,
        l.Position,
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp <= v.Timestamp THEN l.Timestamp ELSE '00010101' END desc) as PrevRN,
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp >= v.Timestamp THEN l.Timestamp ELSE '99991231' END asc) as NextRN
    from
        value v
            cross join
        location l
)
select
    ol1.Timestamp,
    ol1.Value,
    ol1.tsl,
    ol1.Position,
    ol2.tsl,
    ol2.Position
from
    OrderedLocations ol1
        inner join
    OrderedLocations ol2
        on
            ol1.Timestamp = ol2.Timestamp and
            ol1.Value = ol2.Value
where
    ol1.PrevRN = 1 and
    ol2.NextRN = 1

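Unfortunately, as with most efficiency/performance questions, the answer tends to be: try lots of different combinations against your actual tables and data, and measure how each one performs.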


An alternative that uses the same CTE as above (and avoids the join) would be:

SELECT Timestamp,Value,
    MAX(CASE WHEN PrevRN=1 THEN tsl END),MAX(CASE WHEN PrevRN=1 then Position END),
    MAX(CASE WHEN NextRN=1 THEN tsl END),MAX(CASE WHEN NextRN=1 then Position END)
FROM
    OrderedLocations
where PrevRN=1 or NextRN=1
group by Timestamp,Value

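The CTE (OrderedLocations) builds a rowset in which every row from location is paired with every row from value. For each resulting row we compute two ROW_NUMBERs. PrevRN numbers the rows in descending order of location timestamp, counting only locations with a lower or equal timestamp (all others are replaced by the sentinel '00010101' and so sort last). NextRN numbers the rows in ascending order, counting only locations with a greater or equal timestamp (all others are replaced by '99991231' and so sort last). We then construct the final result by considering only the rows where one of these row numbers is 1.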
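Because a CTE is scoped to the single statement that follows it, the alternative query only runs when the WITH clause is prepended to it. The sketch below puts the two pieces together as one statement. The output column aliases are an addition borrowed from the header of the expected result in the question, and the trailing ORDER BY is included only to make the output easier to compare.

```sql
-- The CTE from the answer, followed directly by the GROUP BY variant.
;WITH OrderedLocations AS (
    SELECT
        v.Timestamp,
        v.Value,
        l.Timestamp AS tsl,
        l.Position,
        -- Nearest location at or before v.Timestamp gets PrevRN = 1.
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp
                           ORDER BY CASE WHEN l.Timestamp <= v.Timestamp
                                         THEN l.Timestamp ELSE '00010101' END DESC) AS PrevRN,
        -- Nearest location at or after v.Timestamp gets NextRN = 1.
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp
                           ORDER BY CASE WHEN l.Timestamp >= v.Timestamp
                                         THEN l.Timestamp ELSE '99991231' END ASC) AS NextRN
    FROM value v
    CROSS JOIN location l
)
SELECT
    Timestamp,
    Value,
    MAX(CASE WHEN PrevRN = 1 THEN tsl END)      AS LowerTime,      -- alias assumed from the question
    MAX(CASE WHEN PrevRN = 1 THEN Position END) AS LowerLocation,
    MAX(CASE WHEN NextRN = 1 THEN tsl END)      AS UpperTime,
    MAX(CASE WHEN NextRN = 1 THEN Position END) AS UpperLocation
FROM OrderedLocations
WHERE PrevRN = 1 OR NextRN = 1
GROUP BY Timestamp, Value
ORDER BY Timestamp;
```

On the sample data this should reproduce the expected result from the question. One caveat: a value with no location at or before it (or none at or after it) still gets a row numbered 1, picked arbitrarily from the rows sharing the sentinel, so such edge cases may need an extra filter. The cross join also pairs every value with every location, so the cost grows with the product of the two table sizes.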