Finding a timeline using a range of values - entity modeling

Time: 2009-07-06 05:40:53

Tags: sql performance optimization

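Suppose I have two entities: Events and Activities.

Events are things that happen at (seemingly) random times, such as sunrise, sunset, storms, fog, and so on.

I have a table for them: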

create table Event (
eventKey int,
eventDesc varchar(100),
started datetime
)

 EventKey  | EventDesc  | Started
 1           "Sunset"     2009-07-03 6:51pm 
 2           "Sunrise"    2009-07-04 5:33am
 3           "Fog"        2009-07-04 5:52pm
 4           "Sunset"     2009-07-04 6:49pm
 5           "Full Moon"  2009-07-04 10:12pm
 6           "Sunrise"    2009-07-05 5:34am

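Then I have a table of the activities people take part in, together with the events they relate to (i.e. an activity can be long-running and span several events, such as a "weekend camp-out"):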

create table EventTask (
activityKey int,
activityDesc varchar(100),
startEventKey int,
endEventKey int
)

ActivityKey  |  ActivityDesc | StartEventKey | EndEventKey
123             "Camp-out"     1               5
234             "Drive home"   6               6

I want to output a timeline of the activities, marked by the events that occurred during them:

ActivityKey  |  ActivityDesc | EventKey  | EventDesc
123             "Camp-out"     1           "Sunset"
123             "Camp-out"     2           "Sunrise"
123             "Camp-out"     3           "Fog"
123             "Camp-out"     4           "Sunset"
123             "Camp-out"     5           "Full Moon"
234             "Drive home"   6           "Sunrise"

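Is it possible to write a query that does this in linear time, similar to this question? Please also recommend indexes or any other optimizations you can think of. The current solution is written in C#, but I would love a fast SQL solution.

What is the best query for doing this?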

3 answers:

Answer 0 (score: 2)

/*
create table Event (
eventKey int,
eventDesc varchar(100),
started timestamp
);

 insert into event values( 1,           'Sunset' ,    '2009-07-03 6:51pm');
 insert into event values(2,           'Sunrise',    '2009-07-04 5:33am');
 insert into event values(3,           'Fog'     ,   '2009-07-04 5:52pm');
 insert into event values(4,           'Sunset'   ,  '2009-07-04 6:49pm');
 insert into event values(5,           'Full Moon',  '2009-07-04 10:12pm');
 insert into event values(6,           'Sunrise'   , '2009-07-05 5:34am');

select * from event;

create table EventTask (
activityKey int,
activityDesc varchar(100),
startEventKey int,
endEventKey int
);

insert into eventtask values(123 ,            'Camp-out',     1 ,              5);
insert into eventtask values(234,             'Drive home',   6,               6);

select * from eventtask;

*/

select a.activitykey, a.activitydesc, b.eventkey, b.eventdesc
from
        eventtask a
join    event b on b.eventkey between a.starteventkey and a.endeventkey
order by
        a.activitykey, b.eventkey;

 activitykey     activitydesc     eventkey     eventdesc    
 --------------  ---------------  -----------  ------------ 
 123             Camp-out         1            Sunset       
 123             Camp-out         2            Sunrise      
 123             Camp-out         3            Fog          
 123             Camp-out         4            Sunset       
 123             Camp-out         5            Full Moon    
 234             Drive home       6            Sunrise      

 6 record(s) selected [Fetch MetaData: 3/ms] [Fetch Data: 1/ms] 

 [Executed: 7/7/09 4:24:34 PM EDT ] [Execution: 15/ms] 

If your tables are large, you will definitely want indexes on event.eventkey, eventtask.starteventkey and eventtask.endeventkey.

Note that indexes speed up queries but slow down inserts and updates.
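
The index advice above can be sketched end to end as follows. This is only a minimal illustration, assuming SQLite driven from Python's `sqlite3` module as a convenient harness (not the asker's actual database); the index names are made up, and the timestamps are stored as ISO-8601 text because SQLite has no native datetime type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Event (
    eventKey  int,
    eventDesc varchar(100),
    started   text   -- ISO-8601 text stands in for datetime (SQLite assumption)
);
create table EventTask (
    activityKey   int,
    activityDesc  varchar(100),
    startEventKey int,
    endEventKey   int
);

-- Hypothetical index names; the columns are the ones recommended above.
create index ix_event_eventkey  on Event (eventKey);
create index ix_eventtask_start on EventTask (startEventKey);
create index ix_eventtask_end   on EventTask (endEventKey);
""")

conn.executemany("insert into Event values (?, ?, ?)", [
    (1, "Sunset",    "2009-07-03 18:51"),
    (2, "Sunrise",   "2009-07-04 05:33"),
    (3, "Fog",       "2009-07-04 17:52"),
    (4, "Sunset",    "2009-07-04 18:49"),
    (5, "Full Moon", "2009-07-04 22:12"),
    (6, "Sunrise",   "2009-07-05 05:34"),
])
conn.executemany("insert into EventTask values (?, ?, ?, ?)", [
    (123, "Camp-out",   1, 5),
    (234, "Drive home", 6, 6),
])

# Same join as above: every event whose key falls inside the activity's key range.
rows = conn.execute("""
select a.activityKey, a.activityDesc, b.eventKey, b.eventDesc
from EventTask a
join Event b on b.eventKey between a.startEventKey and a.endEventKey
order by a.activityKey, b.eventKey
""").fetchall()

for row in rows:
    print(row)
```

Whether a given server actually uses these indexes for the BETWEEN join depends on its optimizer and on the data volume, so it is worth checking the execution plan on the real tables.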

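Here is a (more correct) version that does not rely on the event.eventkey values being meaningful, i.e. on the keys following chronological order: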

select a.activitykey, a.activitydesc, d.eventkey, d.eventdesc
from
        eventtask a
join    event     b on b.eventkey = a.starteventkey
join    event     c on c.eventkey = a.endeventkey
join    event     d on d.started between b.started and c.started
order by
        a.activitykey, d.started;

 activitykey     activitydesc     eventkey     eventdesc    
 --------------  ---------------  -----------  ------------ 
 123             Camp-out         1            Sunset       
 123             Camp-out         2            Sunrise      
 123             Camp-out         3            Fog          
 123             Camp-out         4            Sunset       
 123             Camp-out         5            Full Moon    
 234             Drive home       6            Sunrise      

 6 record(s) selected [Fetch MetaData: 2/ms] [Fetch Data: 0/ms] 

 [Executed: 7/8/09 10:01:25 AM EDT ] [Execution: 4/ms] 
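
To see why joining on `started` is the safer choice, here is a small sketch in which the keys do not follow chronological order. It assumes SQLite via Python's `sqlite3` module as a test harness, ISO-8601 text timestamps, and a hypothetical late-recorded "Fog" event that received key 7 although it happened during the camp-out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Event (eventKey int, eventDesc varchar(100), started text);
create table EventTask (activityKey int, activityDesc varchar(100),
                        startEventKey int, endEventKey int);

insert into Event values (1, 'Sunset',    '2009-07-03 18:51');
insert into Event values (2, 'Sunrise',   '2009-07-04 05:33');
insert into Event values (5, 'Full Moon', '2009-07-04 22:12');
insert into Event values (6, 'Sunrise',   '2009-07-05 05:34');
-- Hypothetical: recorded late, so its key is out of chronological order.
insert into Event values (7, 'Fog',       '2009-07-04 17:52');

insert into EventTask values (123, 'Camp-out', 1, 5);
""")

# Key-range join: silently misses the Fog event because 7 is not between 1 and 5.
by_key = [r[0] for r in conn.execute("""
select b.eventKey
from EventTask a
join Event b on b.eventKey between a.startEventKey and a.endEventKey
order by a.activityKey, b.eventKey
""")]

# Time-range join: looks up the start/end events, then matches on timestamps.
by_time = [r[0] for r in conn.execute("""
select d.eventKey
from EventTask a
join Event b on b.eventKey = a.startEventKey
join Event c on c.eventKey = a.endEventKey
join Event d on d.started between b.started and c.started
order by a.activityKey, d.started
""")]

print(by_key)
print(by_time)
```

With this data the key-based join returns events 1, 2 and 5, while the time-based join also picks up event 7 in its chronological position.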

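Answer 1 (score: 1)

I would redefine the activity table so that it has a startTime and an endTime instead of being based on random events. Then, if I really wanted to see the 'events' that occurred during that period, I would join on the time range. This makes more sense from an OO/flexibility standpoint, although you will see a higher performance cost.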

declare @Event table(
id int,
name varchar(100),
[time] datetime
);

 insert into @Event values(1, 'Sunset', '2009-07-03 6:51pm');
 insert into @Event values(2, 'Sunrise', '2009-07-04 5:33am');
 insert into @Event values(3, 'Fog', '2009-07-04 5:52pm');
 insert into @Event values(4, 'Sunset', '2009-07-04 6:49pm');
 insert into @Event values(5, 'Full Moon', '2009-07-04 10:12pm');
 insert into @Event values(6, 'Sunrise', '2009-07-05 5:34am');

select * from @Event;

declare @Activity table (
id int,
name varchar(100),
startTime datetime,
endTime datetime
)

insert into @Activity values(123, 'Camp-out', '2009-07-03 6:00pm', '2009-07-05 5:00am');
insert into @Activity values(234, 'Drive home', '2009-07-05 5:00am', '2009-07-05 6:00am');

select *
from @Activity A
join @Event E on E.[time] > A.startTime and E.[time] < A.endTime
order by A.startTime
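
One detail of the time-range join worth checking is how the boundaries are treated, since the two sample activities share the instant 5:00am on 2009-07-05. The sketch below assumes SQLite via Python's `sqlite3` module (ordinary tables instead of table variables, ISO-8601 text timestamps) and adds a hypothetical "Alarm" event exactly on that boundary; the helper `activities_for_event` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Event (id int, name varchar(100), [time] text);
create table Activity (id int, name varchar(100), startTime text, endTime text);

insert into Event values (5, 'Full Moon', '2009-07-04 22:12');
insert into Event values (6, 'Sunrise',   '2009-07-05 05:34');
-- Hypothetical event placed exactly on the boundary shared by both activities.
insert into Event values (7, 'Alarm',     '2009-07-05 05:00');

insert into Activity values (123, 'Camp-out',   '2009-07-03 18:00', '2009-07-05 05:00');
insert into Activity values (234, 'Drive home', '2009-07-05 05:00', '2009-07-05 06:00');
""")

def activities_for_event(event_id, condition):
    """Return the ids of the activities matched to one event under a join condition."""
    sql = f"""
        select A.id
        from Activity A
        join Event E on {condition}
        where E.id = ?
        order by A.startTime
    """
    return [r[0] for r in conn.execute(sql, (event_id,))]

# Strict bounds, as in the query above: a boundary event matches no activity.
strict = "E.[time] > A.startTime and E.[time] < A.endTime"
# Half-open bounds: a boundary event belongs to exactly one activity.
half_open = "E.[time] >= A.startTime and E.[time] < A.endTime"

print(activities_for_event(7, strict))
print(activities_for_event(7, half_open))
print(activities_for_event(5, strict))
```

Under these assumptions, the strict comparison drops the boundary event entirely, while the half-open form assigns it to "Drive home" only; which behavior is wanted depends on how activity start and end times are meant to be read.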

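Answer 2 (score: 1)

I recently wrote about two ways to optimize queries like this (joins on BETWEEN conditions): Using CROSS APPLY to optimize joins on BETWEEN conditions

A possible query (I could not test it without sample INSERTs):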

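-- Assumes activities do not overlap, so TOP 1 picks the single match per event.
SELECT et.activityKey,
       et.activityDesc,
       e.*
FROM Event AS e
CROSS APPLY (
    SELECT TOP 1 *
    FROM EventTask AS et
    -- Match on event keys; both bounds are inclusive so the end event is returned.
    WHERE et.startEventKey <= e.eventKey
      AND e.eventKey <= et.endEventKey
    ORDER BY et.startEventKey
) AS et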