我有一系列可能重叠的编号时间间隔。重要提示:没有两个时间间隔同时开始,开始时间间隔严格保持。
插图:
Task 1: 1111111
Task 2: 22222222222
Task 3: 333333333333333
Task 4: 444444
Task 5: 5555555
Task 6: 66
.
.
.
0 --- time axis --->
间隔表示应执行的任务。我正在寻找一个 SQL查询来选择可以执行的任务,因为只有一个任务可以同时执行 。始终执行第一项任务。接下来,从第一个任务完成后开始的所有任务中,执行最早开始的任务。等等。
结果:可以执行任务1,3和6 。插图:
Task 1: 1111111 (yes, first)
Task 2: ----------- (no, task 1 is running when 2 begins)
Task 3: 333333333333333 (yes)
Task 4: ------ (no, task 3 is running when 4 begins)
Task 5: ------- (no, task 3 is running when 5 begins)
Task 6: 66 (yes)
.
.
.
0 --- time axis --->
使用迭代,算法很简单:在一个循环中以升序迭代间隔,记住最后一个选定间隔的结束。但是,我想问你一个 SQL查询,可能使用窗口函数,可以执行,例如。在Google BigQuery上。
任务表的架构:
task_id: integer,
start_timestamp: integer,
duration_seconds: integer.
示例数据:
task_id,start_timestamp,duration_seconds
1,1,7
2,4,11
3,12,15
4,16,6
5,24,7
6,33,2
7,37,4
8,42,13
9,47,3
10,50,2
11,54,21
12,58,14
13,66,8
14,72,7
15,80,6
16,88,16
17,92,14
18,102,3
19,109,2
20,119,10
21,123,13
22,128,21
23,138,7
24,141,17
25,146,9
26,154,17
27,160,17
28,164,13
29,173,21
30,181,7
结果 - 选定的任务:
1,3,6,7,8,12,14,15,16,19,20,23,25,27,30
样本数据说明:
Task 1: 1111111 Task 2: 22222222222 Task 3: 333333333333333 Task 4: 444444 Task 5: 5555555 Task 6: 66 Task 7: 7777 Task 8: 8888888888888 Task 9: 999 Task 10: 10 Task 11: 11xxxxxxxxxxxxxxxxxxx Task 12: 12xxxxxxxxxxxx Task 13: 13xxxxxx Task 14: 14xxxxx Task 15: 15xxxx Task 16: 16xxxxxxxxxxxxxx Task 17: 17xxxxxxxxxxxx Task 18: 18x Task 19: 19 Task 20: 20xxxxxxxx Task 21: 21xxxxxxxxxxx Task 22: 22xxxxxxxxxxxxxxxxxxx Task 23: 23xxxxx Task 24: 24xxxxxxxxxxxxxxx Task 25: 25xxxxxxx Task 26: 26xxxxxxxxxxxxxxx Task 27: 27xxxxxxxxxxxxxxx Task 28: 28xxxxxxxxxxx Task 29: 29xxxxxxxxxxxxxxxxxxx Task 30: 30xxxxx
非常感谢您的帮助。
答案 0 :(得分:2)
执行此操作的一种方法如下: 找到重叠的任务(开始时间是在其他任务的开始时间和结束时间之间),而不是提取所有其他任务。
Select task_id
FROM [table]
where Task_id not in(
Select B.task_id FROM
(SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp
FROM [table] ) as A
CROSS JOIN EACH
(SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp
FROM [table] ) as B
where B.start_timestamp>=A.start_timestamp
and B.start_timestamp<A.end_timestamp
and A.task_id<>b.task_id)
此解决方案不使用窗口函数。
使用窗口函数是可行的,但您必须假设并发并行作业的限制(在此示例中为3)。这里我使用LAG窗口函数来查找3个前任任务,并检查某个任务是否与其中一个任务重叠(开始时间介于上一个启动任务的开始时间和结束时间之间)
Select task_id
FROM
(Select task_id, start_timestamp, duration_seconds ,end_timestamp
,LAG(task_id,1) OVER (ORDER BY start_timestamp) as LAG_task_id_1
,LAG(start_timestamp,1) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_1
,LAG(duration_seconds,1) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_1
,LAG(end_timestamp,1) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_1
,LAG(task_id,2) OVER (ORDER BY start_timestamp) as LAG_task_id_2
,LAG(start_timestamp,2) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_2
,LAG(duration_seconds,21) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_2
,LAG(end_timestamp,2) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_2
,LAG(task_id,3) OVER (ORDER BY start_timestamp) as LAG_task_id_3
,LAG(start_timestamp,3) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_3
,LAG(duration_seconds,3) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_3
,LAG(end_timestamp,3) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_3
FROM
(SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp
FROM [table] ))
where
(NOT(start_timestamp>=LAG_start_timestamp_1 and start_timestamp<LAG_end_timestamp_1)
and NOT(start_timestamp>=LAG_start_timestamp_2 and start_timestamp<LAG_end_timestamp_2)
and NOT(start_timestamp>=LAG_start_timestamp_3 and start_timestamp<LAG_end_timestamp_3))
OR LAG_start_timestamp_1 IS NULL
希望这会有所帮助......
答案 1 :(得分:0)
刚写了一些非常相关的东西(在oracle中),也许它可以帮助某人。 我在http://technology.amis.nl/2007/05/07/creating-a-gantt-chart-in-sql/
找到了我的解决方案这是一段有助于说明甘特图模式的代码。
WITH
PERIODS AS
( SELECT TASK_ID LABEL ,
START_TIMESTAMP START_DATE ,
START_TIMESTAMP+DURATION_SECONDS END_DATE
FROM testet
),
LIMITS AS -- determine the earliest starting date and the latest end date to determine the overall width of the chart
( SELECT MIN(START_DATE) PERIOD_START ,
MAX(END_DATE) PERIOD_END ,
80 WIDTH -- set the width as the number of characters
FROM PERIODS
),
BARS AS
( SELECT LPAD(LABEL, '20')||'|' ACTIVITY ,
(START_DATE - PERIOD_START)/(PERIOD_END - PERIOD_START) * WIDTH FROM_POS, -- the starting position for the bar
(END_DATE - PERIOD_START)/(PERIOD_END - PERIOD_START) * WIDTH TO_POS -- the end position for the bar
FROM PERIODS , LIMITS
)
SELECT ACTIVITY||
LPAD('I',FROM_POS) ||
RPAD('-', TO_POS - FROM_POS, '-') ||
'I' GANTT
FROM BARS
UNION ALL
SELECT RPAD('_',WIDTH + 22,'_')
FROM LIMITS
UNION ALL
SELECT LPAD('|',21) ||
PERIOD_START ||
LPAD(PERIOD_END,
WIDTH - 11)
FROM LIMITS;
输出:
1|--I
2|I----I
3| I------I
4| I--I
5| I--I
6| II
7| I-I
8| I-----I
9| I-I
10| II
11| I--------I
12| I-----I
13| I---I
14| I--I
15| I--I
16| I------I
17| I-----I
18| I-I
19| II
20| I----I
21| I-----I
22| I--------I
23| I--I
24| I-------I
25| I---I
26| I-------I
27| I-------I
28| I-----I
29| I--------I
30| I--I
______________________________________________________________________________________________________
|1 194