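I have collected data on certain events that occur in a video. I need to work out the total time during which any event is occurring in the video, but I must not double count periods where several events occur at once. The image below illustrates the situation.
Here, 4 events cover 7 seconds of a 10-second video. Simply adding up the duration of each event wrongly gives 3 + 2 + 3 + 2 = 10 out of 10 seconds.
The table I am working with has these columns:
video_id, video_length, event_id, event_start, event_end
Does anyone know how to write a query that produces the result I am after?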
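Answer 0 (score: 1)
This is called a gaps-and-islands problem. Basically, you need to find groups of overlapping records. You can do this by flagging the first record of each group, that is, any record that does not overlap an earlier one; the group number is then the running sum of these flags.
Assuming no two events start at the same time, the following finds each "island" together with its start and end time.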
select video_id, min(event_start) as event_start, max(event_end) as event_end
from (select e.*,
             -- the running sum of the flags numbers the islands within each video
             sum(IsNotOverlap) over (partition by video_id order by event_start) as grp
      from (select e.*,
                   -- 1 = this event starts a new island, 0 = it overlaps an earlier event
                   (case when exists (select 1
                                      from events e2
                                      where e2.event_start < e.event_start
                                        and e2.event_end > e.event_start
                                        and e2.video_id = e.video_id)
                         then 0 else 1
                    end) as IsNotOverlap
            from events e
           ) e
     ) e
group by video_id, grp;
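You can use this as a subquery or CTE and sum event_end - event_start over the islands to get the total time for a given video.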
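To check the flag-and-running-sum idea outside the database, here is a minimal Python sketch of the same logic. The function name `islands` and the four sample events are my own assumptions (the question does not list the actual intervals); they are chosen only so that the durations are 3, 2, 3 and 2 seconds, as in the question.

```python
from itertools import accumulate

def islands(events):
    """Merge overlapping (start, end) events of one video into islands."""
    events = sorted(events)
    # Flag 1 when no other event starts strictly earlier and is still
    # running at this event's start (this mirrors the EXISTS test).
    flags = [
        0 if any(s2 < s and e2 > s for s2, e2 in events) else 1
        for s, e in events
    ]
    # The running sum of the flags gives each event its island number.
    groups = list(accumulate(flags))
    merged = {}
    for (s, e), g in zip(events, groups):
        lo, hi = merged.get(g, (s, e))
        merged[g] = (min(lo, s), max(hi, e))
    return sorted(merged.values())

# Hypothetical events: naive sum is 3 + 2 + 3 + 2 = 10 seconds.
events = [(0, 3), (2, 4), (5, 8), (6, 8)]
result = islands(events)
total = sum(e - s for s, e in result)
print(result, total)
```

For these assumed events the islands are (0, 4) and (5, 8), so the total is 7 seconds rather than 10. Like the query, the sketch assumes no two events start at the same time.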
Answer 1 (score: 0)
This works even if two events have the same start date or the same end date, or if one event is completely contained in another:
Oracle Setup:
CREATE TABLE videos ( video_id, video_length, event_id, event_start, event_end ) AS
SELECT 1, 10, 1, 1, 4 FROM DUAL UNION ALL
SELECT 1, 10, 2, 1, 3 FROM DUAL UNION ALL -- Same start date
SELECT 1, 10, 3, 2, 4 FROM DUAL UNION ALL -- Same end date
SELECT 1, 10, 4, 3, 6 FROM DUAL UNION ALL
SELECT 1, 10, 5, 7, 9 FROM DUAL UNION ALL
SELECT 1, 10, 6, 8, 8.5 FROM DUAL; -- Contained in previous event
Query:
SELECT video_id,
       SUM( event_duration ) AS event_duration,
       MAX( video_length ) AS video_length
FROM   (
  SELECT video_id,
         video_length,
         end_date
           - LAST_VALUE( start_date ) IGNORE NULLS
               OVER ( PARTITION BY video_id
                      ORDER BY ROWNUM ) AS event_duration
  FROM   (
    SELECT video_id,
           video_length,
           CASE WHEN 1 = lvl
                 AND 1 = SUM( lvl ) OVER ( PARTITION BY video_id
                                           ORDER BY event_date, lvl DESC, ROWNUM )
                THEN event_date
           END AS start_date,
           CASE WHEN 0 = SUM( lvl ) OVER ( PARTITION BY video_id
                                           ORDER BY event_date, lvl DESC, ROWNUM )
                THEN event_date
           END AS end_date
    FROM   videos
    UNPIVOT ( event_date FOR lvl IN ( event_start AS 1, event_end AS -1 ) )
  )
)
GROUP BY video_id;
Output:
VIDEO_ID EVENT_DURATION VIDEO_LENGTH
---------- -------------- ------------
1 7 10
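The query above is essentially a sweep over start and end markers: every start counts +1, every end counts -1, and an island runs from the point where the running sum reaches 1 to the point where it falls back to 0. A rough Python sketch of that idea, using the same sample rows (the function name `covered_time` is hypothetical), might look like this:

```python
def covered_time(events):
    """Total time covered by at least one (start, end) event."""
    # "Unpivot": each event yields a +1 marker at its start, a -1 at its end.
    markers = [(s, 1) for s, e in events] + [(e, -1) for s, e in events]
    # At equal times, handle starts before ends (ORDER BY event_date, lvl DESC).
    markers.sort(key=lambda m: (m[0], -m[1]))
    total = 0
    depth = 0            # number of events currently running
    island_start = None
    for time, lvl in markers:
        depth += lvl
        if lvl == 1 and depth == 1:
            island_start = time           # an island begins
        elif depth == 0:
            total += time - island_start  # the island ends
    return total

# Same rows as in the Oracle setup: (event_start, event_end).
sample = [(1, 4), (1, 3), (2, 4), (3, 6), (7, 9), (8, 8.5)]
print(covered_time(sample))
```

On the sample rows this gives 7, which matches the EVENT_DURATION in the output above.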
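Answer 2 (score: 0)
Variant 1 (complex; always partition by video_id and order by the start date): first compute a running MAX of the end date, then compare each event's start with the running maximum of the previous record. When the start is <= the running maximum end date, there is an overlap. Then use a running sum to build groups of overlapping intervals, and finally aggregate over those groups.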
SELECT video_id, video_length, SUM (new_end - new_start) total_time
FROM (SELECT video_id, video_length, MIN (event_start) new_start, MAX (new_end) new_end
      FROM (SELECT b.*,
                   SUM (counting) OVER (PARTITION BY video_id ORDER BY event_start) time_group
            FROM (SELECT a.*,
                         CASE WHEN LAG (new_end, 1) OVER (PARTITION BY video_id ORDER BY event_start) >= event_start
                              THEN NULL ELSE 1
                         END counting
                  FROM (SELECT x.*,
                               MAX (event_end) OVER (PARTITION BY video_id ORDER BY event_start) new_end
                        FROM videos x) a) b) c
      GROUP BY video_id, video_length, time_group)
GROUP BY video_id, video_length
ORDER BY video_id
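The running-maximum idea of variant 1 can also be written as a single pass over the events sorted by start. This is only a sketch for one video, with a function name of my own choosing (`total_time_running_max`), not a translation of the query line by line:

```python
def total_time_running_max(events):
    """Sum the lengths of merged (start, end) intervals for one video."""
    total = 0
    cur_start = cur_end = None    # current group; cur_end is the running max
    for start, end in sorted(events):
        if cur_end is not None and start <= cur_end:
            # Start lies at or before the running max end: same group.
            cur_end = max(cur_end, end)
        else:
            # Gap found: close the previous group and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
    if cur_end is not None:
        total += cur_end - cur_start
    return total

# Rows from the Oracle setup in the previous answer.
rows = [(1, 4), (1, 3), (2, 4), (3, 6), (7, 9), (8, 8.5)]
print(total_time_running_max(rows))
```

As in the query, an event that starts exactly where the running maximum ends is treated as part of the same group, which does not change the total.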
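Variant 2: for each event, get the start and end of the period it overlaps (or the identical period), keep only the distinct values and sum up the times. Because each event is compared only with the events that directly contain its start or end, this assumes overlaps do not chain across three or more events: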
SELECT video_id, SUM (new_end - new_start) total_time
FROM (SELECT DISTINCT a.video_id,
             (SELECT MIN (event_start)
              FROM videos b
              WHERE (   (a.event_start BETWEEN b.event_start AND b.event_end)
                     OR (a.event_end BETWEEN b.event_start AND b.event_end))
                AND a.video_id = b.video_id) new_start,
             (SELECT MAX (event_end)
              FROM videos b
              WHERE (   (a.event_start BETWEEN b.event_start AND b.event_end)
                     OR (a.event_end BETWEEN b.event_start AND b.event_end))
                AND a.video_id = b.video_id) new_end
      FROM videos a)
GROUP BY video_id
Variant 3: this is variant 2, modified to use LATERAL inline views, a feature introduced in Oracle 12:
SELECT video_id, SUM (new_end - new_start) total_time
FROM (SELECT DISTINCT a.video_id, b.new_start, b.new_end
      FROM videos a,
           LATERAL (SELECT MIN (event_start) new_start, MAX (event_end) new_end
                    FROM videos b
                    WHERE (   (a.event_start BETWEEN b.event_start AND b.event_end)
                           OR (a.event_end BETWEEN b.event_start AND b.event_end))
                      AND a.video_id = b.video_id) b)
GROUP BY video_id
You can also use a CROSS APPLY join or an OUTER APPLY join, which gives the same result because the subquery always returns exactly one row.
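One thing worth checking before relying on variants 2 and 3: they look only at events that directly contain the current event's start or end. The Python sketch below mimics that logic (my own re-implementation under that reading, with hypothetical function names) and compares it with a plain interval merge, first on the sample rows and then on an assumed chain of three events where the first and last do not touch each other.

```python
def variant2_total(events):
    """Mimic variant 2: per event, span of the events containing its start or end."""
    spans = set()                         # DISTINCT (new_start, new_end)
    for a_start, a_end in events:
        hits = [
            (b_start, b_end)
            for b_start, b_end in events
            if b_start <= a_start <= b_end or b_start <= a_end <= b_end
        ]
        spans.add((min(s for s, _ in hits), max(e for _, e in hits)))
    return sum(e - s for s, e in spans)

def merged_total(events):
    """Reference: merge intervals sorted by start, then sum their lengths."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(events):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

sample = [(1, 4), (1, 3), (2, 4), (3, 6), (7, 9), (8, 8.5)]
chain = [(1, 4), (3, 6), (5, 8)]          # assumed data: overlaps only pairwise
print(variant2_total(sample), merged_total(sample))
print(variant2_total(chain), merged_total(chain))
```

On the sample rows both functions agree on 7. On the chained events the merge gives 7, while the variant 2 logic produces the three distinct spans (1, 6), (1, 8) and (3, 8) and sums them to 17, so with data like that variant 1 or the sweep from the previous answer is the safer choice.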