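Let's say I have a table activities with the fields starttime (TIMESTAMP) and stoptime (TIMESTAMP). I would like to find the moment at which the most activities were happening at once. If there are several such moments, the query should return the earliest one.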
I tried taking all starttime timestamps, counting for each of them the number of activities in progress at that time, and then picking the maximum:
#standardSQL
SELECT
  time,
  (
    SELECT COUNT(*)
    FROM activities
    WHERE starttime <= time AND time <= stoptime
  ) AS cnt
FROM (
  SELECT DISTINCT starttime AS time
  FROM activities
  ORDER BY time
)
ORDER BY cnt DESC, time ASC
LIMIT 1
Unfortunately, it fails with: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
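I think that outside the database world a suitable algorithm would be to put all starttimes and stoptimes into a single array, tagged so that they can be told apart, sort it, and then walk through the array in order while tracking the moment with the highest count. However, I don't know how to express such an algorithm in SQL.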
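For reference, the array-based idea described above can be sketched outside SQL roughly as follows. This is only an illustration in Python, not part of the original question: the function name busiest_moment and the tuple layout are made up for the example, and intervals are assumed to be closed, matching the condition starttime <= time AND time <= stoptime used in the query.

```python
def busiest_moment(activities):
    """Return (moment, count) for the earliest moment with the most
    concurrent activities, or (None, 0) if there are no activities.

    `activities` is an iterable of (starttime, stoptime) pairs. Any
    comparable values work (ints, datetime objects, ...).
    """
    events = []
    for start, stop in activities:
        # The middle field is a tie-breaker: at equal times, starts (0)
        # sort before stops (1), so closed intervals that touch overlap.
        events.append((start, 0, +1))
        events.append((stop, 1, -1))
    events.sort()

    best_time, best_count, running = None, 0, 0
    for time, _, delta in events:
        running += delta
        # Strict comparison keeps the earliest moment among ties.
        if running > best_count:
            best_time, best_count = time, running
    return best_time, best_count
```

With the pairs (1, 3), (1, 4), (4, 5), (7, 8), (7, 10), (8, 12), this returns (8, 3): at time 8 one activity is just stopping, one is still running, and one is just starting.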
I have seen this, but I don't think it helps in any way.
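Answer 0 (score: 2)
I have arrived at something close to the algorithm I described in the question. It runs reasonably fast, but if you find anything better, I would be glad to see it.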
#standardSQL
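SELECT
  time,
  -- Running total of +1 (start) and -1 (stop) events. At equal times the
  -- starts are applied first (add DESC), so touching activities overlap.
  SUM(add) OVER(ORDER BY time ASC, add DESC) AS cumsum
FROM (
  SELECT starttime AS time, 1 AS add
  FROM activities UNION ALL
  SELECT stoptime AS time, -1 AS add
  FROM activities
)
-- Highest count first; among ties, the earliest moment first.
ORDER BY cumsum DESC, time ASC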
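Answer 1 (score: 1)
Consider the version below.
From my point of view it returns more practical output: all periods (with their respective start and end) of consecutive activity at the same level.
So you get not just the start but the whole period (start and end) in which the most activities were happening, and not just one such period but all of them.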
#standardSQL
WITH intervals AS (
  SELECT time AS start_, LEAD(time) OVER(ORDER BY time) AS end_
  FROM (
    SELECT DISTINCT time FROM (
      SELECT starttime AS time FROM activities UNION ALL
      SELECT stoptime AS time FROM activities
    )
  )
),
equals AS (
  SELECT start_, end_, COUNT(1) AS cumsum
  FROM intervals AS i
  JOIN activities AS a
  ON i.start_ >= a.starttime AND i.end_ <= a.stoptime
  GROUP BY start_, end_
),
grps AS (
  SELECT
    start_, end_, cumsum,
    IFNULL(
      CAST(end_ = LEAD(start_) OVER(ORDER BY start_) AND LEAD(cumsum) OVER(ORDER BY start_) = cumsum AS INT64),
      CAST(NOT((start_ = LAG(end_) OVER(ORDER BY start_) AND LAG(cumsum) OVER(ORDER BY start_) = cumsum)) AS INT64)
    ) AS flag
  FROM equals
)
SELECT MIN(start_) AS start_, MAX(end_) AS end_, cumsum
FROM (
  SELECT start_, end_, cumsum, SUM(flag) OVER(ORDER BY start_) AS grp
  FROM grps
)
GROUP BY cumsum, grp
ORDER BY start_
You can play with the query above using a dummy activities table:
WITH activities AS (
SELECT 1 AS starttime, 3 AS stoptime UNION ALL
SELECT 1 AS starttime, 4 AS stoptime UNION ALL
SELECT 4 AS starttime, 5 AS stoptime UNION ALL
SELECT 7 AS starttime, 8 AS stoptime UNION ALL
SELECT 7 AS starttime, 10 AS stoptime UNION ALL
SELECT 8 AS starttime, 12 AS stoptime
)
or
WITH activities AS (
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 3 MINUTE) AS stoptime UNION ALL
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 4 MINUTE) AS stoptime UNION ALL
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 4 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 5 MINUTE) AS stoptime UNION ALL
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 8 MINUTE) AS stoptime UNION ALL
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 10 MINUTE) AS stoptime UNION ALL
SELECT TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 8 MINUTE) AS starttime, TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 12 MINUTE) AS stoptime
)
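As a rough cross-check for the integer dummy table, the same idea (split the timeline at every start and stop, count the activities covering each piece, then merge adjacent pieces with the same count) can be sketched in plain Python. The function name concurrency_periods is invented for this illustration, and the sketch makes no attempt to match the SQL query in every edge case or to be efficient on large inputs.

```python
def concurrency_periods(activities):
    """Return [(start, end, count), ...] for maximal periods during which a
    constant, non-zero number of activities is in progress."""
    activities = list(activities)
    # Every distinct start or stop time is a potential period boundary.
    times = sorted({t for pair in activities for t in pair})
    periods = []
    for start, end in zip(times, times[1:]):
        # An activity counts if it spans the whole elementary interval.
        count = sum(
            1 for a_start, a_stop in activities
            if a_start <= start and end <= a_stop
        )
        if count == 0:
            continue
        last = periods[-1] if periods else None
        if last and last[1] == start and last[2] == count:
            # Adjacent piece with the same level: extend the last period.
            periods[-1] = (last[0], end, count)
        else:
            periods.append((start, end, count))
    return periods
```

For the pairs (1, 3), (1, 4), (4, 5), (7, 8), (7, 10), (8, 12) this yields (1, 3, 2), (3, 5, 1), (7, 10, 2), (10, 12, 1). Note that counting over intervals reports a level of 2 for the period from 7 to 10, whereas a point-in-time count with closed intervals sees 3 activities at exactly time 8, because one activity stops at the instant another starts.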