I have a table that records the start and end times of events of interest:
CREATE TABLE event_log (start_time DATETIME, end_time DATETIME);
INSERT INTO event_log VALUES ("2013-06-03 09:00:00","2013-06-03 09:00:05"), ("2013-06-03 09:00:03","2013-06-03 09:00:07"), ("2013-06-03 09:00:10","2013-06-03 09:00:12");
+---------------------+---------------------+
| start_time          | end_time            |
+---------------------+---------------------+
| 2013-06-03 09:00:00 | 2013-06-03 09:00:05 |
| 2013-06-03 09:00:03 | 2013-06-03 09:00:07 |
| 2013-06-03 09:00:10 | 2013-06-03 09:00:12 |
+---------------------+---------------------+
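I'm looking for a way to build a "time series" table in which one column is a time index and the other is the number of events in progress at that time. I can do it with a subquery and a generator: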
SET @first_time := (SELECT MIN(start_time) FROM event_log);
SET @last_time := (SELECT MAX(end_time) FROM event_log);
CREATE OR REPLACE VIEW generator_16
AS SELECT 0 n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL
SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL
SELECT 15;
CREATE TABLE time_series (t DATETIME, event_count INT(11))
SELECT @first_time + INTERVAL n SECOND t, NULL AS event_count
FROM generator_16
WHERE @first_time + INTERVAL n SECOND <= @last_time;
UPDATE time_series
SET event_count= (SELECT COUNT(*) FROM event_log
WHERE start_time<=t AND end_time>=t);
+---------------------+-------------+
| t                   | event_count |
+---------------------+-------------+
| 2013-06-03 09:00:00 | 1 |
| 2013-06-03 09:00:01 | 1 |
| 2013-06-03 09:00:02 | 1 |
| 2013-06-03 09:00:03 | 2 |
| 2013-06-03 09:00:04 | 2 |
| 2013-06-03 09:00:05 | 2 |
| 2013-06-03 09:00:06 | 1 |
| 2013-06-03 09:00:07 | 1 |
| 2013-06-03 09:00:08 | 0 |
| 2013-06-03 09:00:09 | 0 |
| 2013-06-03 09:00:10 | 1 |
| 2013-06-03 09:00:11 | 1 |
| 2013-06-03 09:00:12 | 1 |
+---------------------+-------------+
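Is there a more efficient way? This approach needs one subquery per time index. For example, is there a way to do it with only one subquery per event_log record? My real problem has 500k time-index entries and 1k events; it takes a bit longer than I'd like (about 90 seconds).
The "generator" snippet comes from http://use-the-index-luke.com/blog/2011-07-30/mysql-row-generator. Obviously, a larger problem needs a larger generator, such as the 64k or 1M version.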
Answer 0 (score: 0)
The count only changes at a start_time or an end_time. So, if you run
select distinct start_time As time_point from event_log
UNION
select distinct end_time As time_point from event_log
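...that gives you every "point" at which you need a snapshot.
If you put those points into a temporary table (say TEMP_POINTS) and join it back to event_log, you should be able to count the events at each "point".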
CREATE TABLE NON_ZERO_POINTS
select time_point, count(*) AS event_count
from TEMP_POINTS
join event_log on time_point between start_time and end_time
group by time_point;
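To make the change-point join concrete, here is a minimal sketch that runs the same query shape against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (so the column types are TEXT rather than DATETIME, relying on ISO-8601 strings sorting in time order); the table and column names are the ones used above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
# ISO-8601 timestamp strings compare in chronological order, so BETWEEN works.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_log (start_time TEXT, end_time TEXT);
INSERT INTO event_log VALUES
  ('2013-06-03 09:00:00', '2013-06-03 09:00:05'),
  ('2013-06-03 09:00:03', '2013-06-03 09:00:07'),
  ('2013-06-03 09:00:10', '2013-06-03 09:00:12');

-- Every instant at which the number of running events can change.
CREATE TABLE TEMP_POINTS AS
  SELECT start_time AS time_point FROM event_log
  UNION
  SELECT end_time FROM event_log;
""")

# Count the events that cover each change point (both bounds inclusive).
rows = conn.execute("""
  SELECT time_point, COUNT(*) AS event_count
  FROM TEMP_POINTS
  JOIN event_log ON time_point BETWEEN start_time AND end_time
  GROUP BY time_point
  ORDER BY time_point
""").fetchall()

for time_point, event_count in rows:
    print(time_point, event_count)
```

On the sample data this yields six rows, one per distinct start or end time, and each count agrees with the corresponding row of the time_series output in the question.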
It may be worth creating an index on NON_ZERO_POINTS.
You can then use NON_ZERO_POINTS in the update:
UPDATE time_series
SET event_count= (SELECT nz.event_count FROM NON_ZERO_POINTS nz
WHERE nz.time_point=time_series.t);
Also, do you need to update time_series at all? If not, you can use it directly in a query:
select ts.t, coalesce(nz.event_count, 0) AS event_count
from time_series ts
left join NON_ZERO_POINTS nz
on ts.t=nz.time_point;
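An exact-match join on time_point only finds rows whose t is itself a start or end time, so timestamps that fall between change points still need a count. One way to cover them, sketched here in Python rather than SQL, is a sorted-boundary lookup: the events running at t are those that have started by t minus those that ended before t. This is a swapped-in technique rather than something the answer spells out, and it assumes every event has start_time <= end_time; the helper names parse and active_count are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

def parse(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

# Sample rows from the question's event_log table.
events = [
    (parse("2013-06-03 09:00:00"), parse("2013-06-03 09:00:05")),
    (parse("2013-06-03 09:00:03"), parse("2013-06-03 09:00:07")),
    (parse("2013-06-03 09:00:10"), parse("2013-06-03 09:00:12")),
]

# Sort each boundary list once; every lookup afterwards is a binary search.
starts = sorted(start for start, _ in events)
ends = sorted(end for _, end in events)

def active_count(t):
    """Events with start_time <= t and end_time >= t (hypothetical helper)."""
    started = bisect_right(starts, t)  # events that have started by t
    finished = bisect_left(ends, t)    # events that ended strictly before t
    return started - finished

# Build the one-second time series between the first start and the last end.
series = []
t = starts[0]
while t <= ends[-1]:
    series.append((t, active_count(t)))
    t += timedelta(seconds=1)

counts = [count for _, count in series]
print(counts)
```

With N time indexes and M events this costs roughly O((N + M) log M) instead of one scan of event_log per time index, which may matter at the 500k by 1k scale mentioned in the question.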