To put this question in context, I am trying to compute "time in app" from an event log.
Assume the following table:
user_id event_time
2 2012-05-09 07:03:38
3 2012-05-09 07:03:42
4 2012-05-09 07:03:43
2 2012-05-09 07:03:44
2 2012-05-09 07:03:45
4 2012-05-09 07:03:52
2 2012-05-09 07:06:30
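I want the difference between the highest and lowest event_time within each set of timestamps that lie within 2 minutes of each other (grouped by user). A timestamp that falls outside a set's 2-minute interval should be treated as part of another set.
Desired output: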
user_id seconds_interval
2 7 (because 07:03:45 - 07:03:38 is 7 seconds)
3 0 (because 07:03:42)
4 9 (because 07:03:52 - 07:03:43 is 9 seconds)
2 0 (because 07:06:30 is outside 2 min interval of 1st user_id=2 set)
Here is what I have tried, although I cannot group by seconds_interval (and even if I could, I am not sure this is the right direction):
SELECT (max(tr.event_time)-min(tr.event_time)) as seconds_interval
FROM some_table tr
INNER JOIN TrackingRaw tr2 ON (tr.event_time BETWEEN
tr2.event_time - INTERVAL 2 MINUTE AND tr2.event_time + INTERVAL 2 MINUTE)
GROUP BY seconds_interval
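Answer 0 (score: 4)
I don't think there is a really simple way to query the existing table for the data you want. However, you could maintain a second table of user sessions (with the drawback, of course, that if you later want reports using a different session timeout, the table has to be repopulated from scratch):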
CREATE TABLE Sessions (
user_id INT,
session_start TIMESTAMP,
session_end TIMESTAMP,
PRIMARY KEY (user_id, session_start),
FOREIGN KEY (user_id, session_start) REFERENCES TrackingRaw(user_id, event_time),
FOREIGN KEY (user_id, session_end ) REFERENCES TrackingRaw(user_id, event_time)
);
You can keep it up to date with a trigger that uses INSERT ... SELECT ... ON DUPLICATE KEY UPDATE:
CREATE TRIGGER after_insert_TrackingRaw AFTER INSERT ON TrackingRaw FOR EACH ROW
INSERT INTO Sessions (user_id, session_start, session_end)
SELECT NEW.user_id,
IFNULL(MAX(session_start), NEW.event_time),
NEW.event_time
FROM Sessions
WHERE user_id = NEW.user_id
AND session_end >= NEW.event_time - INTERVAL 2 MINUTE
ON DUPLICATE KEY UPDATE
session_start = session_start,
session_end = NEW.event_time;
Then, to get the desired query results:
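-- TIMESTAMPDIFF returns the span in seconds; plain subtraction of TIMESTAMPs is numeric and breaks across minute boundaries
SELECT user_id, TIMESTAMPDIFF(SECOND, session_start, session_end) AS seconds_interval FROM Sessions;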
See it on sqlfiddle.
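To make the rule behind the trigger easier to follow, here is a minimal sketch of the same logic outside the database, in Python. It assumes events are processed in event_time order and that an event extends the user's latest session when it arrives no more than the timeout after that session's last event. The names sessionize, seconds_intervals, parse and EVENTS are made up for this illustration and are not part of the schema above.

```python
from datetime import datetime, timedelta


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def sessionize(events, timeout=timedelta(minutes=2)):
    """Group (user_id, event_time) pairs into sessions.

    Returns a list of [user_id, session_start, session_end] entries in the
    order the sessions were opened.
    """
    sessions = []  # every session, in creation order
    latest = {}    # user_id -> that user's most recent session
    for user_id, event_time in sorted(events, key=lambda e: e[1]):
        current = latest.get(user_id)
        if current is not None and event_time - current[2] <= timeout:
            # Close enough to the previous event: extend the session.
            current[2] = event_time
        else:
            # First event for this user, or the gap is too long: new session.
            current = [user_id, event_time, event_time]
            sessions.append(current)
            latest[user_id] = current
    return sessions


def seconds_intervals(events, timeout=timedelta(minutes=2)):
    """Return (user_id, session length in whole seconds) per session."""
    return [
        (user_id, int((end - start).total_seconds()))
        for user_id, start, end in sessionize(events, timeout)
    ]


# Sample rows from the question.
EVENTS = [
    (2, parse("2012-05-09 07:03:38")),
    (3, parse("2012-05-09 07:03:42")),
    (4, parse("2012-05-09 07:03:43")),
    (2, parse("2012-05-09 07:03:44")),
    (2, parse("2012-05-09 07:03:45")),
    (4, parse("2012-05-09 07:03:52")),
    (2, parse("2012-05-09 07:06:30")),
]

if __name__ == "__main__":
    for user_id, seconds in seconds_intervals(EVENTS):
        print(user_id, seconds)
```

On the sample rows this gives the four sessions from the question (7, 0, 9 and 0 seconds). Note that the gap is measured from the last event of a session, not from its first, which is also what the session_end comparison in the trigger does.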
Update
On further reflection, you could of course build such a Sessions table inside a stored procedure:
CREATE PROCEDURE getSessions(IN secs INT) READS SQL DATA BEGIN
DECLARE no_more_rows BOOLEAN;
DECLARE cur CURSOR FOR
SELECT user_id, event_time FROM TrackingRaw ORDER BY event_time ASC;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET no_more_rows = TRUE;
DROP TEMPORARY TABLE IF EXISTS Sessions;
CREATE TEMPORARY TABLE Sessions (
user_id INT,
session_start TIMESTAMP,
session_end TIMESTAMP,
PRIMARY KEY(user_id,session_start),
FOREIGN KEY(user_id,session_start) REFERENCES TrackingRaw(user_id,event_time),
FOREIGN KEY(user_id,session_end ) REFERENCES TrackingRaw(user_id,event_time)
);
OPEN cur;
the_loop: LOOP
FETCH cur INTO @u, @t;
IF no_more_rows THEN
CLOSE cur;
LEAVE the_loop;
END IF;
INSERT INTO Sessions
SELECT @u, IFNULL(MAX(session_start), @t), @t
FROM Sessions
WHERE user_id = @u AND session_end >= @t - INTERVAL secs SECOND
ON DUPLICATE KEY UPDATE
session_start = session_start, session_end = @t;
END LOOP the_loop;
SELECT user_id, TIMESTAMPDIFF(SECOND, session_start, session_end) AS seconds_interval FROM Sessions;
DROP TEMPORARY TABLE Sessions;
END;;
Then, to get your output:
CALL getSessions(120); -- for a 2 minute (120 second) timeout