Group timestamps by the gaps between them, then calculate over each group (MySQL)

Time: 2012-06-29 03:54:34

Tags: mysql timestamp

To put this question in context, I am trying to calculate "time in app" from an event log.

Assume the following table:

user_id   event_time
2         2012-05-09 07:03:38
3         2012-05-09 07:03:42
4         2012-05-09 07:03:43
2         2012-05-09 07:03:44
2         2012-05-09 07:03:45
4         2012-05-09 07:03:52
2         2012-05-09 07:06:30

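For each user, I want the difference between the highest and lowest event_time within each set of timestamps that lie within 2 minutes of each other. If a timestamp falls outside a set's 2-minute interval, it should be treated as part of another set.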

Desired output:

user_id  seconds_interval
2        7     (because 07:03:45 - 07:03:38 is 7 seconds)
3        0     (because 07:03:42 is the only event)
4        9     (because 07:03:52 - 07:03:43 is 9 seconds)
2        0     (because 07:06:30 is outside the 2-minute interval of the first user_id=2 set)
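
The grouping rule described above can be checked outside the database with a short Python sketch. It assumes a gap-based reading of the rule: an event joins the current set when it arrives no more than 2 minutes after that set's latest event. The names `EVENTS` and `session_lengths` are made up for illustration, and this is not a MySQL solution.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Sample rows from the table above: (user_id, event_time).
EVENTS = [
    (2, "2012-05-09 07:03:38"),
    (3, "2012-05-09 07:03:42"),
    (4, "2012-05-09 07:03:43"),
    (2, "2012-05-09 07:03:44"),
    (2, "2012-05-09 07:03:45"),
    (4, "2012-05-09 07:03:52"),
    (2, "2012-05-09 07:06:30"),
]

def session_lengths(events, timeout=timedelta(minutes=2)):
    """Return (user_id, seconds_interval) per set, assuming a gap-based rule."""
    parsed = sorted(
        (user, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")) for user, ts in events
    )
    result = []
    for user, rows in groupby(parsed, key=lambda row: row[0]):
        start = last = None
        for _, when in rows:
            if last is not None and when - last > timeout:
                # Gap too large: close the current set and start a new one.
                result.append((user, int((last - start).total_seconds())))
                start = when
            elif start is None:
                start = when
            last = when
        # Close the user's final set.
        result.append((user, int((last - start).total_seconds())))
    return result

for user, seconds in session_lengths(EVENTS):
    print(user, seconds)
```

On the sample rows this prints 2 7, 2 0, 3 0 and 4 9, one pair per line, which matches the desired output apart from row order (the sketch sorts by user, then by time).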

Here is what I have tried, although I cannot group on seconds_interval (and even if I could, I am not sure this is the right direction):

SELECT (max(tr.event_time)-min(tr.event_time)) as seconds_interval
FROM some_table tr
INNER JOIN TrackingRaw tr2 ON (tr.event_time BETWEEN 
   tr2.event_time - INTERVAL 2 MINUTE AND tr2.event_time + INTERVAL 2 MINUTE) 
GROUP BY seconds_interval

1 answer:

Answer 0 (score: 4)

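I don't think there is a particularly simple way to query the existing table to produce the data you want. However, you could maintain a second table of user sessions. The drawback, of course, is that if you later want reports that use a different session timeout, you will need to repopulate the table from scratch: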

CREATE TABLE Sessions (
  user_id INT,
  session_start TIMESTAMP,
  session_end   TIMESTAMP,
  PRIMARY KEY (user_id, session_start),
  FOREIGN KEY (user_id, session_start) REFERENCES TrackingRaw(user_id, event_time),
  FOREIGN KEY (user_id, session_end  ) REFERENCES TrackingRaw(user_id, event_time)
);

You can populate and update such a table automatically with a trigger that uses INSERT ... SELECT ... ON DUPLICATE KEY UPDATE:

CREATE TRIGGER after_insert_TrackingRaw AFTER INSERT ON TrackingRaw FOR EACH ROW
  INSERT INTO Sessions (user_id, session_start, session_end)
    SELECT NEW.user_id,
           IFNULL(MAX(session_start), NEW.event_time),
           NEW.event_time
    FROM   Sessions
    WHERE  user_id = NEW.user_id
       AND session_end >= NEW.event_time - INTERVAL 2 MINUTE
  ON DUPLICATE KEY UPDATE
    session_start = session_start,
    session_end   = NEW.event_time;
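
To make the trigger's upsert rule easier to follow, here is a minimal Python sketch of the same idea. It assumes events arrive in time order, and the names `record_event` and `sessions` are hypothetical. The dictionary key plays the role of the (user_id, session_start) primary key, and the value is session_end.

```python
from datetime import datetime, timedelta

def record_event(sessions, user_id, event_time, timeout=timedelta(minutes=2)):
    """Extend the user's open session or start a new one, like the trigger.

    `sessions` maps (user_id, session_start) -> session_end.
    Events are assumed to arrive in time order.
    """
    open_starts = [
        start
        for (uid, start), end in sessions.items()
        if uid == user_id and end >= event_time - timeout
    ]
    # Counterpart of IFNULL(MAX(session_start), NEW.event_time).
    start = max(open_starts) if open_starts else event_time
    # New key: a session is inserted. Existing key: session_end is overwritten.
    sessions[(user_id, start)] = event_time

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

sessions = {}
for user_id, ts in [
    (2, "2012-05-09 07:03:38"),
    (3, "2012-05-09 07:03:42"),
    (4, "2012-05-09 07:03:43"),
    (2, "2012-05-09 07:03:44"),
    (2, "2012-05-09 07:03:45"),
    (4, "2012-05-09 07:03:52"),
    (2, "2012-05-09 07:06:30"),
]:
    record_event(sessions, user_id, parse(ts))

for (user_id, start), end in sessions.items():
    print(user_id, start, end)
```

Because the timeout is applied as each event is recorded, switching to a different timeout means replaying every event, which is the drawback mentioned above.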

Then, to obtain the desired query results:

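-- TIMESTAMPDIFF returns the elapsed seconds between the two TIMESTAMP values
SELECT user_id, TIMESTAMPDIFF(SECOND, session_start, session_end) AS seconds_interval FROM Sessions;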
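
One caveat when computing the interval: as far as I know, applying `-` directly to two TIMESTAMP values in MySQL subtracts their numeric YYYYMMDDHHMMSS forms instead of computing elapsed time, whereas TIMESTAMPDIFF(SECOND, ...) returns elapsed seconds. The two agree on the sample data only because no set crosses a minute boundary. The Python sketch below imitates that numeric subtraction; the timestamps and the helper `numeric_form` are hypothetical.

```python
from datetime import datetime

def numeric_form(ts):
    """Render a datetime as the integer YYYYMMDDHHMMSS."""
    return int(ts.strftime("%Y%m%d%H%M%S"))

# Hypothetical session that crosses a minute boundary (not from the sample data).
session_start = datetime(2012, 5, 9, 7, 3, 55)
session_end = datetime(2012, 5, 9, 7, 4, 5)

# Digit-wise subtraction, imitating `session_end - session_start` on TIMESTAMPs.
numeric_difference = numeric_form(session_end) - numeric_form(session_start)
# True elapsed time, which is what TIMESTAMPDIFF(SECOND, ...) reports.
elapsed_seconds = int((session_end - session_start).total_seconds())

print(numeric_difference)  # 50
print(elapsed_seconds)     # 10
```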

See it on sqlfiddle.

Update

On further reflection, you could of course build such a Sessions table inside a stored procedure:

DELIMITER ;;

CREATE PROCEDURE getSessions(IN secs INT) MODIFIES SQL DATA BEGIN
  -- Local variables that receive each fetched row and the matching session start
  DECLARE v_user_id    INT;
  DECLARE v_event_time TIMESTAMP;
  DECLARE v_start      TIMESTAMP;
  DECLARE no_more_rows BOOLEAN DEFAULT FALSE;
  DECLARE cur CURSOR FOR
    SELECT user_id, event_time FROM TrackingRaw ORDER BY event_time ASC;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET no_more_rows = TRUE;

  -- Scratch table, rebuilt on every call (no foreign keys needed here)
  DROP   TEMPORARY TABLE IF EXISTS Sessions;
  CREATE TEMPORARY TABLE Sessions (
    user_id INT,
    session_start TIMESTAMP,
    session_end   TIMESTAMP,
    PRIMARY KEY (user_id, session_start)
  );

  OPEN cur;
  the_loop: LOOP
    FETCH cur INTO v_user_id, v_event_time;
    IF no_more_rows THEN
      CLOSE cur;
      LEAVE the_loop;
    END IF;

    -- Start of this user's session that is still within the timeout, if any.
    -- Looked up separately so that Sessions is referenced once per statement.
    SELECT MAX(session_start) INTO v_start
    FROM   Sessions
    WHERE  user_id = v_user_id
       AND session_end >= v_event_time - INTERVAL secs SECOND;

    -- Extend that session, or open a new one starting at this event
    INSERT INTO Sessions (user_id, session_start, session_end)
    VALUES (v_user_id, IFNULL(v_start, v_event_time), v_event_time)
    ON DUPLICATE KEY UPDATE
      session_start = session_start, session_end = v_event_time;
  END LOOP the_loop;

  SELECT user_id, TIMESTAMPDIFF(SECOND, session_start, session_end) AS seconds_interval
  FROM   Sessions;
  DROP TEMPORARY TABLE Sessions;
END;;

DELIMITER ;

Then, to get your output:

CALL getSessions(120); -- for a 2 minute (120 second) timeout