Transposing consecutive time data to get durations

Time: 2015-03-17 21:09:16

Tags: sql

I have a table of actions within sessions, with the elapsed time (in milliseconds) recorded at each step:

+--------+-----------+-----------------+-------------+--------------+
| userid | sessionid | action sequence |   action    | milliseconds |
+--------+-----------+-----------------+-------------+--------------+
|      1 |         1 |               1 | event start |            0 |
|      1 |         1 |               2 | other       |       188114 |
|      1 |         1 |               3 | event end   |       248641 |
|      1 |         1 |               4 | other       |       398215 |
|      1 |         1 |               5 | event start |       488284 |
|      1 |         1 |               6 | other       |       528445 |
|      1 |         1 |               7 | other       |       572711 |
|      1 |         1 |               8 | event end   |       598123 |
|      1 |         2 |               1 | event start |            0 |
|      1 |         2 |               2 | event end   |        54363 |
|      2 |         1 |               1 | other       |            0 |
|      2 |         1 |               2 | other       |         2345 |
|      2 |         1 |               1 | other       |        75647 |
|      3 |         1 |               2 | other       |            0 |
|      3 |         1 |               3 | event start |        34678 |
|      3 |         1 |               4 | other       |        46784 |
|      3 |         1 |               5 | other       |        78905 |
|      4 |         1 |               1 | event start |            0 |
|      4 |         1 |               2 | other       |         7454 |
|      4 |         1 |               3 | other       |        11245 |
|      4 |         1 |               4 | event end   |        24567 |
|      4 |         1 |               5 | other       |        29562 |
|      4 |         1 |               6 | other       |        43015 |
+--------+-----------+-----------------+-------------+--------------+

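I want to capture the complete events: those that have both an event start and an event end within a session (some sessions may have a start but no end, an end but no start, or neither; I don't want those), along with their start and end times. Ultimately, I want to track durations by turning the consecutive time rows into columns so that I can compute the difference. Ideally, the table above would be transformed into: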

+--------+-----------+---------------+--------+--------+
| userid | sessionid | full event id | start  |  end   |
+--------+-----------+---------------+--------+--------+
|      1 |         1 |             1 |      0 | 248641 |
|      1 |         1 |             2 | 488284 | 598123 |
|      1 |         2 |             1 |      0 |  54363 |
|      4 |         1 |             1 |      0 |  24567 |
+--------+-----------+---------------+--------+--------+

I have tried something like this:

select a.userid, a.sessionid, a.milliseconds as start, b.milliseconds as end
from #table a
inner join #table b
on a.userid=b.userid
and a.sessionid=b.sessionid
and a.action='event start'
and b.action='event end'

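However, this doesn't work, because some users can have multiple event starts and ends within a single session (like userid 1). I'm stuck on how best to transpose the time data for each event. Thanks for your help!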
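
To make the failure concrete, here is a minimal sketch that loads the sample rows into an in-memory SQLite database and runs the join above. The table name `actions` and the short column names `seq` and `ms` are stand-ins chosen for this illustration (the original uses a temp table called `#table`), and SQLite is only an assumption about the engine.

```python
import sqlite3

# Sample rows copied from the question: (userid, sessionid, seq, action, ms).
ROWS = [
    (1, 1, 1, "event start", 0), (1, 1, 2, "other", 188114),
    (1, 1, 3, "event end", 248641), (1, 1, 4, "other", 398215),
    (1, 1, 5, "event start", 488284), (1, 1, 6, "other", 528445),
    (1, 1, 7, "other", 572711), (1, 1, 8, "event end", 598123),
    (1, 2, 1, "event start", 0), (1, 2, 2, "event end", 54363),
    (2, 1, 1, "other", 0), (2, 1, 2, "other", 2345),
    (2, 1, 1, "other", 75647), (3, 1, 2, "other", 0),
    (3, 1, 3, "event start", 34678), (3, 1, 4, "other", 46784),
    (3, 1, 5, "other", 78905), (4, 1, 1, "event start", 0),
    (4, 1, 2, "other", 7454), (4, 1, 3, "other", 11245),
    (4, 1, 4, "event end", 24567), (4, 1, 5, "other", 29562),
    (4, 1, 6, "other", 43015),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actions "
    "(userid INT, sessionid INT, seq INT, action TEXT, ms INT)"
)
conn.executemany("INSERT INTO actions VALUES (?, ?, ?, ?, ?)", ROWS)

# The attempted join: every start in a session is paired with every end
# in the same session, regardless of order.
naive_pairs = conn.execute("""
    SELECT a.userid, a.sessionid, a.ms AS start_ms, b.ms AS end_ms
    FROM actions a
    INNER JOIN actions b
      ON a.userid = b.userid
     AND a.sessionid = b.sessionid
     AND a.action = 'event start'
     AND b.action = 'event end'
    ORDER BY a.userid, a.sessionid, a.ms, b.ms
""").fetchall()

for row in naive_pairs:
    print(row)
```

For userid 1, sessionid 1 this returns four rows instead of two, including the pair (488284, 248641), where the "end" comes before the "start".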

1 answer:

Answer 0: (score: 1)

So, given your data above:

CREATE TABLE test_table (
  `userid` int, 
  `sessionid` int, 
  `actionSequence` int, 
  `action` varchar(11), 
  `milliseconds` int
);

INSERT INTO test_table
    (`userid`, `sessionid`, `actionSequence`, `action`, `milliseconds`)
VALUES
    (1, 1, 1, 'event start', 0),
    (1, 1, 2, 'other', 188114),
    (1, 1, 3, 'event end', 248641),
    (1, 1, 4, 'other', 398215),
    (1, 1, 5, 'event start', 488284),
    (1, 1, 6, 'other', 528445),
    (1, 1, 7, 'other', 572711),
    (1, 1, 8, 'event end', 598123),
    (1, 2, 1, 'event start', 0),
    (1, 2, 2, 'event end', 54363),
    (2, 1, 1, 'other', 0),
    (2, 1, 2, 'other', 2345),
    (2, 1, 1, 'other', 75647),
    (3, 1, 2, 'other', 0),
    (3, 1, 3, 'event start', 34678),
    (3, 1, 4, 'other', 46784),
    (3, 1, 5, 'other', 78905),
    (4, 1, 1, 'event start', 0),
    (4, 1, 2, 'other', 7454),
    (4, 1, 3, 'other', 11245),
    (4, 1, 4, 'event end', 24567),
    (4, 1, 5, 'other', 29562),
    (4, 1, 6, 'other', 43015);

The following query should help (you were on the right track):

SELECT 
  tt1.userid, 
  tt1.sessionid, 
  tt1.actionSequence,
  tt1.milliseconds AS startMS,
  MIN(tt2.milliseconds) AS endMS,
  MIN(tt2.milliseconds) - tt1.milliseconds AS totalMS
FROM test_table tt1
INNER JOIN test_table tt2
  ON tt2.userid = tt1.userid
  AND tt2.sessionid = tt1.sessionid
  AND tt2.actionSequence > tt1.actionSequence
  AND tt2.action = 'event end'
WHERE tt1.action = 'event start'
GROUP BY tt1.userid, tt1.sessionid, tt1.actionSequence, tt1.milliseconds

It gives you this result set:

userid  sessionid   actionSequence  startMS         endMS   totalMS
1       1           1               0               248641  248641
1       1           5               488284          598123  109839
1       2           1               0               54363   54363
4       1           1               0               24567   24567

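The GROUP BY is important because there are two rows with action = 'event end' and sequence > 1 in sessionid = 1 for userid = 1, so (I assume) we want the one closest to the current sequence, which is MIN(milliseconds). As you can see, it also lets you compute the difference between the two columns right in this result set, saving you the extra step you had planned :]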
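
A small sketch of what the GROUP BY is doing, using only the userid 1, sessionid 1 rows and an in-memory SQLite database (the choice of SQLite and the variable names are assumptions for illustration). It first lists every later 'event end' that the join finds for each 'event start', then collapses them with MIN.

```python
import sqlite3

# Subset of the sample data: only userid 1, sessionid 1.
ROWS = [
    (1, 1, 1, "event start", 0),
    (1, 1, 2, "other", 188114),
    (1, 1, 3, "event end", 248641),
    (1, 1, 4, "other", 398215),
    (1, 1, 5, "event start", 488284),
    (1, 1, 6, "other", 528445),
    (1, 1, 7, "other", 572711),
    (1, 1, 8, "event end", 598123),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (userid INT, sessionid INT, "
    "actionSequence INT, action TEXT, milliseconds INT)"
)
conn.executemany("INSERT INTO test_table VALUES (?, ?, ?, ?, ?)", ROWS)

# Shared join: each start is matched with every LATER end in its session.
JOIN_SQL = """
    FROM test_table tt1
    INNER JOIN test_table tt2
      ON tt2.userid = tt1.userid
     AND tt2.sessionid = tt1.sessionid
     AND tt2.actionSequence > tt1.actionSequence
     AND tt2.action = 'event end'
    WHERE tt1.action = 'event start'
"""

# Without GROUP BY: the first start sees both ends.
candidates = conn.execute(
    "SELECT tt1.actionSequence, tt1.milliseconds, tt2.milliseconds"
    + JOIN_SQL
    + "ORDER BY tt1.actionSequence, tt2.milliseconds"
).fetchall()

# With GROUP BY + MIN: only the nearest later end survives per start.
nearest = conn.execute(
    "SELECT tt1.actionSequence, tt1.milliseconds, MIN(tt2.milliseconds)"
    + JOIN_SQL
    + "GROUP BY tt1.actionSequence, tt1.milliseconds "
    + "ORDER BY tt1.actionSequence"
).fetchall()

print(candidates)
print(nearest)
```

The start at sequence 1 matches the ends at 248641 and 598123; MIN keeps 248641, so each start ends up with exactly one row.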

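Here is a SQLFiddle of this query against MySQL 5.6. You didn't specify an RDBMS, but I believe the syntax used in this query is simple enough that it should work in most SQL engines.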
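
The question asked for a per-session "full event id" counter rather than the raw actionSequence. One possible way to get it, sketched here under the assumption that the engine supports window functions (SQLite 3.25 or later is used for the demo), is to wrap the query in a subquery and number the rows with ROW_NUMBER(). The aliases `paired`, `start_seq`, `start_ms`, `end_ms`, and `full_event_id` are names made up for this sketch.

```python
import sqlite3

# Sample rows from the question, matching the test_table layout.
ROWS = [
    (1, 1, 1, "event start", 0), (1, 1, 2, "other", 188114),
    (1, 1, 3, "event end", 248641), (1, 1, 4, "other", 398215),
    (1, 1, 5, "event start", 488284), (1, 1, 6, "other", 528445),
    (1, 1, 7, "other", 572711), (1, 1, 8, "event end", 598123),
    (1, 2, 1, "event start", 0), (1, 2, 2, "event end", 54363),
    (2, 1, 1, "other", 0), (2, 1, 2, "other", 2345),
    (2, 1, 1, "other", 75647), (3, 1, 2, "other", 0),
    (3, 1, 3, "event start", 34678), (3, 1, 4, "other", 46784),
    (3, 1, 5, "other", 78905), (4, 1, 1, "event start", 0),
    (4, 1, 2, "other", 7454), (4, 1, 3, "other", 11245),
    (4, 1, 4, "event end", 24567), (4, 1, 5, "other", 29562),
    (4, 1, 6, "other", 43015),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (userid INT, sessionid INT, "
    "actionSequence INT, action TEXT, milliseconds INT)"
)
conn.executemany("INSERT INTO test_table VALUES (?, ?, ?, ?, ?)", ROWS)

# Inner query: pair each start with its nearest later end (as in the answer).
# Outer query: number the paired events within each user/session.
events = conn.execute("""
    SELECT userid, sessionid,
           ROW_NUMBER() OVER (
               PARTITION BY userid, sessionid ORDER BY start_seq
           ) AS full_event_id,
           start_ms, end_ms
    FROM (
        SELECT tt1.userid, tt1.sessionid,
               tt1.actionSequence AS start_seq,
               tt1.milliseconds AS start_ms,
               MIN(tt2.milliseconds) AS end_ms
        FROM test_table tt1
        INNER JOIN test_table tt2
          ON tt2.userid = tt1.userid
         AND tt2.sessionid = tt1.sessionid
         AND tt2.actionSequence > tt1.actionSequence
         AND tt2.action = 'event end'
        WHERE tt1.action = 'event start'
        GROUP BY tt1.userid, tt1.sessionid,
                 tt1.actionSequence, tt1.milliseconds
    ) AS paired
    ORDER BY userid, sessionid, full_event_id
""").fetchall()

for row in events:
    print(row)
```

On the sample data this yields the four rows of the desired output table from the question, with userid 1, sessionid 1 numbered 1 and 2.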