Querying series of consecutive events in MySQL

Time: 2020-05-29 12:05:50

Tags: mysql sql database group-by

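I have an events table with a project and a timestamp. I want to query every series of consecutive rows that belong to the same project. If a project occurs in more than one consecutive run, it should be listed once per run. I also want the start time, end time, and duration of each series.

Example: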

| project   | created_at              |
|-----------|-------------------------|
| project a | 2020-05-29 10:00:00.000 |
| project a | 2020-05-29 10:00:01.167 |
| project a | 2020-05-29 10:00:03.954 |
| project a | 2020-05-29 10:00:10.055 |
| project b | 2020-05-29 10:05:00.000 |
| project b | 2020-05-29 10:06:01.049 |
| project b | 2020-05-29 10:06:30.197 |
| project a | 2020-05-29 10:07:05.167 |
| project a | 2020-05-29 10:07:18.680 |
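
To reproduce the example, the sample rows can be loaded with a setup sketch like the one below. The table name `results` is taken from the query further down; the column types are assumptions, since the post does not show the schema (`DATETIME(3)` is picked only because it keeps the milliseconds).

```sql
-- Assumed schema: only the names `results`, `project` and `created_at`
-- appear in the post; the column types are a guess.
CREATE TABLE results (
    project    VARCHAR(50) NOT NULL,
    created_at DATETIME(3) NOT NULL  -- (3) keeps millisecond precision
);

-- The nine sample rows from the table above.
INSERT INTO results (project, created_at) VALUES
    ('project a', '2020-05-29 10:00:00.000'),
    ('project a', '2020-05-29 10:00:01.167'),
    ('project a', '2020-05-29 10:00:03.954'),
    ('project a', '2020-05-29 10:00:10.055'),
    ('project b', '2020-05-29 10:05:00.000'),
    ('project b', '2020-05-29 10:06:01.049'),
    ('project b', '2020-05-29 10:06:30.197'),
    ('project a', '2020-05-29 10:07:05.167'),
    ('project a', '2020-05-29 10:07:18.680');
```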

I would like to get the following output:

| project   | start                   | end                     | duration     |
|-----------|-------------------------|-------------------------|--------------|
| project a | 2020-05-29 10:00:00.000 | 2020-05-29 10:00:10.055 | 00:00:10.055 |
| project b | 2020-05-29 10:05:00.000 | 2020-05-29 10:06:30.197 | 00:01:30.197 |
| project a | 2020-05-29 10:07:05.167 | 2020-05-29 10:07:18.680 | 00:00:13.513 |

So far I have the following query:

SELECT 
project,
created_at AS "Start", 
Max(created_at) AS "End", 
TIMEDIFF(MAX(created_at), created_at) AS "Duration"
FROM results GROUP BY project;

This gives me the following output:

| project   | start                   | end                     | duration     |
|-----------|-------------------------|-------------------------|--------------|
| project a | 2020-05-29 10:00:00.000 | 2020-05-29 10:07:18.680 | 00:07:18.680 |
| project b | 2020-05-29 10:05:00.000 | 2020-05-29 10:06:30.197 | 00:01:30.197 |

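The problem is that GROUP BY gives me only two rows, one per project. That in turn messes up the start time, end time, and duration in the output.

Is there a way to solve this so that I get the desired output?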

1 answer:

Answer 0 (score: 1)

This is an example of a gaps-and-islands problem. The difference of row numbers should do what you want:

SELECT project, MIN(created_at) as start_dt, MAX(created_at) as end_dt,
       TIMEDIFF(MAX(created_at), MIN(created_at)) AS Duration
FROM (SELECT r.*,
             ROW_NUMBER() OVER (PARTITION BY project ORDER BY created_at) as seqnum_p,
             ROW_NUMBER() OVER (ORDER BY created_at) as seqnum
      FROM results r
     ) r
GROUP BY project, (seqnum - seqnum_p)
ORDER BY MIN(created_at);
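
To see why the difference of the two row numbers identifies each series, it may help to run the inner query on its own with the difference exposed as a column. This is only an illustrative sketch: the alias `grp` is made up here and is not part of the answer, and the window functions need MySQL 8.0 or later.

```sql
-- Inner query from the answer, with the grouping key made visible.
-- `grp` is a hypothetical alias used only for this illustration.
SELECT project,
       created_at,
       ROW_NUMBER() OVER (ORDER BY created_at) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY project ORDER BY created_at) AS seqnum_p,
       ROW_NUMBER() OVER (ORDER BY created_at)
         - ROW_NUMBER() OVER (PARTITION BY project ORDER BY created_at) AS grp
FROM results
ORDER BY created_at;
```

For the sample data in the question, the row numbers work out as follows:

| project   | created_at              | seqnum | seqnum_p | grp |
|-----------|-------------------------|--------|----------|-----|
| project a | 2020-05-29 10:00:00.000 | 1      | 1        | 0   |
| project a | 2020-05-29 10:00:01.167 | 2      | 2        | 0   |
| project a | 2020-05-29 10:00:03.954 | 3      | 3        | 0   |
| project a | 2020-05-29 10:00:10.055 | 4      | 4        | 0   |
| project b | 2020-05-29 10:05:00.000 | 5      | 1        | 4   |
| project b | 2020-05-29 10:06:01.049 | 6      | 2        | 4   |
| project b | 2020-05-29 10:06:30.197 | 7      | 3        | 4   |
| project a | 2020-05-29 10:07:05.167 | 8      | 5        | 3   |
| project a | 2020-05-29 10:07:18.680 | 9      | 6        | 3   |

Within one uninterrupted run, both counters increase by 1 per row, so their difference stays constant. As soon as another project intervenes, only `seqnum` keeps counting, so the next run of the same project gets a different difference. Grouping by `project` together with that difference therefore yields one row per series: (project a, 0), (project b, 4), and (project a, 3).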