How to group records by consecutive duplicates in MySQL?

Asked: 2019-05-12 12:59:49

Tags: mysql sql mariadb gaps-and-islands

I have a table that contains information about user logins. I want to group consecutive duplicate records. For example:

+---+------------+-------------+-------------+------------------+
|   |     ip     |   platform  |   browser   |       date       |
+---+------------+-------------+-------------+------------------+
| 1 | 127.0.0.1  |   Windows   |   Chrome    | 2018-01-01 00:00 |
| 2 | 127.0.0.1  |   Windows   |   Chrome    | 2018-01-02 00:00 |
| 3 | 10.0.0.1   |   Linux     |   Firefox   | 2018-01-03 00:00 |
| 4 | 127.0.0.1  |   Windows   |   Chrome    | 2018-01-04 00:00 |
+---+------------+-------------+-------------+------------------+

should output:

+-----+------------+-------------+-------------+-------------+
|     |     ip     |   platform  |   browser   | num_records |
+-----+------------+-------------+-------------+-------------+
| 1-2 | 127.0.0.1  |   Windows   |   Chrome    |      2      |
| 3   | 10.0.0.1   |   Linux     |   Firefox   |      1      |
| 4   | 127.0.0.1  |   Windows   |   Chrome    |      1      |
+-----+------------+-------------+-------------+-------------+

(For simplicity I omitted the dates from the output; there should be a date range for each group, just like the id range.)

Note that ids 1, 2 and 4 have identical values, but 1-2 and 4 are grouped separately because of the timeline (another record sits between them).

To find duplicates, I should consider the following columns: ip, platform, browser. If a record differs in any of these columns, it is not a duplicate.
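To make the intended rule concrete, here is a minimal Python sketch of the grouping described above. It is only an illustration of the desired semantics, not the SQL being asked for. The `logins` list and the `consecutive_groups` helper are hypothetical names, and the rows are assumed to be already sorted by date.

```python
from itertools import groupby

# Sample rows from the table above, assumed to be sorted by date:
# (id, ip, platform, browser)
logins = [
    (1, "127.0.0.1", "Windows", "Chrome"),
    (2, "127.0.0.1", "Windows", "Chrome"),
    (3, "10.0.0.1", "Linux", "Firefox"),
    (4, "127.0.0.1", "Windows", "Chrome"),
]


def consecutive_groups(rows):
    """Collapse runs of adjacent rows that share (ip, platform, browser)."""
    result = []
    # groupby only merges *adjacent* equal keys, which is exactly the
    # "timeline" behaviour wanted here.
    for key, run in groupby(rows, key=lambda r: r[1:]):
        ids = [r[0] for r in run]
        result.append((ids[0], ids[-1], *key, len(ids)))
    return result


groups = consecutive_groups(logins)
for g in groups:
    print(g)
```

Each tuple holds the first id, the last id, the three key columns and the number of records in the run.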

I can do:

SELECT      ip, platform, browser, COUNT(1) AS num_records
FROM        users_logins
WHERE       user_id = 1
GROUP BY    ip, platform, browser

But this groups all matching records together, without taking the timeline into account.

1 answer:

Answer 0: (score: 1)

This is a gaps-and-islands problem. In MySQL 8+, you can use the difference of row numbers:

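-- t stands for the logins table
select ip, platform, browser,
       count(*) as numrecords,
       min(id) as first_id, max(id) as last_id,
       min(date) as first_date, max(date) as last_date
from (select t.*,
             -- position of the row in the overall timeline
             row_number() over (order by date) as seqnum,
             -- position of the row among rows with the same ip, platform, browser
             row_number() over (partition by ip, platform, browser order by date) as seqnum_2
      from t
     ) t
-- within a run of adjacent duplicates both counters advance together,
-- so their difference stays constant and identifies the island
group by ip, platform, browser, (seqnum - seqnum_2)
order by min(date) desc;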
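
As a rough check of the row-number technique, the sketch below runs essentially the same query against the sample data from the question. It rests on a few assumptions: SQLite (3.25 or later, via Python's built-in `sqlite3` module) stands in for MySQL 8 because both support `row_number() over (...)`, the table is named `t` as in the answer, and the result is ordered ascending so that it mirrors the expected table in the question.

```python
import sqlite3

# Assumption: SQLite 3.25+ is used as a stand-in for MySQL 8.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, ip TEXT, platform TEXT, "
    "browser TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "127.0.0.1", "Windows", "Chrome", "2018-01-01 00:00"),
        (2, "127.0.0.1", "Windows", "Chrome", "2018-01-02 00:00"),
        (3, "10.0.0.1", "Linux", "Firefox", "2018-01-03 00:00"),
        (4, "127.0.0.1", "Windows", "Chrome", "2018-01-04 00:00"),
    ],
)

QUERY = """
select ip, platform, browser,
       count(*) as numrecords,
       min(id) as first_id, max(id) as last_id
from (select t.*,
             row_number() over (order by date) as seqnum,
             row_number() over (partition by ip, platform, browser
                                order by date) as seqnum_2
      from t
     ) t
group by ip, platform, browser, (seqnum - seqnum_2)
order by min(date)
"""

islands = conn.execute(QUERY).fetchall()
for row in islands:
    print(row)
# ('127.0.0.1', 'Windows', 'Chrome', 2, 1, 2)
# ('10.0.0.1', 'Linux', 'Firefox', 1, 3, 3)
# ('127.0.0.1', 'Windows', 'Chrome', 1, 4, 4)
```

For these four rows the difference seqnum - seqnum_2 is 0 for ids 1 and 2, 2 for id 3, and 1 for id 4, which is why id 4 ends up in its own group even though its ip, platform and browser match ids 1 and 2.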