I have a log table in MySQL that looks like this:
mysql> describe logtable;
+----------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| id | int(10) unsigned | NO | | NULL | |
| logdate | datetime | NO | | NULL | |
| host | varchar(100) | NO | | NULL | |
| action | varchar(100) | NO | | NULL | |
| user | varchar(100) | NO | | NULL | |
| org | varchar(100) | NO | | NULL | |
| location | varchar(1000) | NO | | NULL | |
+----------+------------------+------+-----+---------+-------+
Here is a sample of its contents (not all fields are shown):
+----------+---------------------+--------+----------------------+----------+
| id | logdate | action | user | location |
+----------+---------------------+--------+----------------------+----------+
| 13933768 | 2017-01-03 08:42:25 | login | user1@somewhere.com | place1 |
| 13934110 | 2017-01-03 08:58:38 | login | user2@somewhere.com | place2 |
| 13935532 | 2017-01-03 11:02:31 | logout | user1@somewhere.com | place1 |
| 13935622 | 2017-01-03 11:11:25 | logout | user2@somewhere.com | place2 |
| 13935772 | 2017-01-03 11:27:27 | login | user3@somewhere.com | place3 |
| 13935942 | 2017-01-03 11:52:16 | login | user4@somewhere.com | place4 |
| 13936217 | 2017-01-03 12:25:08 | logout | user3@somewhere.com | place3 |
| 13936293 | 2017-01-03 12:33:16 | logout | user4@somewhere.com | place4 |
| 13937676 | 2017-01-03 15:33:59 | login | user3@somewhere.com | place5 |
| 13937859 | 2017-01-03 15:51:53 | logout | user3@somewhere.com | place5 |
| 13942394 | 2017-01-04 08:31:26 | login | user5@somewhere.com | place2 |
| 13943946 | 2017-01-04 09:46:04 | login | user4@somewhere.com | place4 |
| 13944372 | 2017-01-04 10:17:25 | login | user4@somewhere.com | place6 |
| 13944373 | 2017-01-04 10:17:27 | login | user4@somewhere.com | place6 |
| 13944374 | 2017-01-04 10:17:29 | login | user4@somewhere.com | place6 |
| 13944375 | 2017-01-04 10:19:22 | login | user4@somewhere.com | place4 |
| 13944575 | 2017-01-04 10:36:48 | login | user4@somewhere.com | place6 |
| 13946830 | 2017-01-04 14:56:36 | login | user6@somewhere.com | place7 |
| 13947791 | 2017-01-04 16:41:26 | logout | user5@somewhere.com | place2 |
| 13947795 | 2017-01-04 16:41:59 | login | user4@somewhere.com | place4 |
| 13948181 | 2017-01-04 17:19:19 | logout | user4@somewhere.com | place7 |
| 13948200 | 2017-01-04 17:22:18 | logout | user4@somewhere.com | place4 |
| 13948201 | 2017-01-04 17:22:18 | logout | user4@somewhere.com | place6 |
| 13948824 | 2017-01-04 20:23:15 | login | user7@somewhere.com | place8 |
| 13948870 | 2017-01-04 20:44:42 | logout | user7@somewhere.com | place8 |
| 13949945 | 2017-01-05 02:26:35 | logout | user6@somewhere.com | place7 |
| 13951697 | 2017-01-05 08:49:37 | login | user8@somewhere.com | place6 |
| 13951863 | 2017-01-05 08:56:37 | login | user9@somewhere.com | place9 |
| 13951886 | 2017-01-05 08:57:06 | login | user10@somewhere.com | place9 |
+----------+---------------------+--------+----------------------+----------+
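I want to produce a table showing the cumulative time each user was logged in at each location. In theory, the login and logout entries for each user and location combination should come in pairs, but client glitches, network oddities and the like sometimes leave them mismatched. A user can also log in to a second location before logging out of the first, so the pairs can overlap. In addition, the pairs can be broken in any of the following ways:
- a login event with no matching logout
- a logout event with no matching login (it sounds odd, but the software records an interrupted login this way)
- several login events (retries) followed by only one logout
So far I have joined the table to itself on user and location and subtracted the login time from the logout time. I also require the logout event to have a higher id than the login event, since ids always increase. But because there can be multiple entries, I get every combination. A simple example (example 1):
+----------+---------------------+--------+----------------------+----------+
| id | logdate | action | user | location |
+----------+---------------------+--------+----------------------+----------+
| 1 | 2017-01-03 08:42:25 | login | user1@somewhere.com | place1 |
| 2 | 2017-01-03 11:02:31 | logout | user1@somewhere.com | place1 |
| 3 | 2017-01-03 11:27:27 | login | user1@somewhere.com | place1 |
| 4 | 2017-01-03 12:25:08 | logout | user1@somewhere.com | place1 |
+----------+---------------------+--------+----------------------+----------+
My naive approach:
select * from logtable as t1
join logtable as t2
on t1.user = t2.user and t1.location = t2.location
and t1.action = 'login' and t2.action = 'logout'
and t2.id > t1.id
gives 3 results: ids 1→2, 1→4 and 3→4. The log could also look like this (example 2):
+----------+---------------------+--------+----------------------+----------+
| id | logdate | action | user | location |
+----------+---------------------+--------+----------------------+----------+
| 1 | 2017-01-03 08:42:25 | login | user1@somewhere.com | place1 |
| 2 | 2017-01-03 08:43:35 | login | user1@somewhere.com | place1 |
| 3 | 2017-01-03 08:44:45 | login | user1@somewhere.com | place1 |
| 4 | 2017-01-03 12:25:08 | logout | user1@somewhere.com | place1 |
+----------+---------------------+--------+----------------------+----------+
Here there are 3 login attempts before one succeeds, followed by a logout. I should get only 1 result instead of 3, and the desired answer is the difference between id 3 and id 4.
I can describe what I am after, but I cannot translate it into SQL, at least not without multiple nested SELECTs that run for hours (the log file is 8500 lines).
For each user and location, find the last of any run of logins and the first logout after that login. For example 1 above, this should give 2 login/logout events (ids 1→2 = 2:20:06 and 3→4 = 0:57:41), totalled into 1 output row = 3:17:47. For example 2, it should give one event (ids 3→4 = 3:40:23), totalled into 1 row.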
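The rule described above (keep the last login of a run, pair it with the first logout that follows, ignore anything unmatched) can be sketched outside SQL as a single pass over the rows in id order. This is only an illustration for checking expected results on small samples, not a replacement for the query. The function name `total_session_time` and the tuple layout `(id, logdate, action, user, location)` are assumptions made for this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def total_session_time(rows):
    """rows: iterable of (id, logdate, action, user, location) tuples (assumed layout)."""
    pending = {}                     # (user, location) -> time of the latest unmatched login
    totals = defaultdict(timedelta)  # (user, location) -> accumulated logged-in time
    for _id, logdate, action, user, location in sorted(rows):  # tuples sort by id first
        key = (user, location)
        when = datetime.strptime(logdate, FMT)
        if action == "login":
            pending[key] = when      # a retry overwrites the earlier login
        elif action == "logout" and key in pending:
            totals[key] += when - pending.pop(key)
        # a logout with no pending login is ignored, as is a login never followed by a logout
    return dict(totals)

U, P = "user1@somewhere.com", "place1"
example1 = [
    (1, "2017-01-03 08:42:25", "login", U, P),
    (2, "2017-01-03 11:02:31", "logout", U, P),
    (3, "2017-01-03 11:27:27", "login", U, P),
    (4, "2017-01-03 12:25:08", "logout", U, P),
]
example2 = [
    (1, "2017-01-03 08:42:25", "login", U, P),
    (2, "2017-01-03 08:43:35", "login", U, P),
    (3, "2017-01-03 08:44:45", "login", U, P),
    (4, "2017-01-03 12:25:08", "logout", U, P),
]

print(total_session_time(example1)[(U, P)])  # 3:17:47
print(total_session_time(example2)[(U, P)])  # 3:40:23
```

Keeping one pending login per (user, location) key means overlapping sessions at different locations do not interfere with each other.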
Answer 0 (score: 0)
This should do the trick:
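-- The id columns are aggregated as well, so the query also runs with
-- ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5).
SELECT
    user,
    location,
    logdate_login,
    MIN(logdate_logout) AS logdate_logout,   -- first logout after this login
    MAX(id_login) AS id_login,
    MIN(id_logout) AS id_logout
FROM (
    SELECT
        user,
        location,
        logdate_logout,
        MAX(logdate_login) AS logdate_login, -- last login before this logout
        MAX(id_login) AS id_login,
        MIN(id_logout) AS id_logout
    FROM (
        -- naive self-join: every login paired with every later logout
        SELECT
            t1.id AS id_login,
            t1.logdate AS logdate_login,
            t1.user,
            t1.location,
            t2.id AS id_logout,
            t2.logdate AS logdate_logout
        FROM logtable AS t1
        JOIN logtable AS t2
          ON (t1.user = t2.user AND t1.location = t2.location AND t1.action = 'login' AND t2.action = 'logout' AND t2.id > t1.id)
    ) results_naive_approach
    GROUP BY logdate_logout, user, location
) inner_query
GROUP BY logdate_login, user, location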
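This ensures that you keep only the last login before each matching logout, and only the first logout after each matching login.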
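To see why the two GROUP BY stages leave exactly one pair per session, here is a small Python sketch that mirrors the same three steps (naive join, last login per logout, first logout per login) and then adds up the durations per user and location, which is the total the question asks for. It is a hedged illustration, not the answer's code: it groups on ids rather than timestamps (assuming ids are unique and always increasing, as the question states), and the names `paired_sessions` and `total_seconds` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def paired_sessions(rows):
    """rows: (id, logdate, action, user, location) tuples; returns sorted (id_login, id_logout) pairs."""
    logins = [r for r in rows if r[2] == "login"]
    logouts = [r for r in rows if r[2] == "logout"]
    # Step 1: naive self-join, every login with every later logout of the same user/location.
    naive = [(li, lo) for li in logins for lo in logouts
             if li[3] == lo[3] and li[4] == lo[4] and lo[0] > li[0]]
    # Step 2: per logout, keep only the last login before it (inner GROUP BY with MAX).
    last_login = {}
    for li, lo in naive:
        if lo[0] not in last_login or li[0] > last_login[lo[0]][0][0]:
            last_login[lo[0]] = (li, lo)
    # Step 3: per login, keep only the first logout after it (outer GROUP BY with MIN).
    first_logout = {}
    for li, lo in last_login.values():
        if li[0] not in first_logout or lo[0] < first_logout[li[0]][1][0]:
            first_logout[li[0]] = (li, lo)
    return sorted((li[0], lo[0]) for li, lo in first_logout.values())

def total_seconds(rows):
    """Sum the paired session lengths, in seconds, per (user, location)."""
    by_id = {r[0]: r for r in rows}
    totals = defaultdict(int)
    for id_in, id_out in paired_sessions(rows):
        li, lo = by_id[id_in], by_id[id_out]
        delta = datetime.strptime(lo[1], FMT) - datetime.strptime(li[1], FMT)
        totals[(li[3], li[4])] += int(delta.total_seconds())
    return dict(totals)

U, P = "user1@somewhere.com", "place1"
example1 = [
    (1, "2017-01-03 08:42:25", "login", U, P),
    (2, "2017-01-03 11:02:31", "logout", U, P),
    (3, "2017-01-03 11:27:27", "login", U, P),
    (4, "2017-01-03 12:25:08", "logout", U, P),
]
example2 = [
    (1, "2017-01-03 08:42:25", "login", U, P),
    (2, "2017-01-03 08:43:35", "login", U, P),
    (3, "2017-01-03 08:44:45", "login", U, P),
    (4, "2017-01-03 12:25:08", "logout", U, P),
]

print(paired_sessions(example1))  # [(1, 2), (3, 4)]
print(paired_sessions(example2))  # [(3, 4)]
print(total_seconds(example1))    # {('user1@somewhere.com', 'place1'): 11867}
```

11867 seconds is 3:17:47, the total the question expects for example 1. In MySQL, the equivalent last step would be one more aggregation over the paired rows, summing the difference between logdate_logout and logdate_login per user and location.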