I have the following sample data in a database table.
id email created_at status
1 e@mail.com 2016-01-01 01:01:30 1
2 e@mail.com 2016-01-01 01:02:20 -1
3 e@mail.com 2016-01-01 01:03:30 1
4 new@mail.com 2016-01-01 01:04:00 1
5 e@mail.com 2016-01-01 01:04:30 1
6 new@mail.com 2016-01-01 02:59:08 1
7 new@mail.com 2016-01-01 03:01:24 1
8 iii@mail.com 2016-12-24 04:20:30 1
9 iii@mail.com 2016-12-24 04:23:29 -2
10 new@mail.com 2016-12-24 04:24:08 1
11 iii@mail.com 2016-12-24 04:25:29 1
12 new@mail.com 2016-12-24 04:32:08 1
13 e@mail.com 2016-12-24 05:16:30 1
14 iii@mail.com 2016-12-24 06:00:00 1
15 aa@email.com 2017-07-17 15:03:00 1
16 aa@email.com 2017-07-17 15:04:00 1
17 aa@email.com 2017-07-17 15:08:01 1
My requirements are:
a. Records are duplicated by email.
b. There are more than 2 duplicated records, i.e. 3 or more.
c. Those 3 or more duplicated records were inserted within a 5-minute interval.
d. status = 1
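To make requirements a to d concrete, here is a minimal Python sketch that applies them to the sample rows above, outside the database. It assumes the 5-minute window is anchored at the first record of a run and is inclusive at the upper bound (the same reading as `<= created_at + INTERVAL 5 MINUTE`); the names `ROWS` and `find_bursts` are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta

# Sample rows copied from the table above: (id, email, created_at, status).
ROWS = [
    (1, "e@mail.com", "2016-01-01 01:01:30", 1),
    (2, "e@mail.com", "2016-01-01 01:02:20", -1),
    (3, "e@mail.com", "2016-01-01 01:03:30", 1),
    (4, "new@mail.com", "2016-01-01 01:04:00", 1),
    (5, "e@mail.com", "2016-01-01 01:04:30", 1),
    (6, "new@mail.com", "2016-01-01 02:59:08", 1),
    (7, "new@mail.com", "2016-01-01 03:01:24", 1),
    (8, "iii@mail.com", "2016-12-24 04:20:30", 1),
    (9, "iii@mail.com", "2016-12-24 04:23:29", -2),
    (10, "new@mail.com", "2016-12-24 04:24:08", 1),
    (11, "iii@mail.com", "2016-12-24 04:25:29", 1),
    (12, "new@mail.com", "2016-12-24 04:32:08", 1),
    (13, "e@mail.com", "2016-12-24 05:16:30", 1),
    (14, "iii@mail.com", "2016-12-24 06:00:00", 1),
    (15, "aa@email.com", "2017-07-17 15:03:00", 1),
    (16, "aa@email.com", "2017-07-17 15:04:00", 1),
    (17, "aa@email.com", "2017-07-17 15:08:01", 1),
]


def find_bursts(rows, window=timedelta(minutes=5), min_count=3):
    """Return sorted ids of status = 1 rows that sit in a run of at least
    min_count same-email rows, all within `window` of the run's first row."""
    by_email = {}
    for row_id, email, created_at, status in rows:
        if status != 1:  # requirement d: ignore rows with any other status
            continue
        ts = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
        by_email.setdefault(email, []).append((ts, row_id))

    matched = set()
    for entries in by_email.values():  # requirement a: group by email
        entries.sort()
        for i, (start, _) in enumerate(entries):
            # requirement c: rows from this one up to 5 minutes later (inclusive)
            in_window = [rid for ts, rid in entries[i:] if ts <= start + window]
            if len(in_window) >= min_count:  # requirement b: 3 or more
                matched.update(in_window)
    return sorted(matched)


print(find_bursts(ROWS))  # → [1, 3, 5]
```

Under that reading only ids 1, 3 and 5 qualify: iii@mail.com and aa@email.com each have just two status = 1 rows inside any 5-minute window.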
Below is my SQL query, which was provided by @Strawberry.
SELECT DISTINCT a.*
FROM my_table a
JOIN
( SELECT x.*
, MAX(y.created_at) AS range_end
FROM my_table x
JOIN my_table y
ON y.email = x.email
AND y.id >= x.id
AND y.created_at <= x.created_at + INTERVAL 5 MINUTE
GROUP
BY x.id HAVING COUNT(*) >= 3
) b
ON b.email = a.email
AND a.created_at BETWEEN b.created_at AND b.range_end;
The query above returns the following records.
id email created_at status
1 e@mail.com 2016-01-01 01:01:30 1
2 e@mail.com 2016-01-01 01:02:20 -1
3 e@mail.com 2016-01-01 01:03:30 1
5 e@mail.com 2016-01-01 01:04:30 1
8 iii@mail.com 2016-12-24 04:20:30 1
9 iii@mail.com 2016-12-24 04:23:29 -2
11 iii@mail.com 2016-12-24 04:25:29 1
I tried adding "WHERE status = 1" so that only the following records are returned, since they are the ones that meet my requirements.
id email created_at status
1 e@mail.com 2016-01-01 01:01:30 1
3 e@mail.com 2016-01-01 01:03:30 1
5 e@mail.com 2016-01-01 01:04:30 1
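What I want to retrieve are the records duplicated by the same email address that were inserted more than 2 times within 5 minutes and have status 1. Where should "WHERE status = 1" go to get the result I want?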
Answer 0: (score: 1)
I think your query is a bit more complicated than it needs to be for MySQL:
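-- Returns the first row of each qualifying 5-minute window.
select t.*
from my_table t join
my_table t2
on t.email = t2.email and
t2.created_at >= t.created_at and  -- >= so that t itself counts toward the 3
t2.created_at <= date_add(t.created_at, interval 5 minute) and
t2.status = 1
where t.status = 1  -- filter the anchor row on status, not on a fixed id
group by t.id
having count(*) >= 3;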
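Because id is unique in your table, you can group by that column and still select the other columns of the table. In fact, this use of the MySQL extension is even consistent with ANSI standard SQL: as long as id is declared as the primary key, the other columns are functionally dependent on it.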
Answer 1: (score: 0)
SELECT DISTINCT a.*
FROM service_request a
JOIN
( SELECT x.*
, MAX(y.created_at) AS range_end
FROM service_request x
JOIN service_request y
ON y.email = x.email
AND y.id >= x.id
AND y.status = x.status
AND y.created_at <= x.created_at + INTERVAL 5 MINUTE
WHERE x.status = 1
GROUP
BY x.id HAVING COUNT(*) >= 3
) b
ON b.email = a.email
AND a.created_at BETWEEN b.created_at AND b.range_end
AND a.status = 1;
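As a rough check, a status-filtered variant of this query can be run against the question's sample rows in an in-memory SQLite database. This is only a sketch under a few assumptions: SQLite stands in for MySQL, so `datetime(x.created_at, '+5 minutes')` replaces `x.created_at + INTERVAL 5 MINUTE`; the table is named `my_table` as in the question; and the outer alias `a` is also restricted to status = 1, since without that filter row 2 (status -1) falls between the window's start and end and would be returned as well. The helper `matching_ids` is a hypothetical name.

```python
import sqlite3

# Sample rows from the question: (id, email, created_at, status).
ROWS = [
    (1, "e@mail.com", "2016-01-01 01:01:30", 1),
    (2, "e@mail.com", "2016-01-01 01:02:20", -1),
    (3, "e@mail.com", "2016-01-01 01:03:30", 1),
    (4, "new@mail.com", "2016-01-01 01:04:00", 1),
    (5, "e@mail.com", "2016-01-01 01:04:30", 1),
    (6, "new@mail.com", "2016-01-01 02:59:08", 1),
    (7, "new@mail.com", "2016-01-01 03:01:24", 1),
    (8, "iii@mail.com", "2016-12-24 04:20:30", 1),
    (9, "iii@mail.com", "2016-12-24 04:23:29", -2),
    (10, "new@mail.com", "2016-12-24 04:24:08", 1),
    (11, "iii@mail.com", "2016-12-24 04:25:29", 1),
    (12, "new@mail.com", "2016-12-24 04:32:08", 1),
    (13, "e@mail.com", "2016-12-24 05:16:30", 1),
    (14, "iii@mail.com", "2016-12-24 06:00:00", 1),
    (15, "aa@email.com", "2017-07-17 15:03:00", 1),
    (16, "aa@email.com", "2017-07-17 15:04:00", 1),
    (17, "aa@email.com", "2017-07-17 15:08:01", 1),
]

# SQLite port of the query (assumption: not the MySQL original).
# x is the window start, y counts status = 1 rows in the window,
# a picks up the status = 1 rows that lie inside a qualifying window.
QUERY = """
SELECT DISTINCT a.id
FROM my_table a
JOIN (
    SELECT x.id, x.email, x.created_at, MAX(y.created_at) AS range_end
    FROM my_table x
    JOIN my_table y
      ON y.email = x.email
     AND y.id >= x.id
     AND y.status = x.status
     AND y.created_at <= datetime(x.created_at, '+5 minutes')
    WHERE x.status = 1
    GROUP BY x.id, x.email, x.created_at
    HAVING COUNT(*) >= 3
) b
  ON b.email = a.email
 AND a.created_at BETWEEN b.created_at AND b.range_end
WHERE a.status = 1
ORDER BY a.id
"""


def matching_ids(rows):
    """Load rows into an in-memory table and return the ids the query selects."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE my_table ("
            "id INTEGER PRIMARY KEY, email TEXT, created_at TEXT, status INTEGER)"
        )
        conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", rows)
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


print(matching_ids(ROWS))  # → [1, 3, 5]
```

The timestamps are stored as `YYYY-MM-DD HH:MM:SS` text, so plain string comparison orders them chronologically in SQLite; in MySQL a DATETIME column would do the same job natively.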