Table user_message:
+----+---------+-------+------------+
| id | from_id | to_id | time_stamp |
+----+---------+-------+------------+
| 1 | 1 | 2 | 1414700000 |
| 2 | 2 | 1 | 1414700100 |
| 3 | 3 | 1 | 1414701000 |
| 4 | 3 | 2 | 1414701001 |
| 5 | 3 | 4 | 1414701002 |
| 6 | 1 | 3 | 1414701100 |
+----+---------+-------+------------+
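I am trying to find all users who wrote at least a minimum number of messages (say 3) to other users within a fixed time range (say 5 seconds). For this example, I would like a result similar to this: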
+---------+-------+
| from_id | count |
+---------+-------+
| 3 | 3 |
+---------+-------+
The idea is to check for spam. A nice bonus would be to count only messages that share the same content.
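To try queries against this sample, the table can be recreated roughly as follows. This is only a sketch: the column types and the primary key are assumptions, since the question shows the data but not the schema.

```sql
-- Column types and the primary key are assumed; the question does not state them.
create table user_message (
    id         int    not null primary key,
    from_id    int    not null,
    to_id      int    not null,
    time_stamp bigint not null  -- Unix time in seconds
);

-- The six sample rows shown in the table above.
insert into user_message (id, from_id, to_id, time_stamp) values
    (1, 1, 2, 1414700000),
    (2, 2, 1, 1414700100),
    (3, 3, 1, 1414701000),
    (4, 3, 2, 1414701001),
    (5, 3, 4, 1414701002),
    (6, 1, 3, 1414701100);
```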
Answer 0: (score: 2)
The following uses a join to accomplish this:
select um.*, count(*) as cnt
from user_message um join
     user_message um2
     on um.from_id = um2.from_id and
        -- inclusive 5-second window: time_stamp .. time_stamp + 4
        um2.time_stamp between um.time_stamp and um.time_stamp + 4
group by um.id
having count(*) >= 3;
For performance, you want an index on user_message(from_id, time_stamp). Even with the index, performance may not be great if the table is large.
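A minimal sketch of the suggested index; the index name is hypothetical and any name would do:

```sql
-- Composite index: the equality column (from_id) comes first,
-- followed by the range column (time_stamp).
create index idx_user_message_from_time
    on user_message (from_id, time_stamp);
```

With from_id leading, the self-join can look up one sender's rows and then scan only the narrow time_stamp range within them.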
EDIT:
Actually, another way to write this that may be more efficient is:
select um.*,
       (select count(*)
        from user_message um2
        where um.from_id = um2.from_id and
              -- inclusive 5-second window: time_stamp .. time_stamp + 4
              um2.time_stamp between um.time_stamp and um.time_stamp + 4
       ) as cnt
from user_message um
having cnt >= 3;
This uses a MySQL extension that allows having in a non-aggregated query.
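If relying on that extension is undesirable, the same filter can probably be expressed with a derived table and an ordinary where clause. This is a sketch rather than part of the original answer; the alias t is arbitrary, and the + 4 assumes an inclusive 5-second window as described in the question.

```sql
select t.*
from (
    select um.*,
           -- number of messages from the same sender in the window
           -- [time_stamp, time_stamp + 4]
           (select count(*)
            from user_message um2
            where um2.from_id = um.from_id and
                  um2.time_stamp between um.time_stamp and um.time_stamp + 4
           ) as cnt
    from user_message um
) as t
where t.cnt >= 3;
```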
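Answer 1: (score: 1)
For each message (u1), find all messages (u2) sent by the same user in that second or in the four seconds before it. Keep those u1 that have at least 3 matching u2. Finally, group by from_id to show one record per from_id, containing the maximum number of messages that user sent within such a window.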
select from_id, max(cnt) as max_count
from
(
select u1.id, u1.from_id, count(*) as cnt
from user_message u1
join user_message u2
on u2.from_id = u1.from_id
-- and u2.content = u1.content
and u2.time_stamp between u1.time_stamp - 4 and u1.time_stamp
group by u1.id, u1.from_id
having count(*) >= 3
) as init
group by from_id;
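On servers that support window functions (MySQL 8.0 and later, as an assumption about the environment), the self-join could be replaced by a range frame over time_stamp. This is an alternative sketch, not the answerer's method; the alias w is arbitrary.

```sql
select from_id, max(cnt) as max_count
from (
    select from_id,
           -- count this sender's messages whose time_stamp lies in
           -- [time_stamp - 4, time_stamp], i.e. the same 5-second window
           count(*) over (
               partition by from_id
               order by time_stamp
               range between 4 preceding and current row
           ) as cnt
    from user_message
) as w
where cnt >= 3
group by from_id;
```

The filter moves to an outer where clause because a window function result cannot be referenced in the having clause of the query block that computes it.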