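I'm having trouble with the performance of a query, and with scaling it for users who have a large number of friends. The goal of the query is to get the top "activities" your friends have performed in the past 30 days. Here is my query: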
SELECT a.activity_id, b.activity_name, count(a.activity_id) as total_count
FROM friends as f
INNER JOIN activities as a on (a.user_id = f.friend_id
and a.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY))
INNER JOIN activity as b on a.activity_id = b.activity_id
WHERE f.user_id = 1 and f.is_approved = 1
GROUP by a.activity_id
ORDER by total_count DESC
LIMIT 5
This query takes 25 seconds to run for every user, regardless of how large or small their friend graph is. The indexes are as follows:
Table: activities
PRIMARY: [act_id] Other: [activity_id, user_id], [user_id, created_at], [created_at]
Table: friends
PRIMARY: [user_id, friend_id] Other: [user_id, is_approved], [friend_id]
Table: activity
PRIMARY: [activity_id]
Any help is much appreciated.
Update: here is the EXPLAIN output:
id  select_type  table  type    key            key_len  ref           rows  Extra
1   SIMPLE       F      ref     friend_lookup  5        const,const   795   Using temporary; Using filesort
1   SIMPLE       A      ref     user_id        4        F.friend_id   58    Using where
1   SIMPLE       B      eq_ref  PRIMARY        4        P.activty_id  1     Using where
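Answer 0 (score: 2)
Robin is on the right track about the date field. If a function is applied in the comparison, it may have to be evaluated for every entry scanned. I use MySQL variables the way I have below: I compute the cutoff ONCE as @StartDate and use THAT value in the join clause.
The only other thing I changed was adding the STRAIGHT_JOIN keyword. In many cases I've found it helps me and others optimize queries. It stops MySQL from trying to interpret the query another way, for example by looking at the activity table first because it is the smaller table and then linking backwards from it. STRAIGHT_JOIN tells the optimizer to join the tables in the order you've listed them.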
SELECT STRAIGHT_JOIN
a.activity_id,
b.activity_name,
count(a.activity_id) as total_count
FROM
( select @StartDate := date_sub( now(), interval 30 day ) ) sqlvars,
friends as f
INNER JOIN activities as a
on a.user_id = f.friend_id
and a.created_at >= @StartDate
INNER JOIN activity as b
on a.activity_id = b.activity_id
WHERE
f.user_id = 1
and f.is_approved = 1
GROUP by
a.activity_id
ORDER by
total_count DESC
LIMIT 5
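Per feedback
Given that, and given this rolling 30-day window, I would resort to building a table nightly that holds nothing more than the user ID, the activity, and the count, created with a query like this: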
create table DailyRollupActivity
select a.user_id,
a.activity_id,
count(*) total_count
from
( select @StartDate := date_sub( now(), interval 30 day ) ) sqlvars,
activities a
where
a.created_at >= @StartDate
group by
a.user_id,
a.activity_id
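Make sure this daily aggregation table has an index on (user ID, total count), then query it directly by friend ID, ordering by total_count descending with a limit of 5. It's a small price to pay to have a nightly trigger/event/script build this ONCE. How critical is it to see the current day's activity? Would a single day of intense activity really skew what you want to present to the user?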
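Answer 1 (score: 0)
It seems like it's time to denormalize.
If you only store one degree of separation, this is fairly easy. Record a "friend activity" row for each friend at the time the activity happens. That spreads the load onto the requests of the people performing the activities.
Keep this in mind: once an activity has happened, it can't "un-happen" (though you may delete its record from the feed). This lets you take a more transactional, log-style approach for the sake of performance.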
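One way to read this suggestion is as a fan-out on write. The sketch below uses Python's sqlite3 module in place of MySQL; the friend_activity table, its columns, and the helper functions are hypothetical names chosen for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER, is_approved INTEGER,
                          PRIMARY KEY (user_id, friend_id));
    -- Hypothetical denormalized log: one row per (viewer, activity event).
    CREATE TABLE friend_activity (viewer_id INTEGER, actor_id INTEGER,
                                  activity_id INTEGER, created_at TEXT);
    CREATE INDEX idx_friend_activity ON friend_activity (viewer_id, created_at);
""")
conn.executemany("INSERT INTO friends VALUES (?, ?, ?)",
                 [(1, 2, 1), (1, 3, 1), (4, 2, 1), (5, 2, 0)])

def record_activity(conn, actor_id, activity_id, created_at):
    """Write one log row for every user who lists actor_id as an approved friend."""
    with conn:
        conn.execute(
            """INSERT INTO friend_activity (viewer_id, actor_id, activity_id, created_at)
               SELECT user_id, ?, ?, ?
               FROM friends
               WHERE friend_id = ? AND is_approved = 1""",
            (actor_id, activity_id, created_at, actor_id),
        )

def top_activities(conn, viewer_id, cutoff, limit=5):
    """Read side: a single-table lookup, no join against friends needed."""
    return conn.execute(
        """SELECT activity_id, COUNT(*) AS n
           FROM friend_activity
           WHERE viewer_id = ? AND created_at >= ?
           GROUP BY activity_id
           ORDER BY n DESC, activity_id
           LIMIT ?""",
        (viewer_id, cutoff, limit),
    ).fetchall()

record_activity(conn, 2, 10, "2024-01-30 09:00:00")
record_activity(conn, 3, 10, "2024-01-29 09:00:00")
record_activity(conn, 3, 20, "2024-01-28 09:00:00")
record_activity(conn, 2, 20, "2023-11-01 09:00:00")  # older than the window

feed = top_activities(conn, 1, "2024-01-01 00:00:00")
print(feed)
```

The cost moves to the write path: each activity inserts one row per approved friend, so users with very large friend lists make writes more expensive while reads stay cheap.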
Answer 2 (score: 0)
Try changing the query to this as a start:
$str_date = date('Y-m-d H:i:s', strtotime('today -30 Days'));
SELECT a.activity_id, b.activity_name, count(a.activity_id) as total_count
FROM ( SELECT friend_id
FROM friends
WHERE user_id = 1 and is_approved = 1) as f
INNER JOIN ( SELECT user_id, activity_id
FROM activities
WHERE created_at >= '{$str_date}') as a
on a.user_id = f.friend_id
INNER JOIN activity as b on a.activity_id = b.activity_id
GROUP by a.activity_id
ORDER by total_count DESC
LIMIT 5
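Basically, it filters on user_id and is_approved before joining to the other tables. It's better to generate the date in PHP (or whatever language you use) and then use that value in MySQL than to have MySQL calculate the exact same thing (possibly thousands of times).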
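The same idea can be sketched with a bound parameter instead of string interpolation, which also avoids quoting problems with the date literal. This is a minimal illustration in Python with sqlite3 standing in for MySQL; the cutoff_string helper and the sample rows are assumptions, and b.activity_name is added to GROUP BY only to keep the query portable.

```python
import sqlite3
from datetime import datetime, timedelta

def cutoff_string(now, days=30):
    """Midnight `days` days before `now`, similar to strtotime('today -30 Days')."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (midnight - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")

# Placeholders (?) are bound by the driver: first the user id, then the cutoff.
QUERY = """
SELECT a.activity_id, b.activity_name, COUNT(a.activity_id) AS total_count
FROM (SELECT friend_id FROM friends
      WHERE user_id = ? AND is_approved = 1) AS f
JOIN (SELECT user_id, activity_id FROM activities
      WHERE created_at >= ?) AS a
  ON a.user_id = f.friend_id
JOIN activity AS b ON a.activity_id = b.activity_id
GROUP BY a.activity_id, b.activity_name
ORDER BY total_count DESC
LIMIT 5
"""

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER, is_approved INTEGER);
    CREATE TABLE activity (activity_id INTEGER PRIMARY KEY, activity_name TEXT);
    CREATE TABLE activities (act_id INTEGER PRIMARY KEY, user_id INTEGER,
                             activity_id INTEGER, created_at TEXT);
    INSERT INTO friends VALUES (1, 2, 1), (1, 3, 0);
    INSERT INTO activity VALUES (10, 'run'), (20, 'swim');
    INSERT INTO activities (user_id, activity_id, created_at) VALUES
        (2, 10, '2024-01-15 08:00:00'),
        (2, 10, '2024-01-20 08:00:00'),
        (2, 20, '2024-01-02 08:00:00'),
        (2, 20, '2023-12-31 23:59:59'),  -- just before the cutoff
        (3, 20, '2024-01-10 08:00:00');  -- friend 3 is not approved
""")

cutoff = cutoff_string(datetime(2024, 1, 31, 15, 30))  # computed once, in the app
rows = conn.execute(QUERY, (1, cutoff)).fetchall()
print(cutoff, rows)
```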