Update: table and index definitions
desc activities;
+----------------+--------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+----------------+--------------+------+-----+---------+
| id | int(11) | NO | PRI | NULL |
| trackable_id | int(11) | YES | MUL | NULL |
| trackable_type | varchar(255) | YES | | NULL |
| owner_id | int(11) | YES | MUL | NULL |
| owner_type | varchar(255) | YES | | NULL |
| key | varchar(255) | YES | | NULL |
| parameters | text | YES | | NULL |
| recipient_id | int(11) | YES | MUL | NULL |
| recipient_type | varchar(255) | YES | | NULL |
| created_at | datetime | NO | | NULL |
| updated_at | datetime | NO | | NULL |
+----------------+--------------+------+-----+---------+
show indexes from activities;
+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+
| activities | 0 | PRIMARY | 1 | id | A | 7263 | NULL | NULL | | BTREE |
| activities | 1 | index_activities_on_trackable_id_and_trackable_type | 1 | trackable_id | A | 7263 | NULL | NULL | YES | BTREE |
| activities | 1 | index_activities_on_trackable_id_and_trackable_type | 2 | trackable_type | A | 7263 | NULL | NULL | YES | BTREE |
| activities | 1 | index_activities_on_owner_id_and_owner_type | 1 | owner_id | A | 7263 | NULL | NULL | YES | BTREE |
| activities | 1 | index_activities_on_owner_id_and_owner_type | 2 | owner_type | A | 7263 | NULL | NULL | YES | BTREE |
| activities | 1 | index_activities_on_recipient_id_and_recipient_type | 1 | recipient_id | A | 2421 | NULL | NULL | YES | BTREE |
| activities | 1 | index_activities_on_recipient_id_and_recipient_type | 2 | recipient_type | A | 3631 | NULL | NULL | YES | BTREE |
+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+
select count(id) from activities;
+-----------+
| count(id) |
+-----------+
| 7117 |
+-----------+
This is my current query:
SELECT act.*, group_concat(act.owner_id order by act.created_at desc) as owner_ids
FROM (select * from activities order by created_at desc) as act
INNER JOIN users on users.id = act.owner_id
WHERE (users.city_id = 1 and act.owner_type = 'User')
GROUP BY trackable_type, recipient_id, recipient_type
order by act.created_at desc
limit 20 offset 0;
I have run EXPLAIN on it and experimented a lot with this query, including the indexes. Is there any way to optimize it?
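Answer 0 (score: 1)
I don't think you need `offset 0` at all, and it looks like you can do without the subquery as well. If you aren't using any columns from the `users` table, you can say so explicitly with `in` (or `exists`):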
select
a.trackable_type, a.recipient_id, a.recipient_type,
max(a.created_at) as max_created_at,
group_concat(a.owner_id order by a.created_at desc) as owner_ids
from activities as a
where
a.owner_type = 'User' and
a.owner_id in (select u.id from users as u where u.city_id = 1)
group by a.trackable_type, a.recipient_id, a.recipient_type
order by max_created_at desc
limit 20;
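For comparison, the `exists` form mentioned above could look like the following. This is only a sketch that assumes the same tables and columns as in the question; whether it performs better than `in` depends on the MySQL version and its optimizer, so compare both with EXPLAIN.

```sql
-- Same grouping and ordering as the IN version; only the users filter changes.
select
    a.trackable_type, a.recipient_id, a.recipient_type,
    max(a.created_at) as max_created_at,
    group_concat(a.owner_id order by a.created_at desc) as owner_ids
from activities as a
where
    a.owner_type = 'User' and
    -- correlated subquery: keep the activity only if its owner lives in city 1
    exists (select 1 from users as u where u.id = a.owner_id and u.city_id = 1)
group by a.trackable_type, a.recipient_id, a.recipient_type
order by max_created_at desc
limit 20;
```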
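It also seems to me that the query would benefit from an index on `activities (owner_type, owner_id)` (your existing index on `owner_id, owner_type` may not help this query) and an index on `users (city_id)`.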
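As a sketch, the two indexes could be created like this. The index names are made up for illustration; only the column lists come from the suggestion above.

```sql
-- Hypothetical index names; owner_type comes first, then owner_id.
CREATE INDEX idx_activities_owner_type_owner_id
    ON activities (owner_type, owner_id);

-- Supports the "city_id = 1" lookup in the subquery on users.
CREATE INDEX idx_users_city_id
    ON users (city_id);
```

After creating them, rerun EXPLAIN on the query to check whether the optimizer actually picks them.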
Answer 1 (score: 1)
MySQL can be odd at times, so I would give this a try. I am assuming `id` is the primary key of the users table.
SELECT
act.trackable_type, act.recipient_id, act.recipient_type,
max(act.created_at) as max_created_at,
group_concat(act.owner_id order by act.created_at DESC) as owner_ids
FROM activities act
WHERE act.owner_id in (select id from users where city_id = 1)
AND act.owner_type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type
ORDER BY max_created_at DESC
LIMIT 20
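Answer 2 (score: 0)
First, I would start by making the query more readable :-)
You don't need the derived table with ORDER BY, and you should use a column list instead of ACT.*.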
SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created,
GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS
FROM ACTIVITIES AS ACT
JOIN USERS ON USERS.ID = ACT.OWNER_ID
WHERE (USERS.CITY_ID = 1 AND ACT.OWNER_TYPE = 'User')
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created DESC
LIMIT 20 OFFSET 0;
Moving the WHERE condition on USERS into a derived table might help:
SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created,
GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS
FROM ACTIVITIES AS ACT
JOIN (SELECT ID FROM USERS WHERE CITY_ID = 1) USERS
ON USERS.ID = ACT.OWNER_ID
WHERE ACT.OWNER_TYPE = 'User'
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created DESC
LIMIT 20 OFFSET 0;
Answer 3 (score: 0)
Could you tell us the size of your users table, for example the result of the following query:
select count(id) from users WHERE users.city_id = 1;
If that is a small number, I would suggest using
SELECT act.trackable_type, act.recipient_id, act.recipient_type, max(act.created_at) as max_created_at,
group_concat(act.owner_id order by act.created_at DESC) as owner_ids
FROM activities act
WHERE act.owner_id in (select id from users where city_id = 1)
AND act.owner_type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type
ORDER BY max_created_at DESC
LIMIT 20
Otherwise, a join would be better
SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created_at,
GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS
FROM ACTIVITIES ACT
JOIN USERS ON (USERS.CITY_ID = 1 AND USERS.ID = ACT.OWNER_ID)
WHERE ACT.OWNER_TYPE = 'User'
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created_at DESC
LIMIT 20;
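Answer 4 (score: 0)
First of all, this is quite a tricky query; explaining what it does and how to improve it would make an interesting interview exercise for a developer position =).
MySQL uses nested loop joins, which means that when there is a join, MySQL starts from one table and, for every matching row in it, loops through the related rows in the second table of the join.
Without an index, MySQL has to fetch the full row for every row it examines in order to read the columns used in the condition, and it does the same for every row of the other table. Going to disk is expensive and slow; it is much better to get the information from memory, which is more likely when the data can be read from an index.
The MySQL optimizer chooses the join order, but you can steer it by creating suitable indexes (and sometimes with hints).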
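When you do something like `(select * from activities order by created_at desc)`, you load the whole table into a temporary, unindexed table, which is never a good thing. Worse, MySQL then has to start the join from that table; otherwise it would need to check the condition against every row of it inside the nested loop.
What does it mean to use an index for sorting or grouping (which also requires sorting)? It means you read the data in index order. But because MySQL uses nested loop joins, you can only take advantage of an index for sorting when the column you sort by belongs to the first table in the join.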
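The `created_at` column is not included in the `group by` clause, which means you don't care which row's value is picked from each group (and the values may well be the same within a group).
In complex queries, especially ones with pagination, it is usually better to select only the ids of the rows you need and then join those ids back to the table for the remaining columns (the less data you sort, the faster it is).
To sum up, we need to start the join from the `activities` table using an index, join to `users` in the nested loop and fetch the ids, and then join back to the activities table for the rest of the values.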
So you need a rather long composite index on activities `(owner_type, trackable_type, recipient_id, recipient_type, owner_id, created_at)` and a perhaps extravagant, but necessary, index on users `(id, city_id)`.
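A sketch of those two indexes follows. The index names are hypothetical, and depending on the character set and InnoDB row format, three full `varchar(255)` columns in one key may run into index key length limits, so treat this as a starting point rather than a tested definition.

```sql
-- Hypothetical name; column order follows the list given above.
CREATE INDEX idx_activities_owner_grouping
    ON activities (owner_type, trackable_type, recipient_id,
                   recipient_type, owner_id, created_at);

-- Lets the join condition and the city filter be resolved from the index alone.
CREATE INDEX idx_users_id_city_id
    ON users (id, city_id);
```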
Now, rewrite the query as:
SELECT *
FROM
(SELECT a.id, group_concat(a.owner_id order by a.created_at desc) as owner_ids
FROM activities a
JOIN users u ON a.owner_id = u.id AND u.city_id = 1
WHERE a.owner_type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type
ORDER BY a.created_at desc
limit 20 offset 0) as owners
JOIN activities a USING (id);
You should look at EXPLAIN, and possibly use STRAIGHT_JOIN instead of JOIN in the subquery to ensure the right join order.
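For example, the inner query could be checked like this. It is only a sketch: it reuses the subquery from the rewrite above, and like that rewrite it assumes the `ONLY_FULL_GROUP_BY` SQL mode is disabled, since `a.id` and `a.created_at` are neither grouped nor aggregated.

```sql
-- STRAIGHT_JOIN forces activities to be read before users.
EXPLAIN
SELECT a.id, group_concat(a.owner_id order by a.created_at desc) as owner_ids
FROM activities a
STRAIGHT_JOIN users u ON a.owner_id = u.id AND u.city_id = 1
WHERE a.owner_type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type
ORDER BY a.created_at desc
limit 20 offset 0;
```

In the output, the first row should refer to `activities`, and the `key` column shows which index, if any, was chosen for each table.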
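This solution may look resource-hungry, but it should be a good baseline for your further experiments. First, consider introducing other columns to group by: putting `varchar(255)` columns in an index is inefficient, especially two of them, so use reasonably short prefixes, either stored explicitly as separate columns or through a prefix index. You could also invest in a dedicated grouping column whose value is a function of `(trackable_type, recipient_id, recipient_type)`. The `owner_type = 'User'` comparison is not great either; comparing integers would be better.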
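As an illustration of the prefix idea, MySQL lets you index only the leading characters of a string column. The index name and the prefix length of 20 are assumptions made for this sketch; pick a length that still distinguishes your actual type names.

```sql
-- Hypothetical name and prefix lengths; only the first 20 characters
-- of each varchar column are stored in the index.
CREATE INDEX idx_activities_owner_grouping_prefix
    ON activities (owner_type(20), trackable_type(20), recipient_id,
                   recipient_type(20), owner_id, created_at);
```

Keep in mind that an index on a column prefix may not let MySQL fully resolve sorting or grouping on that column from the index, so check the result with EXPLAIN.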