Optimizing multi-level MySQL subqueries (folksonomy and taxonomy)

Date: 2010-03-01 21:11:17

Tags: mysql subquery join

I have been reading Nitin Borwankar's great tagging article, and it got me thinking about ways to implement different levels of search using two tables.

tags {
  id,
  tag
}

post_tags {
  id,
  user_id,
  post_id,
  tag_id
}

I started with the simple example T(U(i)), which means all tags of all users who have tagged item i. I was able to do it with the following SQL:

/* get all tags from the users found */
SELECT t.*, vt.* FROM verse_tags as vt
LEFT JOIN tags as t ON t.id = vt.tag_id
WHERE user_id in 
(
    /* Get all user_ids that have tagged this item */
    SELECT user_id FROM verse_tags WHERE verse_id = 26046 GROUP BY user_id
)
GROUP BY t.id

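Then I moved on to a slightly harder query, one level deeper. T(U(T(u))) is the set of tags used by users who share tags with user u (user 3 in the query below).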

/* Then get the tags of the user with tags like the user 3 */
SELECT t.id FROM post_tags as pt
LEFT JOIN tags as t ON t.id = pt.tag_id
WHERE user_id in 
(
    /* Then get users with these tags */
    SELECT pt.user_id FROM post_tags as pt
    LEFT JOIN tags as t on t.id = pt.tag_id
    WHERE tag_id in
    (
        /* get tags of user */
        SELECT t.id FROM post_tags as pt
        LEFT JOIN tags as t ON t.id = pt.tag_id
        WHERE pt.user_id = 3
        GROUP BY t.id
    )
    GROUP BY user_id
)
GROUP BY t.id

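However, since I normally use JOINs in my queries, I'm not sure how to optimize something like this, or which design flaws to avoid when using subqueries. I have even read that JOINs should be used instead, but I can't see how to do that for the queries above.

How can I optimize these queries?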

Update

1) Replaced GROUP BY with SELECT DISTINCT. (0.74 seconds)

2) Replaced WHERE IN with WHERE EXISTS. (0.40 seconds)

3) Added indexes (oops!). (0.09 seconds)

4) Switched back to WHERE IN. (0.08 seconds)

EXPLAIN SELECT DISTINCT tag_id FROM post_tags WHERE user_id in
(
    SELECT DISTINCT user_id FROM post_tags WHERE tag_id in
    (
        SELECT DISTINCT tag_id FROM post_tags WHERE user_id = 3
    )
)

Running EXPLAIN gives me these results:

id  select_type         table      type            possible_keys   key      key_len  ref   rows  Extra
1   PRIMARY             post_tags  index           NULL            tag_id   4        NULL  14    Using where
2   DEPENDENT SUBQUERY  post_tags  index_subquery  user_id         user_id  4        func  1     Using where
3   DEPENDENT SUBQUERY  post_tags  index_subquery  user_id,tag_id  tag_id   4        func  1     Using where
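
The index definitions from step 3 are not shown. The key names in the EXPLAIN output (user_id and tag_id on post_tags) suggest single-column indexes along these lines. This is only a sketch, and the statement is an assumption rather than the exact DDL that was run:

```sql
-- Assumed index definitions; the names are taken from the `key` column
-- of the EXPLAIN output, the statement itself is a guess.
ALTER TABLE post_tags
    ADD INDEX user_id (user_id),
    ADD INDEX tag_id (tag_id);
```

A composite index on (user_id, tag_id) might additionally let the innermost subquery be answered from the index alone. Whether that helps here would have to be checked with EXPLAIN.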

1 answer:

Answer 0: (score: 1)

As far as I can tell, this is the solution:

SELECT DISTINCT `t`.`id` FROM `post_tags` AS `pt`
    LEFT JOIN `tags` AS `t` ON `t`.`id` = `pt`.`tag_id`
    WHERE `pt`.`user_id` IN (
        /* users who share at least one tag with user 3 */
        SELECT DISTINCT `pt2`.`user_id` FROM `post_tags` AS `pt2`
        WHERE `pt2`.`tag_id` IN (
            /* tags of user 3; a separate alias keeps this subquery uncorrelated */
            SELECT DISTINCT `pt3`.`tag_id` FROM `post_tags` AS `pt3`
            WHERE `pt3`.`user_id` = 3
        )
    )
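
Since the question asks how this could be written with JOINs, here is one possible sketch: post_tags joined to itself three times. The aliases mine, shared, and theirs are made up for illustration. Whether this beats the subquery form depends on the MySQL version and the available indexes, so it is worth comparing both with EXPLAIN.

```sql
-- T(U(T(u))) for u = 3, written as self-joins instead of nested IN subqueries.
SELECT DISTINCT theirs.tag_id
FROM post_tags AS mine                                        -- rows tagged by user 3
JOIN post_tags AS shared ON shared.tag_id  = mine.tag_id      -- anyone using the same tags
JOIN post_tags AS theirs ON theirs.user_id = shared.user_id   -- all tags of those users
WHERE mine.user_id = 3;
```

DISTINCT is needed because the joins produce one row per matching combination, so the same tag_id can appear many times.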