How can I optimize this "IN" and "WHERE NOT IN" query?

Date: 2011-07-23 16:45:33

Tags: mysql sql query-optimization

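The following queries fetch the cities and the places a user has visited, and then return the places in those cities that the user has not been to yet.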

// I get the city_id and object_id. Each vote has the place_id and its city_id.
  SELECT DISTINCT city_id as city_id, object_id as object_id
    FROM vote
   WHERE object_model = 'Place'
     AND user_id = 20
ORDER BY created_at desc

// I build an array with city_ids and another with object_ids
$city_ids = array(...);
$place_ids = array(...);

Then I get the places in those cities that the user has not been to yet (takes about 1 second):

  SELECT id, title
    FROM place
   WHERE city_id IN ($city_ids)
     AND id NOT IN ($place_ids)
ORDER BY points desc
   LIMIT 0,20

EXPLAIN SQL

select_type  table  type   possible_keys          key            key_len  ref   rows   Extra
--------------------------------------------------------------------------------------------------------------
SIMPLE       p      range  PRIMARY,city_id_index  city_id_index  9        NULL  33583  Using where; Using filesort

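Another optimization attempt was to do everything in a single query using LEFT JOIN / IS NULL and a subquery, but that takes much longer (more than 30 seconds):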

   SELECT p.id, p.title
     FROM place AS p
LEFT JOIN vote v ON v.object_id = p.id
                AND v.object_model = 'Place'
                AND v.user_id = 20
    WHERE p.city_id IN (SELECT city_id 
                          FROM vote 
                         WHERE user_id = 20 
                           AND city_id != 0)
      AND v.id is null
 ORDER BY p.points desc
    LIMIT 0, 20

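How should the query (or queries) be written, given that each user can have arrays of 500 cities and 1000 places? What is the best alternative to WHERE IN and WHERE NOT IN when there are many ids?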

4 answers:

Answer 0: (score: 1)

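I'm not a MySQL expert, but your query doesn't look too complicated to me. Rather than focusing on the query, I would look at the indexes. Perhaps the following indexes would help: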

CREATE INDEX vote_index1 ON vote (user_id, city_id);
CREATE INDEX vote_index2 ON vote (object_id, object_model, user_id);
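
One way to try out indexes like these before touching the real database is to build them on a scratch table and look at the query plan. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own, the column types are guesses (only the column names come from the question), and the helper names build_vote_indexes and explain_plan are hypothetical. SQLite's planner is not MySQL's, so on the real system you would run EXPLAIN against the actual tables instead.

```python
import sqlite3


def build_vote_indexes(conn):
    """Create a scratch `vote` table plus the two suggested indexes.

    Column types are assumptions; only the column names come from the question.
    Returns the names of the indexes that now exist on `vote`.
    """
    conn.executescript(
        """
        CREATE TABLE vote (
            id           INTEGER PRIMARY KEY,
            user_id      INTEGER,
            city_id      INTEGER,
            object_id    INTEGER,
            object_model TEXT,
            created_at   TEXT
        );
        CREATE INDEX vote_index1 ON vote (user_id, city_id);
        CREATE INDEX vote_index2 ON vote (object_id, object_model, user_id);
        """
    )
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'vote' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def explain_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(build_vote_indexes(conn))
    # Print the plan for the "cities this user voted in" lookup.
    for line in explain_plan(
        conn, "SELECT DISTINCT city_id FROM vote WHERE user_id = ?", (20,)
    ):
        print(line)
```

If the plan for the per-user lookup still shows a full table scan on the real data, the index column order is probably the first thing to revisit.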

Answer 1: (score: 0)

If you are querying on 2 attributes, you need to join 2 tables rather than just 1. Also, I wonder what object_id is?

SELECT p.id, p.title
 FROM place AS p
LEFT JOIN vote v ON v.object_id = p.id
            AND v.object_model = 'Place'
            AND v.user_id = 20
LEFT JOIN place AS p1 ON v.city_id = p1.city_id
WHERE v.id IS NULL
ORDER BY p.points DESC
 LIMIT 0, 20

Answer 2: (score: 0)

Don't use the IN operator; try to solve it by joining all the tables you need. IN can be replaced by a normal join, I believe, and NOT IN can be done like this:

select *
from a left join b using (field)
where b.field is NULL

This way you get all the records from table a that have no corresponding record in table b.
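
The anti-join pattern above can be tried on toy data. This is a minimal sketch, assuming two single-column tables a and b like the ones in the snippet; it runs against an in-memory SQLite database through Python's sqlite3 module purely so the example is self-contained (MySQL is what the question actually uses), and the function name rows_without_match is hypothetical. The join is written with ON rather than USING, which is equivalent here.

```python
import sqlite3


def rows_without_match(a_values, b_values):
    """Return the values of a.field that have no matching row in b."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (field INTEGER)")
    conn.execute("CREATE TABLE b (field INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(v,) for v in a_values])
    conn.executemany("INSERT INTO b VALUES (?)", [(v,) for v in b_values])
    # LEFT JOIN keeps every row of a; rows with no partner in b get NULLs
    # for b's columns, so filtering on b.field IS NULL keeps only those.
    rows = conn.execute(
        """
        SELECT a.field
          FROM a
          LEFT JOIN b ON b.field = a.field
         WHERE b.field IS NULL
         ORDER BY a.field
        """
    ).fetchall()
    conn.close()
    return [value for (value,) in rows]


if __name__ == "__main__":
    print(rows_without_match([1, 2, 3, 4], [2, 4, 4]))
```

Duplicate values in b do not matter for the result: a row of a is dropped as soon as it has at least one partner.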

Answer 3: (score: 0)

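When using MySQL, you have to remember that it is quite dumb when it comes to IN() subqueries (or anything else, for that matter). So you should rewrite your second attempt as: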

SELECT p.id, p.title
FROM
 (SELECT DISTINCT city_id FROM vote WHERE user_id = 20) v
JOIN place p USING (city_id)
LEFT JOIN vote v2 ON (v2.object_id = p.id AND v2.user_id = 20)
WHERE v2.id IS NULL
ORDER BY p.points DESC
LIMIT 0, 20

Note that "city_id != 0" is useless: assuming there is a foreign key from vote to city, vote.city_id can never be 0. It could be NULL, though.

Also, the database design may be flawed: cities should have their own table, "table name + id" columns are a bad idea, and so on.
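
To see what the rewritten query in this answer returns, here is a small sketch on made-up data. Everything outside the query shape is an assumption: the sample rows, the column types, the hypothetical helper names make_demo_db and unvisited_places, and the use of SQLite through Python's sqlite3 module in place of MySQL so the example runs on its own. The object_model = 'Place' condition on the second join is carried over from the question's own attempt, not from this answer. It says nothing about MySQL performance; it only checks the result set.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with made-up places and votes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE place (id INTEGER PRIMARY KEY, title TEXT,
                            city_id INTEGER, points INTEGER);
        CREATE TABLE vote  (id INTEGER PRIMARY KEY, user_id INTEGER,
                            city_id INTEGER, object_id INTEGER,
                            object_model TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO place VALUES (?, ?, ?, ?)",
        [(1, "Museum", 10, 50), (2, "Park", 10, 90), (3, "Cafe", 10, 70),
         (4, "Tower", 20, 80), (5, "Beach", 30, 99)],
    )
    conn.executemany(
        "INSERT INTO vote VALUES (?, ?, ?, ?, ?)",
        [(1, 20, 10, 1, "Place"), (2, 20, 20, 4, "Place"),
         (3, 21, 10, 2, "Place"), (4, 21, 30, 5, "Place"),
         # Same object_id as place 3 but a different model: must not hide it.
         (5, 20, 10, 3, "Event")],
    )
    return conn


def unvisited_places(conn, user_id, limit=20):
    """Places in cities the user voted in, minus places the user voted on."""
    return conn.execute(
        """
        SELECT p.id, p.title
          FROM (SELECT DISTINCT city_id FROM vote WHERE user_id = :uid) AS c
          JOIN place AS p ON p.city_id = c.city_id
          LEFT JOIN vote AS v2
                 ON v2.object_id = p.id
                AND v2.object_model = 'Place'
                AND v2.user_id = :uid
         WHERE v2.id IS NULL
         ORDER BY p.points DESC
         LIMIT :lim
        """,
        {"uid": user_id, "lim": limit},
    ).fetchall()


if __name__ == "__main__":
    print(unvisited_places(make_demo_db(), 20))
```

With this data, user 20 has voted in cities 10 and 20 and on places 1 and 4, so only the remaining places of city 10 come back, highest points first.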