How to optimize a join on a 27M-row table

Date: 2014-06-02 13:21:06

Tags: mysql sql

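I have a table named follow with two columns, user1_id and user2_id, where user1 follows user2. If user1 follows user2 and user2 follows user1, they are friends. I need to find all friends and store them in a table, and I want this to finish in a reasonable time because the table has 27M rows. I tried the following two queries: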

create temporary table friends as (
select f1.* 
from follow f1 inner join follow f2
on f1.user1_id = f2.user2_id and f1.user2_id = f2.user1_id);

create temporary table friends as (
select user1_id, user2_id
from follow
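where (user2_id, user1_id) in (select * from follow));

But both took too much time. Is there anything that can improve the performance of this operation? Can you suggest a better query for this case?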

1 answer:

Answer 0 (score: 2)

I would suggest two approaches. The first uses group by:

create table friends as
      select least(user1_id, user2_id) as user1_id, greatest(user1_id, user2_id) as user2_id
      from follow
      group by least(user1_id, user2_id), greatest(user1_id, user2_id)
      having count(*) > 1;

This incurs the cost of a group by over the whole table, but it might be a good option.
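
To see why the group by works, here is a small sketch on made-up data. The table name follow_demo and its rows are hypothetical and only serve as an illustration. least() and greatest() put each pair into a canonical order, so (1, 2) and (2, 1) land in the same group, and a group with more than one row means the follow goes both ways. This assumes the table holds no duplicate rows; if it might, replacing count(*) with count(distinct user1_id) should guard against false matches.

```sql
-- Hypothetical sample data, for illustration only.
create temporary table follow_demo (user1_id int, user2_id int);

insert into follow_demo values
    (1, 2), (2, 1),   -- mutual: 1 and 2 are friends
    (1, 3),           -- one-way only
    (4, 3), (3, 4),   -- mutual: 3 and 4 are friends
    (5, 1);           -- one-way only

-- Normalize each pair to (smaller id, larger id) and keep the pairs
-- that occur in both directions.
select least(user1_id, user2_id)    as user1_id,
       greatest(user1_id, user2_id) as user2_id
from follow_demo
group by least(user1_id, user2_id), greatest(user1_id, user2_id)
having count(*) > 1;
```

On this sample the query returns the pairs (1, 2) and (3, 4), each listed once.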

The other is to create an index on follow(user1_id, user2_id) and run:

create table friends as
    select user1_id, user2_id
    from follow f
    where user1_id < user2_id and
          exists (select 1
                  from follow f2
                  where f2.user1_id = f.user2_id and f2.user2_id = f.user1_id
                 );

This incurs an index lookup for many of the rows in the table, but it may well be the best option.
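
For completeness, a minimal sketch of the index this approach relies on. The index name idx_follow_pair is an assumption made for this example. If (user1_id, user2_id) is already the table's primary key or covered by an existing unique key, a separate index should not be needed.

```sql
-- idx_follow_pair is a hypothetical name; any unused name works.
create index idx_follow_pair on follow (user1_id, user2_id);

-- Check the plan before running the full create table ... as select.
explain
select user1_id, user2_id
from follow f
where user1_id < user2_id and
      exists (select 1
              from follow f2
              where f2.user1_id = f.user2_id and f2.user2_id = f.user1_id
             );
```

The condition user1_id < user2_id keeps each pair only once, and with the index in place each exists check can be answered by an index lookup on f2 rather than a scan of the table.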