Can someone help me optimize this MySQL statement?

Time: 2013-04-22 20:37:13

Tags: mysql subquery correlated-subquery in-subquery

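I have a table that I use to build groups in my database. It contains a list of group names and IDs. I have another table that holds users, and a third table that records the relationships (userid, groupid).

Here is the situation: I need to create a list of user IDs that belong to a specific subset of groups. For example, I might want all users who are in groups 1, 3, and 8. That much is straightforward. It gets more complex, though: I might need a list of all users who are in groups 1, 3, and 8, or in groups 1, 2, and 8. Then I might need to exclude users who match those criteria but are also in group 27.

So I have a script that builds the query dynamically using subqueries, and it works up to a point. I have two problems. I don't think I am handling the NOT part properly, because as I add more criteria, the query eventually just hangs. (I think this is a result of my using sub-selects instead of joins, but I couldn't figure out how to build it with joins.)

Here is an example of the query with 4 ANDed OR groups and 2 NOT clauses.

If there is a better way to optimize this, please let me know. (I can handle building it dynamically in PHP.)

If I need to clarify anything or provide more details, just let me know.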


select * from users_table where username IN
(
    select user_id from
    (
        select distinct user_id from group_user_map where user_id in 
        (
            select user_id from 
            (
                select * from 
                (
                    select count(*) as counter, user_id from  
                    (
                        (
                            select distinct(user_id) from group_user_map where group_id in (2601,119)
                        ) 
                        union all
                        (
                            select distinct(user_id) from group_user_map where group_id in (58,226)
                        ) 
                        union all
                        (
                            select distinct(user_id) from group_user_map where group_id in (1299,525)
                        ) 
                        union all
                        (
                            select distinct(user_id) from group_user_map where group_id in (2524,128)
                        ) 
                    ) 
                    thegroups group by user_id
                ) 
                getall where counter = 4
            ) 
            getuserids
        ) 
        and user_id not in 
        (
            select user_id from group_user_map where group_id in (2572)
        ) 
    ) 
    biggergroup 
);

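Note that the first part of the query compares an ID against username. That is because I store the username as the ID from another table. (The whole thing is a link between two completely different databases.)

(Also, if it looks like I have extra subqueries, that was an attempt to force MySQL to evaluate the inner queries first.)

Thanks.

Aaron.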

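4 Answers:

Answer 0: (score: 1)

It would be easier to understand your problem if you posted the table structures and some sample data. But here are some suggestions, based on your current query, that you can work from.

These queries reduce the number of subqueries you are using. One of the obvious changes is the way the list of user_id values is retrieved for each group: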

select user_id
from group_user_map 
where group_id in (2601,119)
union all
select user_id 
from group_user_map 
where group_id in (58,226)
union all
select user_id 
from group_user_map 
where group_id in (1299,525)
union all
select user_id 
from group_user_map 
where group_id in (2524,128);

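This uses UNION ALL, which lists every user_id even when it is a duplicate. Once you have this list of user_id values, you can apply count(distinct user_id) and use a HAVING clause to find the ones with 4 occurrences.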
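As a minimal sketch of that counting idea, the script below uses Python's standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the sample rows are invented). Because grouping by user_id makes COUNT(DISTINCT user_id) equal to 1 within every group, this sketch instead applies DISTINCT inside each UNION ALL branch and counts rows, so a user appears at most once per branch and a count of 4 means all four branches matched.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_user_map (user_id INTEGER, group_id INTEGER)")
conn.executemany(
    "INSERT INTO group_user_map VALUES (?, ?)",
    [
        (1, 2601), (1, 119), (1, 58), (1, 1299), (1, 2524),  # user 1: all four sets
        (2, 2601), (2, 226), (2, 525),                        # user 2: misses the fourth set
        (3, 119), (3, 58), (3, 525), (3, 128), (3, 2572),     # user 3: all four, but excluded
    ],
)

sql = """
SELECT user_id
FROM (
    SELECT DISTINCT user_id FROM group_user_map WHERE group_id IN (2601, 119)
    UNION ALL
    SELECT DISTINCT user_id FROM group_user_map WHERE group_id IN (58, 226)
    UNION ALL
    SELECT DISTINCT user_id FROM group_user_map WHERE group_id IN (1299, 525)
    UNION ALL
    SELECT DISTINCT user_id FROM group_user_map WHERE group_id IN (2524, 128)
) AS thegroups
WHERE user_id NOT IN (SELECT user_id FROM group_user_map WHERE group_id IN (2572))
GROUP BY user_id
HAVING COUNT(*) = 4
ORDER BY user_id
"""

# One row per user that matched every branch and is not in the excluded group.
matching_user_ids = [row[0] for row in conn.execute(sql)]
print(matching_user_ids)
```

With this sample data only user 1 is returned: user 2 matches three branches, and user 3 is filtered out by the NOT IN clause.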

First, you can consolidate your current query into the following version, which keeps it in the WHERE clause:

select * 
from users_table 
where username IN (select user_id
                  from
                  (
                    select user_id
                    from group_user_map 
                    where group_id in (2601,119)
                    union all
                    select user_id 
                    from group_user_map 
                    where group_id in (58,226)
                    union all
                    select user_id 
                    from group_user_map 
                    where group_id in (1299,525)
                    union all
                    select user_id 
                    from group_user_map 
                    where group_id in (2524,128)
                  ) thegroups
                  where user_id not in (select user_id 
                                        from group_user_map 
                                        where group_id in (2572)) 
                  group by user_id
                  having count(distinct user_id) = 4);

Or you can use that query in a subquery that you join on, instead of in the WHERE clause:

select ut.* 
from users_table ut
inner join
(
  select user_id
  from
  (
    select user_id
    from group_user_map 
    where group_id in (2601,119)
    union all
    select user_id 
    from group_user_map 
    where group_id in (58,226)
    union all
    select user_id 
    from group_user_map 
    where group_id in (1299,525)
    union all
    select user_id 
    from group_user_map 
    where group_id in (2524,128)
  ) thegroups
  where user_id not in (select user_id 
                        from group_user_map 
                        where group_id in (2572)) 
  group by user_id
  having count(distinct user_id) = 4
) biggergroup
  on ut.username = biggergroup.user_id;

Answer 1: (score: 1)

Avoiding sub-selects in the IN clause:

SELECT * 
FROM users_table
INNER JOIN 
(
    SELECT Sub1.user_id 
    FROM (
            SELECT COUNT(*) AS counter, user_id   
            FROM (
                SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2601,119)
                UNION ALL
                SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (58,226)
                UNION ALL
                SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (1299,525)
                UNION ALL
                SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2524,128)
            ) thegroups
            GROUP BY user_id
            HAVING counter = 4
    ) Sub1
    LEFT OUTER JOIN (SELECT user_id FROM group_user_map WHERE group_id IN (2572)) Sub2
    ON Sub1.user_id = Sub2.user_id
    WHERE Sub2.user_id IS NULL
) Sub3
ON  users_table.username = Sub3.user_id

Or, avoiding the COUNT and instead using inner joins to check that the user ID is present in all 4 derived tables:

SELECT * 
FROM users_table
INNER JOIN 
(
    SELECT Sub1.user_id 
    FROM (
        SELECT z.user_id   
        FROM (
            SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2601,119)) z
            INNER JOIN
            (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (58,226)) y ON z.user_id = y.user_id
            INNER JOIN
            (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (1299,525)) x ON z.user_id = x.user_id
            INNER JOIN
            (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2524,128)) w ON z.user_id = w.user_id
    ) Sub1
    LEFT OUTER JOIN (SELECT user_id FROM group_user_map WHERE group_id IN (2572)) Sub2
    ON Sub1.user_id = Sub2.user_id
    WHERE Sub2.user_id IS NULL
) Sub3
ON  users_table.username = Sub3.user_id

Cleaning up the second query a little:

SELECT * 
FROM users_table
INNER JOIN 
(
    SELECT z.user_id   
    FROM (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2601,119)) z
    INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (58,226)) y 
    ON z.user_id = y.user_id
    INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (1299,525)) x 
    ON z.user_id = x.user_id
    INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2524,128)) w 
    ON z.user_id = w.user_id
    LEFT OUTER JOIN (SELECT user_id FROM group_user_map WHERE group_id IN (2572)) Sub2
    ON z.user_id = Sub2.user_id
    WHERE Sub2.user_id IS NULL
) Sub3
ON  users_table.username = Sub3.user_id

Using your SQL from the comment below, it can be cleaned up to:

select SQL_NO_CACHE id 
from users_table 
INNER JOIN ( SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (0, 67) ) ij1 
ON users_table.username = ij1.user_id 
LEFT OUTER JOIN ( SELECT user_id FROM group_user_map WHERE group_id IN (0) ) Sub2 
ON users_table.username = Sub2.user_id 
WHERE Sub2.user_id IS NULL 

Cleaning up my SQL in the same way:

SELECT users_table.* 
FROM users_table
INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2601,119)) z ON users_table.username = z.user_id
INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (58,226)) y ON users_table.username = y.user_id
INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (1299,525)) x ON users_table.username = x.user_id
INNER JOIN (SELECT distinct(user_id) FROM group_user_map WHERE group_id IN (2524,128)) w ON users_table.username = w.user_id
LEFT OUTER JOIN (SELECT user_id FROM group_user_map WHERE group_id IN (2572)) Sub2 ON users_table.username = Sub2.user_id
WHERE Sub2.user_id IS NULL

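Removing the sub-selects and joining directly (this might help or hinder; I suspect it will depend on how many duplicate user_id records there are for each set of group_id values):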

SELECT DISTINCT users_table.* 
FROM users_table
INNER JOIN group_user_map z ON users_table.username = z.user_id AND z.group_id IN (2601,119)
INNER JOIN group_user_map y ON users_table.username = y.user_id AND y.group_id IN (58,226)
INNER JOIN group_user_map x ON users_table.username = x.user_id AND x.group_id IN (1299,525)
INNER JOIN group_user_map w ON users_table.username = w.user_id AND w.group_id IN (2524,128)
LEFT OUTER JOIN group_user_map Sub2 ON users_table.username = Sub2.user_id AND Sub2.group_id IN (2572)
WHERE Sub2.user_id IS NULL
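To illustrate why the direct-join version needs DISTINCT, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the display_name column are invented for illustration, and only two of the four group sets are joined to keep it short. A user who is in both groups of a set matches the join twice, so the row counts multiply.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_table (username INTEGER, display_name TEXT)")  # display_name is hypothetical
conn.execute("CREATE TABLE group_user_map (user_id INTEGER, group_id INTEGER)")
conn.executemany("INSERT INTO users_table VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO group_user_map VALUES (?, ?)",
    [
        (1, 2601), (1, 119), (1, 58), (1, 226),  # user 1: both groups of both sets
        (2, 2601), (2, 58), (2, 2572),           # user 2: one of each, plus the excluded group
    ],
)

join_sql = """
SELECT {select_list}
FROM users_table
INNER JOIN group_user_map z ON users_table.username = z.user_id AND z.group_id IN (2601, 119)
INNER JOIN group_user_map y ON users_table.username = y.user_id AND y.group_id IN (58, 226)
LEFT OUTER JOIN group_user_map Sub2 ON users_table.username = Sub2.user_id AND Sub2.group_id IN (2572)
WHERE Sub2.user_id IS NULL
"""

# Without DISTINCT: one row per combination of matching map rows (2 x 2 for user 1).
raw_rows = conn.execute(join_sql.format(select_list="users_table.*")).fetchall()
# With DISTINCT: the duplicates collapse back to one row per user.
distinct_rows = conn.execute(join_sql.format(select_list="DISTINCT users_table.*")).fetchall()

print(len(raw_rows), distinct_rows)
```

The more duplicate matches each set produces, the more intermediate rows DISTINCT has to collapse, which is the trade-off against the derived-table versions above.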

Answer 2: (score: 0)

It's not entirely clear what you mean when you say "I want all users in groups 1, 3, and 8" and then write

select distinct(user_id) from group_user_map where group_id in (58,226)

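because the English suggests that you want users who are in all three groups, but the SQL gives you users who are in any one of the groups. So you need to be clearer about what you want.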
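To make the distinction concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL, with invented sample rows. IN on its own returns users who are in any of the listed groups; requiring membership in all of them takes a GROUP BY with a HAVING clause on the number of distinct matching groups.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_user_map (user_id INTEGER, group_id INTEGER)")
conn.executemany(
    "INSERT INTO group_user_map VALUES (?, ?)",
    [
        (10, 1), (10, 3), (10, 8),  # user 10 is in all three groups
        (11, 1), (11, 3),           # user 11 is in two of them
        (12, 8), (12, 27),          # user 12 is in one of them
    ],
)

# "Any of": a plain IN matches a user as soon as one group matches.
any_sql = """
SELECT DISTINCT user_id FROM group_user_map
WHERE group_id IN (1, 3, 8)
ORDER BY user_id
"""

# "All of": the user must have three distinct matching groups.
all_sql = """
SELECT user_id FROM group_user_map
WHERE group_id IN (1, 3, 8)
GROUP BY user_id
HAVING COUNT(DISTINCT group_id) = 3
ORDER BY user_id
"""

in_any_group = [row[0] for row in conn.execute(any_sql)]
in_all_groups = [row[0] for row in conn.execute(all_sql)]
print(in_any_group, in_all_groups)
```

Here the first query returns users 10, 11, and 12, while the second returns only user 10.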

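It is a little hard to believe that you are trying to find users who are in all 4 super-groups, each of which is made up of 2 groups. It makes me wonder what you are doing and why.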

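Depending on what you are really trying to achieve, I can think of several different approaches. Obviously, the simplest is to break it up into multiple queries and combine the results in code. If the group table is not too big, you could self-join it, but it may be too big to join 3 times. You might get better performance from NOT EXISTS than from NOT IN, but you might not. You could try using CASE inside aggregate functions to tally the matches in an intermediate table, but that gets pretty crazy. More likely, you would be better off reworking your data structure.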
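Two of those ideas, NOT EXISTS and CASE inside an aggregate, can be combined into a single pass over the map table. The sketch below is one possible reading of that suggestion, not a tested MySQL solution: it uses Python's sqlite3 module as a stand-in, the sample rows are invented, and build_membership_query is a hypothetical helper showing how a script could assemble the query from lists of groups (it assumes at least one OR set is given).

```python
import sqlite3

def build_membership_query(or_groups, excluded_groups):
    """Hypothetical helper: return (sql, params) selecting users who are in at
    least one group of every OR set and in none of the excluded groups."""
    params = []
    where_clause = ""
    if excluded_groups:
        marks = ", ".join("?" for _ in excluded_groups)
        where_clause = (
            "WHERE NOT EXISTS (SELECT 1 FROM group_user_map AS x "
            f"WHERE x.user_id = m.user_id AND x.group_id IN ({marks}))"
        )
        params.extend(excluded_groups)  # WHERE placeholders come first in the SQL
    having_parts = []
    for group_set in or_groups:
        marks = ", ".join("?" for _ in group_set)
        # 1 if any row of this user falls in the set, else 0.
        having_parts.append(
            f"MAX(CASE WHEN m.group_id IN ({marks}) THEN 1 ELSE 0 END) = 1"
        )
        params.extend(group_set)
    having_clause = " AND ".join(having_parts)
    sql = (
        "SELECT m.user_id FROM group_user_map AS m "
        f"{where_clause} "
        "GROUP BY m.user_id "
        f"HAVING {having_clause} "
        "ORDER BY m.user_id"
    )
    return sql, params

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_user_map (user_id INTEGER, group_id INTEGER)")
conn.executemany(
    "INSERT INTO group_user_map VALUES (?, ?)",
    [
        (1, 2601), (1, 119), (1, 58), (1, 1299), (1, 2524),  # user 1: all four sets
        (2, 2601), (2, 226), (2, 525),                        # user 2: misses the fourth set
        (3, 119), (3, 58), (3, 525), (3, 128), (3, 2572),     # user 3: all four, but excluded
    ],
)

sql, params = build_membership_query(
    or_groups=[[2601, 119], [58, 226], [1299, 525], [2524, 128]],
    excluded_groups=[2572],
)
selected_user_ids = [row[0] for row in conn.execute(sql, params)]
print(selected_user_ids)
```

The `?` placeholders are sqlite3's parameter style; MySQL drivers typically use a different marker, and the same string-building approach should carry over to a PHP script. Whether this outperforms the join-based versions would need to be measured against the real data.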

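The main problem I see with your existing solution is the large number of temporary tables it creates. In general, you will need a temporary table to do anything this complex, so I would concentrate on limiting it to two tables, each of them smaller than the relationship table.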

Answer 3: (score: 0)

Here is the correct query:

select * from users_table where username IN
(
    select a.user_id from
    (select distinct(user_id) from group_user_map where group_id in (2601,119)) a
    inner join
    (select distinct(user_id) from group_user_map where group_id in (58,226)) b
    on a.user_id = b.user_id inner join
    (select distinct(user_id) from group_user_map where group_id in (1299,525)) c
    on a.user_id = c.user_id inner join
    (select distinct(user_id) from group_user_map where group_id in (2524,128)) d
    on a.user_id = d.user_id
) and username not in (select user_id from group_user_map where group_id in (2572))

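Instead of the UNION ALL followed by a final filter on a counter of 4, I replaced it with an intersection, done with inner joins. Please check whether the results are correct and whether it runs fast.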

Vinit