Question

I have the following SQL query:

SELECT users.* FROM users users
WHERE users.name <> '' and users.email <> '' and users.phone <> ''
and users.name in (  SELECT name
            FROM users
                where name <> '' and name is not null
            GROUP BY name
            HAVING count(name) > 1 )
and users.email in (  SELECT email
            FROM users
                where email <> '' and email is not null
            GROUP BY email
            HAVING count(email) > 1 )
and users.phone in (  SELECT phone
            FROM users
                where phone <> '' and phone is not null
            GROUP BY phone
            HAVING count(phone) > 1 )
ORDER BY users.name+users.email+users.phone ASC
LIMIT 0,200

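Unfortunately, it runs very slowly on a large database. Is there any way to optimize this query?

The idea behind the query: get all records that have duplicates in the database (for example, users with the same name + same phone + same email).

I tried using an inner join, but I could not get it to work.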

Answer 1

If you want users that share the same name, phone, and email, use group by:

select u.name, u.phone, u.email, group_concat(u.user_id)
from users u
group by u.name, u.phone, u.email
having count(*) > 1;
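
To make the grouping concrete, here is a minimal sketch on a throwaway table. The table name users_demo, the column types, and the sample rows are assumptions for illustration; only the column names come from the query above. The ORDER BY inside GROUP_CONCAT is MySQL syntax and is added only to make the ID list deterministic.

```sql
-- Hypothetical demo table; column types are assumed, not taken from the question.
CREATE TABLE users_demo (
    user_id INT PRIMARY KEY,
    name    VARCHAR(50),
    phone   VARCHAR(20),
    email   VARCHAR(100)
);

INSERT INTO users_demo (user_id, name, phone, email) VALUES
    (1, 'Ann', '111', 'ann@example.com'),
    (2, 'Ann', '111', 'ann@example.com'),  -- exact duplicate of row 1
    (3, 'Bob', '222', 'bob@example.com'),
    (4, 'Bob', '333', 'bob@example.com');  -- same name as row 3, different phone

-- One output row per (name, phone, email) combination that occurs more than once.
SELECT u.name, u.phone, u.email,
       GROUP_CONCAT(u.user_id ORDER BY u.user_id) AS user_ids
FROM users_demo u
GROUP BY u.name, u.phone, u.email
HAVING COUNT(*) > 1;
-- Returns a single row: Ann, 111, ann@example.com, '1,2'
```

Rows 3 and 4 share only a name, so they do not form a group of size two and are not reported.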

If you want all the rows, and not just a list of IDs, use a join:

select u.*
from (select u.name, u.phone, u.email
      from users u
      group by u.name, u.phone, u.email
      having count(*) > 1
     ) udup join
     users u
     on u.name = udup.name and u.phone = udup.phone and u.email = udup.email
order by u.name, u.phone, u.email;
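
Since the question is about speed, one option that may help is a composite index on the three grouped columns, combined with the empty-value filter and the LIMIT from the question. The sketch below is an assumption, not something tested against the asker's schema: the index name is made up, and whether the optimizer actually uses the index depends on the MySQL version, the column sizes, and the data, so it is worth checking with EXPLAIN.

```sql
-- Assumed index name; run SHOW INDEX FROM users first to avoid adding a redundant index.
CREATE INDEX idx_users_name_phone_email ON users (name, phone, email);

-- Same join as above, with the question's empty-string filter applied inside
-- the derived table. A comparison such as name <> '' is never true for NULL,
-- so NULL values are excluded as well.
SELECT u.*
FROM (SELECT name, phone, email
      FROM users
      WHERE name <> '' AND phone <> '' AND email <> ''
      GROUP BY name, phone, email
      HAVING COUNT(*) > 1
     ) udup
JOIN users u
  ON u.name = udup.name AND u.phone = udup.phone AND u.email = udup.email
ORDER BY u.name, u.phone, u.email
LIMIT 0, 200;
```

The index lists the columns in the same order as the GROUP BY and the join condition, which gives the server the chance to group and look up matches without scanning the whole table for each step.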

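Note: these queries do not do what the original query does. Instead, they are based on the logic you describe in the text ("for example, users with the same name + same phone + same email").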
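
The difference between the two approaches shows up on a small data set. This is an illustrative sketch: the table users_mix, its column types, and its rows are invented for the example. The original query checks each column for duplicates independently, so it can return rows that have no full (name, phone, email) twin.

```sql
-- Hypothetical demo table; column types and rows are assumed for illustration.
CREATE TABLE users_mix (
    user_id INT PRIMARY KEY,
    name    VARCHAR(50),
    phone   VARCHAR(20),
    email   VARCHAR(100)
);

INSERT INTO users_mix (user_id, name, phone, email) VALUES
    (1, 'Ann', '111', 'a@example.com'),
    (2, 'Ann', '222', 'b@example.com'),
    (3, 'Bob', '111', 'b@example.com'),
    (4, 'Cid', '222', 'a@example.com');

-- Logic of the original query (ORDER BY and LIMIT omitted):
-- each column only has to be duplicated somewhere in the table.
SELECT u.user_id
FROM users_mix u
WHERE u.name  IN (SELECT name  FROM users_mix GROUP BY name  HAVING COUNT(name)  > 1)
  AND u.email IN (SELECT email FROM users_mix GROUP BY email HAVING COUNT(email) > 1)
  AND u.phone IN (SELECT phone FROM users_mix GROUP BY phone HAVING COUNT(phone) > 1);
-- Returns user_id 1 and 2: 'Ann' repeats, and their phones and emails
-- each repeat on some other row.

-- Logic of the answer: the whole (name, phone, email) triple must repeat.
SELECT u.name, u.phone, u.email
FROM users_mix u
GROUP BY u.name, u.phone, u.email
HAVING COUNT(*) > 1;
-- Returns no rows: no two rows share all three values.
```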

Optimize SQL query to get duplicates
