Table 1 : Contacts
id | name
------------
1 | John
2 | Shawn
3 | Rachael
Table 2 : emails
id | contact_id | email_addr
----------------------------
1 | 1 | j@gmail.com
2 | 2 | j@gmail.com
3 | 3 | r@gmail.com
Suppose I look for duplicates on email_addr. I should get this result:
contact_id | name | email_addr
---------------------------------
1 | John | j@gmail.com
2 | Shawn | j@gmail.com
That is, I should get every contact that shares a duplicated email address.
I used the following query:
SELECT contact_id
FROM emails
WHERE email_addr IN (SELECT S.email_addr
FROM contacts R
INNER JOIN emails S ON R.id = S.contact_id
GROUP BY S.email_addr
HAVING COUNT(S.contact_id) > 1
);
This query takes a long time to run, even on as few as 1000 records. Please help me optimize it.
Answer 0: (score: 0)
You should replace IN with a join, and you should avoid joining inside the subquery:
SELECT A.contact_id, R.name, A.email_addr
FROM emails AS A
JOIN contacts AS R ON R.id = A.contact_id
JOIN (SELECT email_addr
FROM emails
GROUP BY email_addr
HAVING COUNT(*) > 1
) AS C
ON C.email_addr = A.email_addr;
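To check the derived-table join against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in, since the thread never names the database engine, and the aliases e, c and dup are illustrative.

```python
import sqlite3

# In-memory database populated with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (
        id INTEGER PRIMARY KEY,
        contact_id INTEGER REFERENCES contacts(id),
        email_addr TEXT
    );
    INSERT INTO contacts VALUES (1, 'John'), (2, 'Shawn'), (3, 'Rachael');
    INSERT INTO emails VALUES
        (1, 1, 'j@gmail.com'),
        (2, 2, 'j@gmail.com'),
        (3, 3, 'r@gmail.com');
""")

# Find the duplicated addresses once in a derived table, then join back
# to emails and contacts to list every contact that uses one of them.
DUPLICATES_SQL = """
    SELECT e.contact_id, c.name, e.email_addr
    FROM emails AS e
    JOIN contacts AS c ON c.id = e.contact_id
    JOIN (
        SELECT email_addr
        FROM emails
        GROUP BY email_addr
        HAVING COUNT(*) > 1
    ) AS dup ON dup.email_addr = e.email_addr
    ORDER BY e.contact_id
"""

rows = conn.execute(DUPLICATES_SQL).fetchall()
for row in rows:
    print(row)
```

On the sample data this lists John and Shawn, matching the expected result in the question.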
Answer 1: (score: 0)
Try these indexes:
CREATE INDEX idx_email ON emails(email_addr,contact_id);
CREATE INDEX idx_id ON Contacts(id);
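One way to see whether an index like idx_email is picked up is to ask the engine for its query plan. The sketch below does this in SQLite through Python's sqlite3 module. This is an assumption made for illustration only: the thread does not name the database, and plan output and optimizer choices differ between engines. In this schema contacts.id is declared as the primary key, so it is already indexed and idx_id is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (id INTEGER PRIMARY KEY, contact_id INTEGER, email_addr TEXT);
    CREATE INDEX idx_email ON emails(email_addr, contact_id);
""")

# List the explicitly created indexes on the emails table.
index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'emails'"
    )
]
print(index_names)

# Ask SQLite how it would run the duplicate-finding subquery.
# The last column of each plan row is the human-readable description.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT email_addr FROM emails GROUP BY email_addr HAVING COUNT(*) > 1
""").fetchall()
for step in plan:
    print(step[-1])
```

If the printed plan mentions idx_email, the grouping can be answered from the index without scanning and sorting the table itself.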
Answer 2: (score: 0)
This query returns every row in the emails table whose email address appears more than once:
SELECT DISTINCT tbl2.*
FROM emails tbl1
JOIN emails tbl2
ON tbl1.email_addr = tbl2.email_addr AND tbl1.id <> tbl2.id
Answer 3: (score: 0)
This is faster:
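-- Returns one row per duplicated address; needs a permissive GROUP BY mode for the non-grouped columns.
SELECT e.contact_id, c.name, e.email_addr
FROM Contacts AS c
INNER JOIN emails AS e ON c.id = e.contact_id
GROUP BY e.email_addr
HAVING COUNT(e.email_addr) > 1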
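It is worth checking how many rows a query grouped by email_addr returns. The sketch below runs the same grouping on the question's sample data in SQLite via Python's sqlite3 module (a stand-in engine, assumed for illustration), selecting only the grouped column and the count so that the statement is valid in strict SQL modes too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (id INTEGER PRIMARY KEY, contact_id INTEGER, email_addr TEXT);
    INSERT INTO contacts VALUES (1, 'John'), (2, 'Shawn'), (3, 'Rachael');
    INSERT INTO emails VALUES
        (1, 1, 'j@gmail.com'),
        (2, 2, 'j@gmail.com'),
        (3, 3, 'r@gmail.com');
""")

# Grouping collapses every duplicated address into a single row.
grouped = conn.execute("""
    SELECT e.email_addr, COUNT(*) AS n
    FROM contacts AS c
    JOIN emails AS e ON c.id = e.contact_id
    GROUP BY e.email_addr
    HAVING COUNT(*) > 1
""").fetchall()
print(grouped)
```

The grouping produces a single row for j@gmail.com, so listing both John and Shawn still requires joining this result back to emails.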
Answer 4: (score: 0)
Try the following query:
SELECT a.contact_id
FROM emails a,
(SELECT S.email_addr
FROM contacts R
JOIN emails S ON R.id = S.contact_id
GROUP BY S.email_addr
HAVING COUNT(S.contact_id) > 1) b
WHERE a.email_addr = b.email_addr;
Note: for better results, the email_addr column should be indexed.