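I have a set of queries that I want to run, but I am having trouble getting them to work together.
My setup is as follows, with column names in parentheses:
I have written 3 queries, each of which works perfectly on its own, but I can't figure out how to combine them.
Query 1 - distinct emails in table 1 (rfi_log)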
SELECT DISTINCT email, date_submitted
FROM rfi_log
WHERE date_submitted BETWEEN '[start_date]' AND '[end_date]'
Query 2 - distinct emails in table 2 (mastersstudies)
SELECT DISTINCT email
FROM orutrimdb.mastersstudies
WHERE date BETWEEN '[start_date]' AND '[end_date]'
Query 3 - join query that finds duplicate emails across table 1 and table 2
SELECT rfi_log.email as emails, orutrimdb.mastersstudies.email
FROM rfi_log
CROSS JOIN orutrimdb.mastersstudies
ON orutrimdb.mastersstudies.email=rfi_log.email
WHERE date_submitted BETWEEN '[start_date]' AND '[end_date]';
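My problem now is that I need to combine these queries somehow, so that I can get the count of DISTINCT emails from both tables within the date range while excluding the emails identified by Query 3.
I need the following:
Ultimately, I need the total number of distinct emails within the date range, "deduplicated", because there are duplicates across the two tables.
How can I achieve this?
Answer 0 (score: 0)
One way to do this is to use UNION ALL with aggregation. The following returns the duplicate information for each email: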
select email, sum(isrfi) as numrfi, sum(isms) as numms
from ((select email, 1 as isrfi, 0 as isms
       from rfi_log
      ) union all
      (select email, 0, 1
       from orutrimdb.mastersstudies
      )
     ) e
group by email;
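The query above does not filter on the date range from the question. A sketch of the same per-email query with that filter added, narrowed to the emails present in both tables (what Query 3 looks for), might look like this. It assumes the date columns are date_submitted and date, as in Queries 1 and 2, and that '[start_date]' and '[end_date]' are the question's placeholders:

```sql
-- Sketch only: column names and placeholders are taken from the question's queries.
select email, sum(isrfi) as numrfi, sum(isms) as numms
from ((select email, 1 as isrfi, 0 as isms
       from rfi_log
       where date_submitted between '[start_date]' and '[end_date]'
      ) union all
      (select email, 0, 1
       from orutrimdb.mastersstudies
       where date between '[start_date]' and '[end_date]'
      )
     ) e
group by email
-- keep only emails that occur at least once in each table
having sum(isrfi) > 0 and sum(isms) > 0;
```

Flipping one side of the having clause to "= 0" would instead list the emails found in only one of the two tables.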
Aggregating once more on top of that per-email result gives you the information you need:
select numrfi, numms, count(*), min(email), max(email)
from (select email, sum(isrfi) as numrfi, sum(isms) as numms
      from ((select email, 1 as isrfi, 0 as isms
             from rfi_log
            ) union all
            (select email, 0, 1
             from orutrimdb.mastersstudies
            )
           ) e
      group by email
     ) e
group by numrfi, numms;
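Note that this also finds duplicates within a single table: a numrfi or numms value greater than 1 means the email occurs more than once in that table.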
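If what is ultimately needed is a single deduplicated total for the date range, the same per-email subquery can be collapsed into one row of counts. This is a minimal sketch rather than a tested solution: it assumes the date columns and the '[start_date]' / '[end_date]' placeholders from the question, and the output column names (total_distinct_emails and so on) are made up for illustration.

```sql
-- Sketch only: one row with the deduplicated total and a breakdown by source table.
select count(*) as total_distinct_emails,
       sum(case when numrfi > 0 and numms > 0 then 1 else 0 end) as in_both_tables,
       sum(case when numrfi > 0 and numms = 0 then 1 else 0 end) as only_in_rfi_log,
       sum(case when numrfi = 0 and numms > 0 then 1 else 0 end) as only_in_mastersstudies
from (select email, sum(isrfi) as numrfi, sum(isms) as numms
      from ((select email, 1 as isrfi, 0 as isms
             from rfi_log
             where date_submitted between '[start_date]' and '[end_date]'
            ) union all
            (select email, 0, 1
             from orutrimdb.mastersstudies
             where date between '[start_date]' and '[end_date]'
            )
           ) e
      group by email
     ) t;
```

Because the inner query groups by email, each email is counted once no matter how many rows it has in either table. Subtracting in_both_tables from total_distinct_emails gives the count that excludes the emails Query 3 identifies.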