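I have a table (t1) that contains email addresses, users, and domains:
email               user     domain
joe123@domain.com   joe123   domain.com
sue234@email.net    sue234   email.net
...                 ...      ...
Another table (t2) records whether an email sent to an address was opened:
Opened   Email
0        joe123@domain.com
1        sue234@email.net
0        jack55@mybarber.com
...      ...
I want to join t1.domain onto t2, but only for domains that occur more than 100 times.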
I can create a table with the occurrence counts:
SELECT domain, count(domain) cntDomain
from table1
group by domain
which gives:
domain         cntDomain
domain.com     5000
email.net      4300
mybarber.com   67
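The result table I want looks like this:
Opened   Email                 domain
0        joe123@domain.com     domain.com
1        sue234@email.net      email.net
0        jack55@mybarber.com   other
...      ...                   ...
But I can't work out the join (I assume it will be a left join, so that infrequent domains get the 'other' value) or the case statement that returns the domain if it occurs more than 100 times and 'other' otherwise.
Answer 0: (score: 0)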
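select t2.*, d.domain
from table2 t2
-- table1 maps each email to its domain
inner join table1 t1 on t1.email = t2.email
inner join
(
    -- domains that occur more than 100 times
    SELECT domain, count(1) cntDomain
    from table1
    group by domain
    having count(1) > 100
) d on d.domain = t1.domain
-- note: the inner joins drop rows whose domain is below the threshold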
Answer 1: (score: 0)
It is not clear whether every email in the second table also appears in the first. If so, you can do this:
select t2.*, t1.domain
from (select t1.*, count(*) over (partition by domain) as cnt
      from table1 t1
     ) t1 join
     table2 t2
     on t1.email = t2.email
where cnt > 100;
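If not, we can check for the domain inside the email address:
select t2.*, t1.domain
from table2 t2 left join
     (select t1.domain, count(*) as cnt
      from table1 t1
      group by t1.domain
     ) t1
     on t2.email like '%@' + t1.domain and
        cnt > 100;
Expect the performance of this version to be really, really bad: the LIKE pattern starts with a wildcard, so the join generally cannot use an index seek on the email column.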
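One way to avoid the LIKE join is to cut the domain out of the address and join on equality instead. What follows is only a sketch under stated assumptions: it runs against an in-memory SQLite database from Python, so it uses SQLite's substr/instr rather than SQL Server's SUBSTRING/CHARINDEX, the sample rows are made up, and the threshold is lowered from 100 to 2 so that the toy data hits both branches.

```python
import sqlite3

THRESHOLD = 2  # stands in for 100; assumption for the toy data below

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (email TEXT, "user" TEXT, domain TEXT);
    CREATE TABLE table2 (Opened INTEGER, Email TEXT);
    -- hypothetical sample rows, not taken from the question's real data
    INSERT INTO table1 VALUES
        ('joe123@domain.com', 'joe123', 'domain.com'),
        ('amy001@domain.com', 'amy001', 'domain.com'),
        ('bob002@domain.com', 'bob002', 'domain.com'),
        ('sue234@email.net',  'sue234', 'email.net');
    INSERT INTO table2 VALUES
        (0, 'joe123@domain.com'),
        (1, 'sue234@email.net'),
        (0, 'jack55@mybarber.com');
""")

query = """
    SELECT t2.Opened,
           t2.Email,
           COALESCE(d.domain, 'other') AS domain
    FROM table2 t2
    LEFT JOIN (SELECT domain
               FROM table1
               GROUP BY domain
               HAVING COUNT(*) > ?) d
           -- everything after the '@' is treated as the domain
           ON d.domain = substr(t2.Email, instr(t2.Email, '@') + 1)
    ORDER BY t2.Email
"""
rows = conn.execute(query, (THRESHOLD,)).fetchall()
for row in rows:
    print(row)
```

An equality join on a computed value still has to evaluate the expression for every row of table2, but it no longer compares each address against every domain pattern. In SQL Server the same idea would need the equivalent string functions, and whether it is actually faster depends on the data and indexes.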
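Answer 2: (score: 0)
This approach uses an inner query to get the counts, then a case statement to turn each count into either the domain or the string 'Other'. I tested it on some toy data to make sure it works, but I can't say anything about its performance.
It feels a little awkward because t1 is queried twice: once to get the domain and again to get the count. Either way, it gets the job done.
If the threshold changes, you can swap the number 100 for another number (or a variable).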
select
t2.Opened
, t2.Email
, case when t3.cntDomain > 100 then t3.domain else 'Other' end as domain
from t2
left outer join t1 on t2.Email = t1.email
left outer join (
select t1.domain, count(1) cntDomain
from t1
left outer join t2 on t1.email = t2.email
group by t1.domain
) as t3 on t1.domain = t3.domain
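Edit
If you don't like the case statement, this approach may be a little more elegant. Add a having clause to the inner query. Because of the left join, t3.domain will now be null whenever the count does not exceed the threshold. Add an ISNULL in the select list to coalesce the nulls and you're done.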
select
t2.Opened
, t2.Email
, ISNULL(t3.domain, 'Other') as domain
from t2
left outer join t1 on t2.Email = t1.email
left outer join (
select t1.domain, count(1) cntDomain
from t1
left outer join t2 on t1.email = t2.email
group by t1.domain
having count(1) > 100
) as t3 on t1.domain = t3.domain
Cheers!
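The two queries above can be checked against each other on toy data. The sketch below is an illustration under assumptions, not the answerer's own test: it uses an in-memory SQLite database from Python, swaps ISNULL for COALESCE (SQLite has no two-argument ISNULL function), uses made-up rows, and lowers the threshold from 100 to 2.

```python
import sqlite3

THRESHOLD = 2  # stands in for 100; assumption for the toy data below

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (email TEXT, "user" TEXT, domain TEXT);
    CREATE TABLE t2 (Opened INTEGER, Email TEXT);
    -- hypothetical sample rows
    INSERT INTO t1 VALUES
        ('joe123@domain.com', 'joe123', 'domain.com'),
        ('amy001@domain.com', 'amy001', 'domain.com'),
        ('bob002@domain.com', 'bob002', 'domain.com'),
        ('sue234@email.net',  'sue234', 'email.net');
    INSERT INTO t2 VALUES
        (0, 'joe123@domain.com'),
        (1, 'sue234@email.net'),
        (0, 'jack55@mybarber.com');
""")

# Variant 1: keep every count and decide with CASE.
case_query = """
    SELECT t2.Opened, t2.Email,
           CASE WHEN t3.cntDomain > :n THEN t3.domain ELSE 'Other' END AS domain
    FROM t2
    LEFT OUTER JOIN t1 ON t2.Email = t1.email
    LEFT OUTER JOIN (
        SELECT t1.domain, COUNT(1) AS cntDomain
        FROM t1
        LEFT OUTER JOIN t2 ON t1.email = t2.Email
        GROUP BY t1.domain
    ) AS t3 ON t1.domain = t3.domain
    ORDER BY t2.Email
"""

# Variant 2: filter with HAVING, then coalesce the resulting NULLs.
having_query = """
    SELECT t2.Opened, t2.Email,
           COALESCE(t3.domain, 'Other') AS domain
    FROM t2
    LEFT OUTER JOIN t1 ON t2.Email = t1.email
    LEFT OUTER JOIN (
        SELECT t1.domain, COUNT(1) AS cntDomain
        FROM t1
        LEFT OUTER JOIN t2 ON t1.email = t2.Email
        GROUP BY t1.domain
        HAVING COUNT(1) > :n
    ) AS t3 ON t1.domain = t3.domain
    ORDER BY t2.Email
"""

case_rows = conn.execute(case_query, {"n": THRESHOLD}).fetchall()
having_rows = conn.execute(having_query, {"n": THRESHOLD}).fetchall()
for row in having_rows:
    print(row)
print("variants agree:", case_rows == having_rows)
```

Passing the threshold as a bound parameter is one way to do the "swap 100 for a variable" step mentioned earlier.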
Answer 3: (score: 0)
I think the following query should solve your problem:
SELECT t2.Opened,
       t2.Email,
       CASE WHEN tempt1.email IS NULL THEN 'Other' ELSE tempt1.domain END AS domain
FROM t2
LEFT JOIN (SELECT email, domain
           FROM t1
           -- keep only emails whose domain occurs more than 100 times
           WHERE domain IN (SELECT domain
                            FROM t1
                            GROUP BY domain
                            HAVING count(domain) > 100)) tempt1
       ON t2.Email = tempt1.email