The results of the following two queries add up to 42,081:
Query 1:
SELECT COUNT(DISTINCT SIREN) AS new_column FROM CDEdelux
WHERE Email IS NOT NULL
AND Email <> ""
AND Email <> " "
AND Email <> ''
AND Email <> ' '
AND Email LIKE "%@%.%"
AND Email NOT LIKE "_@%.%"
AND Email NOT LIKE "%bpi%"
AND Email NOT LIKE "%BPI%"
AND Email NOT LIKE "%inconnu%"
AND Email NOT LIKE "%tempo%"
AND Email NOT LIKE "%attente%"
AND Email NOT LIKE "%xx%"
AND Email NOT LIKE "%nsp%"
AND Email NOT LIKE "%contact%"
AND Email NOT LIKE "%info%"
AND Email NOT LIKE "%recuperer%"
Query 2 (the opposite query):
SELECT COUNT(DISTINCT SIREN) AS new_column FROM CDEdelux
WHERE Email IS NOT NULL
AND Email <> ""
AND Email <> " "
AND Email <> ''
AND Email <> ' '
AND (Email NOT LIKE "%@%.%"
OR Email LIKE "_@%.%"
OR Email LIKE "%bpi%"
OR Email LIKE "%BPI%"
OR Email LIKE "%inconnu%"
OR Email LIKE "%tempo%"
OR Email LIKE "%attente%"
OR Email LIKE "%xx%"
OR Email LIKE "%nsp%"
OR Email LIKE "%contact%"
OR Email LIKE "%info%"
OR Email LIKE "%recuperer%")
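But since the two are supposed to be opposite and complementary, they should add up to 39,206, which is the result of this query:
Query 3 (the total query):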
SELECT COUNT(DISTINCT SIREN) AS new_column FROM CDEdelux
WHERE Email IS NOT NULL
AND Email <> ""
AND Email <> " "
AND Email <> ''
AND Email <> ' '
Why is the first number higher than the second?
Answer 0 (score: 1)
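Well, obviously, a given SIREN can have several emails, some matching the condition of query 1 and others matching the condition of query 2. Such a SIREN is counted once by each query, so the two counts add up to more than the total. You can see the SIRENs with more than one email using: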
select distinct d1.siren, d1.email
from CDEdelux d1
where exists (select 1 from CDEdelux d2 where d2.siren = d1.siren and d2.email <> d1.email);
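If you count distinct email values instead of distinct SIREN values, the numbers should add up, because each email satisfies exactly one of the two conditions.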
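To check this explanation directly, you could count the SIRENs that fall on both sides. The query below is a sketch, not a tested solution: the column alias `is_valid` and the subquery aliases `flagged` and `both_sides` are names introduced here for illustration, and the `CASE` expression simply repeats the filter from query 1. If the explanation holds, the result should equal the gap between the two figures, 42,081 − 39,206 = 2,875.

```sql
-- Sketch: count SIRENs having at least one email that passes the
-- query 1 filter AND at least one email that fails it.
SELECT COUNT(*) AS sirens_in_both
FROM (
    SELECT SIREN
    FROM (
        SELECT SIREN,
               -- is_valid (hypothetical alias): 1 if the email passes
               -- the query 1 filter, 0 otherwise
               CASE WHEN Email LIKE '%@%.%'
                     AND Email NOT LIKE '_@%.%'
                     AND Email NOT LIKE '%bpi%'
                     AND Email NOT LIKE '%BPI%'
                     AND Email NOT LIKE '%inconnu%'
                     AND Email NOT LIKE '%tempo%'
                     AND Email NOT LIKE '%attente%'
                     AND Email NOT LIKE '%xx%'
                     AND Email NOT LIKE '%nsp%'
                     AND Email NOT LIKE '%contact%'
                     AND Email NOT LIKE '%info%'
                     AND Email NOT LIKE '%recuperer%'
                    THEN 1 ELSE 0 END AS is_valid
        FROM CDEdelux
        WHERE Email IS NOT NULL
          AND Email <> ''
          AND Email <> ' '
    ) flagged
    GROUP BY SIREN
    -- a mix of 0 and 1 means the SIREN appears in both query 1 and query 2
    HAVING MIN(is_valid) = 0 AND MAX(is_valid) = 1
) both_sides;
```

Dropping the outer `COUNT(*)` and selecting from `both_sides` directly would list the affected SIRENs instead of counting them.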