我有一个业务案例场景,我需要查询我们的SQL“用户”表以找出重复的电子邮件地址。我可以通过以下查询来做到这一点:
SELECT
user_email, COUNT(*) as DuplicateEmails
FROM
Users
GROUP BY
user_email
HAVING
COUNT(*) > 1
ORDER BY
DuplicateEmails DESC
我得到这样的输出:
user_email DuplicateEmails
--------------------------------
abc@gmail.com 2
xyz@yahoo.com 3
现在,我被要求列出所有重复记录,并在其自己的单行中显示一些其他属性,例如名字,姓氏和用户ID。所有这些信息都存储在该表“用户”中。我很难这么做。谁能帮助我或使我朝正确的方向前进?
我的输出应如下所示:
user_email DuplicateEmails FirstName LastName UserID
------------------------------------------------------------------------------
abc@gmail.com 2 Tim Lentil timLentil
abc@gmail.com 2 John Doe johnDoe12
xyz@yahoo.com 3 brian boss brianTheBoss
xyz@yahoo.com 3 Thomas Hood tHood
xyz@yahoo.com 3 Mark Brown MBrown12
答案 0 :(得分:3)
有几种方法可以做到这一点。这是一个使用cte的人。
with FoundDuplicates as
(
SELECT
uter_email, COUNT(*) as DuplicateEmails
FROM
Users
GROUP BY
uter_email
HAVING
COUNT(*) > 1
)
select fd.user_email
, fd.DuplicateEmails
, u.FirstName
, u.LastName
, u.UserID
from Users u
join FoundDuplicates fd on fd.uter_email = u.uter_email
ORDER BY fd.DuplicateEmails DESC
答案 1 :(得分:0)
使用count() over( Partition by )
,documentation
答案 2 :(得分:0)
您可以通过以下方式解决它:
DECLARE @T TABLE
(
UserID VARCHAR(20),
FirstName NVARCHAR(45),
LastName NVARCHAR(45),
UserMail VARCHAR(45)
);
INSERT INTO @T (UserMail, FirstName, LastName, UserID) VALUES
('abc@gmail.com', 'Tim', 'Lentil', 'timLentil'),
('abc@gmail.com', 'John', 'Doe', 'johnDoe12'),
('xyz@yahoo.com', 'brian', 'boss', 'brianTheBoss'),
('xyz@yahoo.com', 'Thomas', 'Hood', 'tHood'),
('xyz@yahoo.com', 'Mark', 'Brown', 'MBrown12');
SELECT *, COUNT (1) OVER (PARTITION BY UserMail) MailCount
FROM @T;
结果:
+--------------+-----------+----------+---------------+-----------+
| UserID | FirstName | LastName | UserMail | MailCount |
+--------------+-----------+----------+---------------+-----------+
| timLentil | Tim | Lentil | abc@gmail.com | 2 |
| johnDoe12 | John | Doe | abc@gmail.com | 2 |
| brianTheBoss | brian | boss | xyz@yahoo.com | 3 |
| tHood | Thomas | Hood | xyz@yahoo.com | 3 |
| MBrown12 | Mark | Brown | xyz@yahoo.com | 3 |
+--------------+-----------+----------+---------------+-----------+
答案 3 :(得分:0)
使用这样的窗口函数:
SELECT u.*
FROM (SELECT u.*, COUNT(*) OVER (PARTITION BY user_email) as numDuplicateEmails
FROM Users
) u
WHERE numDuplicateEmails > 1
ORDER BY numDuplicateEmails DESC;
答案 4 :(得分:0)
我认为这也可以。
WITH cte (
SELECT
*
,DuplicateEmails = ROW_NUMBER() OVER (Partition BY user_email ORder by user_email)
FROM Users
)
Select * from CTE
where DuplicateEmails > 1