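Below is a small part of a query I need help with. This part produces a record count where both EmailAddress and DateOfBirth are duplicated.
The commented-out line should produce a record count where EmailAddress is duplicated but DateOfBirth differs, i.e. it should identify users who share an email address (assuming two different users have different dates of birth).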
SELECT
    u.EmailAddress,
    u.DateOfBirth,
    COUNT(*) OVER (PARTITION BY u.EmailAddress, u.DateOfBirth) AS EmailAndDoBDup,
    --COUNT(*) where EmailAddress is duplicate but DateOfBirth is unique (in the aggregated results)
FROM [User] AS u
Answer 0 (score: 0)
You can do this with a subquery. I don't think there is a way to do it with a single window function:
SELECT u.EmailAddress, u.DateOfBirth,
       u.EmailAndDoBDup,
       -- per email address, count the rows whose (EmailAddress, DateOfBirth) pair occurs exactly once
       SUM(CASE WHEN u.EmailAndDoBDup = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY u.EmailAddress) AS YourCol
FROM (SELECT u.*,
             COUNT(*) OVER (PARTITION BY u.EmailAddress, u.DateOfBirth) AS EmailAndDoBDup
      FROM [User] u
     ) u;
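To see what this query returns, here is a minimal sketch that runs it against a few made-up rows. The SampleUser CTE and its values are hypothetical stand-ins for the real [User] table; the query itself is the one above.

```sql
-- Hypothetical sample rows standing in for the [User] table (assumption, not from the question).
WITH SampleUser AS (
    SELECT *
    FROM (VALUES
        ('a@example.com', CAST('1990-01-01' AS date)),
        ('a@example.com', CAST('1990-01-01' AS date)),
        ('a@example.com', CAST('1985-05-05' AS date)),
        ('b@example.com', CAST('1970-07-07' AS date))
    ) AS v (EmailAddress, DateOfBirth)
)
SELECT u.EmailAddress, u.DateOfBirth, u.EmailAndDoBDup,
       -- rows in this email's partition whose (EmailAddress, DateOfBirth) pair is unique
       SUM(CASE WHEN u.EmailAndDoBDup = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY u.EmailAddress) AS YourCol
FROM (SELECT s.*,
             COUNT(*) OVER (PARTITION BY s.EmailAddress, s.DateOfBirth) AS EmailAndDoBDup
      FROM SampleUser AS s
     ) AS u;
```

With these rows, all three a@example.com rows get YourCol = 1, because only the 1985-05-05 row has a pair that occurs once. The single b@example.com row also gets YourCol = 1, so if you only care about shared addresses you may want to combine YourCol with a per-email row count.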
EDIT:
If you want one row per email address and DOB, you can phrase this as an aggregation query:
SELECT u.EmailAddress, u.DateOfBirth, COUNT(*) AS EmailAndDoBDup,
       SUM(CASE WHEN COUNT(*) = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY u.EmailAddress) AS YourCol
FROM [User] u
GROUP BY u.EmailAddress, u.DateOfBirth;
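This does not need a subquery, but it may not fit into your more complex query.
Answer 1 (score: 0)
Instead of doing this in the SELECT clause, I would LEFT OUTER JOIN to two grouped subqueries, like: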
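LEFT OUTER JOIN
    -- (EmailAddress, DateOfBirth) pairs shared by more than one user
    (SELECT
        EmailAddress,
        DateOfBirth
    FROM
        [User]
    GROUP BY
        EmailAddress,
        DateOfBirth
    HAVING
        COUNT(DISTINCT ID) > 1) dupEmailDOB
...
LEFT OUTER JOIN
    -- email addresses used with more than one distinct date of birth
    (SELECT
        EmailAddress
    FROM
        [User]
    GROUP BY
        EmailAddress
    HAVING
        COUNT(DISTINCT DateOfBirth) > 1) emailMultipleDOBs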
This is easier to maintain if you need to add other conditions later.
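The answer leaves out the join conditions, so here is one way the two derived tables might be attached. This is a sketch under assumptions: the SampleUser CTE and its rows are hypothetical, the ON clauses and the two flag column names (HasEmailAndDoBDup, EmailHasMultipleDoBs) are illustrative, and ID is assumed to be the user key as in the HAVING clause above.

```sql
-- Hypothetical sample rows standing in for the [User] table (assumption).
WITH SampleUser AS (
    SELECT *
    FROM (VALUES
        (1, 'a@example.com', CAST('1990-01-01' AS date)),
        (2, 'a@example.com', CAST('1990-01-01' AS date)),
        (3, 'a@example.com', CAST('1985-05-05' AS date)),
        (4, 'b@example.com', CAST('1970-07-07' AS date))
    ) AS v (ID, EmailAddress, DateOfBirth)
)
SELECT u.ID, u.EmailAddress, u.DateOfBirth,
       -- 1 when another user has the same email address and date of birth
       CASE WHEN dupEmailDOB.EmailAddress IS NOT NULL THEN 1 ELSE 0 END AS HasEmailAndDoBDup,
       -- 1 when this email address appears with more than one date of birth
       CASE WHEN emailMultipleDOBs.EmailAddress IS NOT NULL THEN 1 ELSE 0 END AS EmailHasMultipleDoBs
FROM SampleUser AS u
LEFT OUTER JOIN (
    SELECT EmailAddress, DateOfBirth
    FROM SampleUser
    GROUP BY EmailAddress, DateOfBirth
    HAVING COUNT(DISTINCT ID) > 1
) AS dupEmailDOB
    -- assumed join keys; rows with a NULL DateOfBirth will not match on equality
    ON dupEmailDOB.EmailAddress = u.EmailAddress
   AND dupEmailDOB.DateOfBirth = u.DateOfBirth
LEFT OUTER JOIN (
    SELECT EmailAddress
    FROM SampleUser
    GROUP BY EmailAddress
    HAVING COUNT(DISTINCT DateOfBirth) > 1
) AS emailMultipleDOBs
    ON emailMultipleDOBs.EmailAddress = u.EmailAddress;
```

On the sample rows, users 1 and 2 are flagged by both joins, user 3 only by emailMultipleDOBs, and user 4 by neither. Because each condition lives in its own derived table, adding a further check means adding another join rather than reworking a window expression.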