SQL Server 2014 - How to find distinct records

Time: 2017-03-29 23:40:35

Tags: sql sql-server group-by

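I have a contact list that contains first name, last name, and email address. Some email addresses have multiple first and last names. I am more concerned with the email address, and I really only want the top name for each email address.

My code, which obviously does not work: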

SELECT Salutation  
, FirstName  
, LastName  
, EmailAddress  
FROM Contact  
--GROUP BY EmailAddress  ---I know a Group by will surely help

I tried:

SELECT max(Salutation)
    ,max(FirstName)
    ,max(LastName)
    ,max(EMailAddress)
FROM Contact
WHERE EMailAddress NOT LIKE ''
GROUP BY EMailAddress

This works, but I would like to know whether there is a better way to do it.

2 Answers:

Answer 0 (score: 1)

How do you define the top name?

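Using max() on each name column can easily return a result with mixed names, because each max() is evaluated independently: 'Aaron Bertrand' and 'Itzik Ben-Gan' would return 'Itzik Bertrand'. If you have mixed Salutation values, you will always get 'Mrs' from 'Mr' and 'Mrs', which might not be appropriate either.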
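The mixing is easy to reproduce. The following is a minimal sketch with made-up rows; the inline values and the address shared@example.com are assumptions for illustration, and only the column names come from the question:

```sql
-- Two people share one address (hypothetical sample rows).
select
    max(FirstName) as FirstName
  , max(LastName)  as LastName
  , EmailAddress
from (values
    ('Aaron', 'Bertrand', 'shared@example.com')
  , ('Itzik', 'Ben-Gan',  'shared@example.com')
  ) as c (FirstName, LastName, EmailAddress)
group by EmailAddress;
-- Each max() picks its value independently of the other columns,
-- so the single row returned is 'Itzik', 'Bertrand': a person who is not in the data.
```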

top with ties version using row_number():

select top 1 with ties
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from contact
where EmailAddress <> ''
order by row_number() over (
  partition by EmailAddress
  order by FirstName /* your 'top' criteria here, FirstName is a placeholder */
  );
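This works because row_number() restarts at 1 inside each EmailAddress partition, so the first row of every partition ties for first place in the order by, and with ties returns all of them. Here is a small self-contained sketch of that behavior; the table variable @Contact and its rows are assumptions for illustration, not the asker's data:

```sql
-- Hypothetical stand-in for the Contact table.
declare @Contact table (
    Salutation   varchar(10)
  , FirstName    varchar(50)
  , LastName     varchar(50)
  , EmailAddress varchar(100)
);

insert into @Contact (Salutation, FirstName, LastName, EmailAddress) values
    ('Mr', 'Aaron', 'Bertrand', 'shared@example.com')
  , ('Mr', 'Itzik', 'Ben-Gan',  'shared@example.com')
  , ('Ms', 'Pat',   'Doe',      'pat@example.com')
  , ('Mr', 'No',    'Email',    '');

select top 1 with ties
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from @Contact
where EmailAddress <> ''
order by row_number() over (
  partition by EmailAddress
  order by FirstName, LastName /* LastName added as a tie-breaker */
  );
-- One row per non-empty address: Aaron Bertrand and Pat Doe.
```

Adding a second column to the inner order by is a design choice rather than a requirement: it makes the pick repeatable when two rows for the same address share a first name.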

cross apply() version:

select distinct
    x.Salutation
  , x.FirstName
  , x.LastName
  , t.EmailAddress
from contact t
  cross apply (
    select top 1
        i.Salutation
      , i.FirstName
      , i.LastName
    from contact as i
    where i.EmailAddress = t.EmailAddress
    order by i.FirstName
    ) as x
where t.EmailAddress <> '';

common table expression version with row_number():

;with cte as (
  select *
    , rn = row_number() over (
             partition by EmailAddress
                 order by FirstName
            )
  from contact
  where EmailAddress <> ''
)
select  
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from cte
where rn = 1;

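I prefer using a common table expression, but the query inside it works just as well as a subquery in the from clause:

subquery version with row_number():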

select  
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from (
  select *
    , rn = row_number() over (
             partition by EmailAddress
                 order by FirstName
            )
    from contact
    where EmailAddress <> ''
  ) s
where rn = 1;

Answer 1 (score: 0)

Try:


This will give you the last one added for each email address.
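One way to get the last row added per email address is to rank the rows by an ever-increasing key, newest first. This is only a sketch under an assumption: the question does not show such a key, so the ContactID column, the table variable @Contact, and the sample rows below are all hypothetical.

```sql
-- ContactID is a hypothetical key that grows with each insert; it is not in the question.
declare @Contact table (
    ContactID    int primary key
  , Salutation   varchar(10)
  , FirstName    varchar(50)
  , LastName     varchar(50)
  , EmailAddress varchar(100)
);

insert into @Contact (ContactID, Salutation, FirstName, LastName, EmailAddress) values
    (1, 'Mr', 'Sam', 'Old',  'a@example.com')
  , (2, 'Ms', 'Pat', 'New',  'a@example.com')
  , (3, 'Mr', 'Lee', 'Solo', 'b@example.com');

select
    Salutation
  , FirstName
  , LastName
  , EmailAddress
from (
  select *
    , rn = row_number() over (
             partition by EmailAddress
                 order by ContactID desc /* newest row first */
            )
  from @Contact
  where EmailAddress <> ''
  ) s
where rn = 1;
-- Returns Pat New for a@example.com and Lee Solo for b@example.com.
```

If the real table has an insert timestamp instead of a numeric key, the same pattern applies with that column in the order by.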