Getting unique values for four columns while pulling the other columns in the same query

Time: 2017-01-27 06:30:02

Tags: sql-server

I have a table with 30 columns. Subsets of these columns will be selected from it into reference tables. The columns are as follows:

Agent_SK  <-- Hash of Agent's Name, Phone, Address, and Email.
, Agent_License_Number  <-- Nullable data field...NOT PK or UNIQUE
, Agent_Name
, Agent_Phone
, Agent_Address
, Agent_Email
, Office_Name
, Office_Phone
, Office_Address
, Office_Email
, Last_Update_Date

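It is worth noting that Agent_SK is NULL when all 4 hashed fields are NULL.

What I need is each unique combination of Agent_Name, Agent_Phone, Agent_Address, and Agent_Email, with the remaining columns taken from the most recently updated row for that 4-column combination.

I have made several attempts, including this one: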

Insert into property.dbo.MLSRealtor
select
    M1.Agent_SK
    , M2.Listing_Agent_License_Number
    , M2.Listing_Agent_Name
    , M2.Listing_Agent_Address
    , M2.Listing_Agent_Phone
    , M2.Listing_Agent_Email
    , M2.Office_Name
    , M2.Office_Address
    , M2.Office_Phone
    , M2.Office_Email
    , M2.Update_Timestamp
from 
(
select distinct Agent_SK
from MLS
where Agent_SK is not null
) M1
left join
(
    select
        Agent_SK
        , Listing_Agent_License_Number
        , Listing_Agent_Name
        , Listing_Agent_Address
        , Listing_Agent_Phone
        , Listing_Agent_Email
        , Office_Name
        , Office_Address
        , Office_Phone
        , Office_Email
        , Max(Update_Timestamp) as Update_Timestamp
    from MLS M
    group by Agent_SK
        , Listing_Agent_License_Number
        , Listing_Agent_Name
        , Listing_Agent_Address
        , Listing_Agent_Phone
        , Listing_Agent_Email
        , Office_Name
        , Office_Address
        , Office_Phone
        , Office_Email
) M2
on M1.Agent_SK = M2.Agent_SK;

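I seem to have inadvertently grouped the Office information together with the listing agent information, which produces duplicate Agent_SK values.

I need whatever Office information is on the most recent record for that Agent_SK. What am I missing?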

1 Answer:

Answer 0: (score: 1)

Use ROW_NUMBER to pull the latest record per group:

ROW_NUMBER() OVER (PARTITION BY Agent_Name, Agent_Phone, Agent_Address, Agent_Email
                   ORDER BY Last_Update_Date DESC) AS rn

You can then filter on rn = 1 using this computed column.

So your INSERT would look like this:

Insert into property.dbo.MLSRealtor
select
    Agent_SK
    , Listing_Agent_License_Number
    , Listing_Agent_Name
    , Listing_Agent_Address
    , Listing_Agent_Phone
    , Listing_Agent_Email
    , Office_Name
    , Office_Address
    , Office_Phone
    , Office_Email
    , Update_Timestamp
from 
(
   select Agent_SK
        , Listing_Agent_License_Number
        , Listing_Agent_Name
        , Listing_Agent_Address
        , Listing_Agent_Phone
        , Listing_Agent_Email
        , Office_Name
        , Office_Address
        , Office_Phone
        , Office_Email
        , Update_Timestamp
        , ROW_NUMBER() OVER (PARTITION BY Listing_Agent_Name, 
                                          Listing_Agent_Phone,  
                                          Listing_Agent_Address, 
                                          Listing_Agent_Email
                             ORDER BY Update_Timestamp DESC) AS rn
   from MLS
   where Agent_SK is not null
) AS t
where t.rn = 1
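
To see why this removes the duplicates, here is a minimal sketch that can be run on its own. The table variable @MLS, its reduced column set, and the sample rows are all hypothetical and stand in for the real MLS table. In the GROUP BY attempt, every distinct Office value forms its own group, so an agent who changed offices comes back once per office. ROW_NUMBER instead ranks the rows within each agent, so only the newest row survives the rn = 1 filter.

```sql
-- Hypothetical stand-in for the MLS table, reduced to the columns that matter here.
DECLARE @MLS TABLE (
    Agent_SK              varchar(32)  NULL,
    Listing_Agent_Name    varchar(50)  NULL,
    Listing_Agent_Phone   varchar(20)  NULL,
    Listing_Agent_Address varchar(100) NULL,
    Listing_Agent_Email   varchar(100) NULL,
    Office_Name           varchar(50)  NULL,
    Update_Timestamp      datetime     NOT NULL
);

-- Invented sample rows: the first agent appears twice with different offices.
INSERT INTO @MLS VALUES
    ('A1', 'Ann Lee', '555-0100', '1 Main St', 'ann@example.com', 'Old Office', '20170101'),
    ('A1', 'Ann Lee', '555-0100', '1 Main St', 'ann@example.com', 'New Office', '20170120'),
    ('B2', 'Bob Roy', '555-0101', '2 Oak Ave', 'bob@example.com', 'Oak Realty', '20170110');

-- Rank rows within each agent, newest first, and keep only the top-ranked row.
SELECT Agent_SK, Listing_Agent_Name, Office_Name, Update_Timestamp
FROM (
    SELECT Agent_SK, Listing_Agent_Name, Office_Name, Update_Timestamp,
           ROW_NUMBER() OVER (PARTITION BY Listing_Agent_Name,
                                           Listing_Agent_Phone,
                                           Listing_Agent_Address,
                                           Listing_Agent_Email
                              ORDER BY Update_Timestamp DESC) AS rn
    FROM @MLS
    WHERE Agent_SK IS NOT NULL
) AS t
WHERE t.rn = 1;
```

With this sample data the query returns one row per agent, and the row for Ann Lee carries 'New Office', the value from her latest update. If two rows for the same agent can share the same Update_Timestamp, which of them receives rn = 1 is arbitrary, so it may be worth adding a tie-breaking column to the ORDER BY.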