SQL Server duplicate records problem

Time: 2017-01-20 05:43:21

Tags: sql-server duplicates

I have two source tables, CustP and CustS (CustS is used only to store email IDs; that is how the current design works).

CustP

CustID  Fname      Lname        Phone
------------------------------------------
100     John        Doe         1234567890
200     John        Doe         1234567890
300     John        Doe         NULL

CustS

CustID  Fname       Lname       Email
--------------------------------------------
100     John        Doe         NULL
200     John        Doe         a@a.com   
300     John        Doe         a@a.com   

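I want to identify duplicate records across the two tables above using the following criterion:

  • (Fname, Lname and Phone match) or (Fname, Lname and Email match)

These are the steps I use to generate the duplicate result set: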

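-- Drop the temp table only if a previous run left it behind
if object_id('tempdb..#AllCustomer') is not null
    drop table #AllCustomer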

select 
    cp.CustID, cp.Fname, cp.Lname, cp.Phone, cs.Email,
    ROW_NUMBER() over (order by cp.Fname) RN
into 
    #AllCustomer
from 
    CustP cp
inner join 
    CustS cs on cp.CustID = cs.CustID

 -- Combine each customer and its matched customer into a temp table
 select   
     A.CustID CustID, B.CustID MatchedCustID,
     A.Fname FirstName,
     A.Lname SurName,
     A.Phone Phone,
     A.Email Email,
     B.Fname MatchedFirstName,
     B.Lname MatchedSurName,
     B.Phone MatchedPhone,
     B.Email MatchedEmail           
 into 
     #AllMatchedCustomers
 from 
     #AllCustomer A
 inner join
     #AllCustomer B on (A.Fname = B.Fname
                        and A.Lname = B.Lname
                        and A.CustID <> B.CustID
                        and A.RN < B.RN)   -- keeps each pair only once
 where 
     -- a NULL on either side never compares equal,
     -- so missing phones or emails do not count as a match
     A.Phone = B.Phone
     or A.Email = B.Email

The result looks like this:

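CustID  MatchedCustID  FirstName  SurName  Phone       Email    MatchedFirstName  MatchedSurName  MatchedPhone  MatchedEmail
----------------------------------------------------------------------------------------------------------------------------
100     200            John       Doe      1234567890  NULL     John              Doe             1234567890    a@a.com
200     300            John       Doe      1234567890  a@a.com  John              Doe             NULL          a@a.com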

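I need help with the following: if Cust100 matches Cust200 by phone number, and Cust200 is related to Cust300 by email, then the result should include one more row showing Cust100 and Cust300 (because if a = b and b = c, then a = c). In other words, the third row of the result set would be 100 and 300. Or are there alternative approaches? Thanks in advance for your help.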
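
One possible way to get the implied pairs is to treat each matched pair as an edge in a graph and give every customer the smallest CustID reachable from it; customers that end up with the same label belong to one duplicate group. The following is only a sketch under assumptions: it reads the CustID and MatchedCustID columns of #AllMatchedCustomers built above, and the names #Edges, #Groups, GroupID and @changed are hypothetical, introduced here for illustration.

```sql
-- Sketch only: #Edges, #Groups, GroupID and @changed are hypothetical names.
-- Assumes #AllMatchedCustomers(CustID, MatchedCustID) already exists.

-- Undirected edge list: every matched pair in both directions.
select CustID as A, MatchedCustID as B
into #Edges
from #AllMatchedCustomers
union
select MatchedCustID, CustID
from #AllMatchedCustomers;

-- Every customer starts in its own group, labelled with its own CustID.
select distinct A as CustID, A as GroupID
into #Groups
from #Edges;

-- Pull the smallest GroupID across each edge until no label changes.
declare @changed int = 1;
while @changed > 0
begin
    update g
    set g.GroupID = m.MinGroupID
    from #Groups g
    inner join (
        select e.A as CustID, min(n.GroupID) as MinGroupID
        from #Edges e
        inner join #Groups n on n.CustID = e.B
        group by e.A
    ) m on m.CustID = g.CustID
    where m.MinGroupID < g.GroupID;

    set @changed = @@ROWCOUNT;
end;

-- Every pair inside a group, including pairs that are only implied.
select g1.CustID, g2.CustID as MatchedCustID
from #Groups g1
inner join #Groups g2
    on g1.GroupID = g2.GroupID
   and g1.CustID < g2.CustID
order by g1.CustID, g2.CustID;
```

With the sample data, customers 100, 200 and 300 should all end up with GroupID 100, so the final query lists 100/200, 100/300 and 200/300. The loop runs once per "hop" in the longest chain of matches, which is usually small for duplicate detection; a recursive CTE is another option, but it needs extra care to avoid revisiting the same customers in a cycle.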

0 answers:

No answers yet