SQL - Select only conflicting values

Date: 2015-03-06 14:36:59

Tags: sql sql-server tsql

I have a table (SQL Server 2008 R2) with the following columns:

  • LastName
  • DOB
  • Zip
  • Address
  • Phone
  • Email

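I need to find the edit distance between all conflicting values (Zip, Address, Phone, Email) for rows that share the same LastName and DOB.

I already have a UDF for edit_distance, but I can't come up with the query.

Note: my table can contain more than two duplicates of the same person.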

2 answers:

Answer 0: (score: 1)

This is what you want:

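-- "table" stands in for the real table name.
-- SQL Server requires a schema-qualified name for scalar UDFs; dbo is assumed here.
select *
from table t join
     table t2
     on t.lastname = t2.lastname and t.dob = t2.dob and
        t.zip = t2.zip and t.address = t2.address
where dbo.edit_distance(t.phone, t2.phone) > @threshold or
      dbo.edit_distance(t.email, t2.email) > @threshold;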
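Because the question mentions more than two duplicates, note that a plain self-join returns every conflicting pair twice, once as (A, B) and once as (B, A). One way to report each pair only once is to number the rows within each group and join on the lower number. The following is only a sketch: the table name dbo.People, the dbo schema on the UDF, its integer return type, and the threshold value are all assumptions, not taken from the question.

```sql
-- Sketch only. Assumed names (not from the original post):
--   dbo.People              : the table holding the six columns
--   dbo.edit_distance(a, b) : the asker's scalar UDF, assumed to return an int
--   @threshold              : largest distance still treated as "the same value"
DECLARE @threshold int;
SET @threshold = 2;

WITH numbered AS (
    -- Number the rows inside each (LastName, DOB, Zip, Address) group.
    -- Ordering by the remaining columns means ties can only occur
    -- between rows that are identical in every column.
    SELECT LastName, DOB, Zip, Address, Phone, Email,
           ROW_NUMBER() OVER (PARTITION BY LastName, DOB, Zip, Address
                              ORDER BY Phone, Email) AS rn
    FROM dbo.People
)
SELECT a.LastName, a.DOB, a.Zip, a.Address,
       a.Phone AS PhoneA, b.Phone AS PhoneB,
       dbo.edit_distance(a.Phone, b.Phone) AS PhoneDistance,
       a.Email AS EmailA, b.Email AS EmailB,
       dbo.edit_distance(a.Email, b.Email) AS EmailDistance
FROM numbered AS a
JOIN numbered AS b
  ON  a.LastName = b.LastName
  AND a.DOB      = b.DOB
  AND a.Zip      = b.Zip
  AND a.Address  = b.Address
  AND a.rn       < b.rn   -- each unordered pair once, and no row paired with itself
WHERE dbo.edit_distance(a.Phone, b.Phone) > @threshold
   OR dbo.edit_distance(a.Email, b.Email) > @threshold;
```

With three duplicates of the same person this yields at most three rows (one per pair) instead of six.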

Answer 1: (score: 0)

You probably need a self-join.

-- SQL Server requires a schema-qualified name for scalar UDFs; dbo is assumed here.
select t1.*,
       dbo.edit_distance(t1.Phone, t2.Phone) as EditDistancePhone,
       dbo.edit_distance(t1.Email, t2.Email) as EditDistanceEmail
from Table1 t1
inner join Table1 t2
on t1.LastName = t2.LastName
and t1.DOB = t2.DOB
and t1.Zip = t2.Zip
and t1.Address = t2.Address
-- Only pairs where BOTH Phone and Email differ are returned;
-- combine the two conditions with OR to also catch pairs where just one differs.
and t1.Phone <> t2.Phone
and t1.Email <> t2.Email
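Since a scalar UDF is evaluated once per joined pair, it may be worth checking first which groups contain any conflict at all. A minimal sketch, again assuming the table is called dbo.People (a hypothetical name), could look like this:

```sql
-- Sketch only: dbo.People is an assumed table name.
-- Lists each (LastName, DOB, Zip, Address) group that holds more than one
-- distinct Phone or Email. COUNT(DISTINCT ...) ignores NULLs.
SELECT LastName, DOB, Zip, Address,
       COUNT(*)              AS RowsInGroup,
       COUNT(DISTINCT Phone) AS DistinctPhones,
       COUNT(DISTINCT Email) AS DistinctEmails
FROM dbo.People
GROUP BY LastName, DOB, Zip, Address
HAVING COUNT(DISTINCT Phone) > 1
    OR COUNT(DISTINCT Email) > 1;
```

Groups that do not appear in this result have nothing to compare, so the edit-distance join can be limited to the groups that do.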

More details about the UDF would definitely help people here provide better answers.