I have a table [SQL Server 2008 R2] with the following columns:
LastName
DOB
Zip
Address
Phone
Email
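For every group of rows that share the same LastName and DOB, I need to find the edit distance between their remaining fields (Zip, Address, Phone, Email).
I already have a UDF for edit_distance, but I can't come up with the query.
Note: my table can contain more than two duplicates.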
Answer 0 (score: 1)
This is what you want:
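-- Replace "table" with your table name. SQL Server requires a schema prefix
-- on scalar UDF calls; dbo is assumed here.
select *
from table t join
     table t2
     on t.lastname = t2.lastname and t.dob = t2.dob and
        t.zip = t2.zip and t.address = t2.address
where dbo.edit_distance(t.phone, t2.phone) > @threshhold or
      dbo.edit_distance(t.email, t2.email) > @threshhold;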
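The question asks for distances across Zip, Address, Phone, and Email for rows that share LastName and DOB, whereas the query above also requires Zip and Address to match exactly. A minimal sketch of the looser variant might look like the following. The table name dbo.Table1, the dbo schema on the UDF, and the assumption that all four columns are strings are mine, not the asker's.

```sql
-- Assumptions (not from the original post): the table is called dbo.Table1,
-- the UDF lives in the dbo schema, and all four compared columns are strings.
SELECT t1.LastName,
       t1.DOB,
       dbo.edit_distance(t1.Zip,     t2.Zip)     AS EditDistanceZip,
       dbo.edit_distance(t1.Address, t2.Address) AS EditDistanceAddress,
       dbo.edit_distance(t1.Phone,   t2.Phone)   AS EditDistancePhone,
       dbo.edit_distance(t1.Email,   t2.Email)   AS EditDistanceEmail
FROM dbo.Table1 AS t1
JOIN dbo.Table1 AS t2
  ON t1.LastName = t2.LastName
 AND t1.DOB      = t2.DOB
-- Keep only pairs that actually differ in at least one compared column.
WHERE t1.Zip <> t2.Zip
   OR t1.Address <> t2.Address
   OR t1.Phone <> t2.Phone
   OR t1.Email <> t2.Email;
```

Note that each conflicting pair shows up twice here (once as A, B and once as B, A), and that a comparison involving NULL is never true, so rows with NULL in a compared column may drop out unless you handle them explicitly.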
Answer 1 (score: 0)
You probably need a self join.
-- dbo is an assumed schema; SQL Server requires a schema prefix on scalar UDF calls.
select t1.*,
       dbo.edit_distance(t1.Phone, t2.Phone) as EditDistancePhone,
       dbo.edit_distance(t1.Email, t2.Email) as EditDistanceEmail
from Table1 t1
inner join Table1 t2
    on t1.LastName = t2.LastName
   and t1.DOB = t2.DOB
   and t1.Zip = t2.Zip
   and t1.Address = t2.Address
   -- both Phone and Email must differ here; wrap the two conditions in
   -- (... or ...) to also catch rows where only one of them differs
   and t1.Phone <> t2.Phone
   and t1.Email <> t2.Email
More details about the UDF would certainly help people here give better answers.
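Since the question mentions that there can be more than two duplicates, a plain self join returns every pair twice (and, without an inequality filter, each row paired with itself). One possible way around that, sketched below, is to number the rows within each LastName/DOB group and join only where the first number is lower than the second. The column list shows no key, so rn is a synthetic tie-breaker introduced for illustration; dbo.Table1 and the dbo schema on the UDF are assumptions as well.

```sql
-- Assumptions: dbo.Table1 is the table from the answer above;
-- dbo.edit_distance is the asker's scalar UDF (schema assumed).
WITH Numbered AS (
    SELECT LastName, DOB, Zip, Address, Phone, Email,
           -- rn is a synthetic row number within each LastName/DOB group
           ROW_NUMBER() OVER (PARTITION BY LastName, DOB
                              ORDER BY Zip, Address, Phone, Email) AS rn
    FROM dbo.Table1
)
SELECT a.LastName,
       a.DOB,
       a.rn AS RowA,
       b.rn AS RowB,
       dbo.edit_distance(a.Phone, b.Phone) AS EditDistancePhone,
       dbo.edit_distance(a.Email, b.Email) AS EditDistanceEmail
FROM Numbered AS a
JOIN Numbered AS b
  ON a.LastName = b.LastName
 AND a.DOB      = b.DOB
 AND a.rn < b.rn;   -- each unordered pair once, no self-pairs
```

A group of n rows with the same LastName and DOB yields n(n-1)/2 result rows, so three duplicates produce three pairs. If the table does have a primary key, comparing on that key (a.Id < b.Id, say) would be simpler than ROW_NUMBER.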