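I have 2 databases with the same structure but different data. Both are SQL Server 2005.
I am trying to find which people in DatabaseA also exist in DatabaseB. My best chance of a match is on FirstName and LastName.
I only want to bring back a list of:
DatabaseA.Person DatabaseB.Person
Where:
1. I want ALL records from DatabaseA, even if there is no match in DatabaseB.
2. I only want records from DatabaseB where the FirstName / LastName matches exactly one record in DatabaseB.
I have written a query that groups, but because I need to see more data than just FirstName and LastName, I cannot bring those columns back without grouping on them as well, which gives me a lot of duplicates. What kind of query should I use? Do I need to use a cursor?
Here is my current query, which sort of works, except that I get duplicate results from DatabaseB, and all I want to know about DatabaseB is where the FirstName / LastName matches one distinct record and no others. My goal is to get a list of people who I know are the same person in the 2 databases, so that I can build a dictionary of department-code mappings between the employees.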
select
count(DatabaseAEmployee.id) as matchcount
, DatabaseAPerson.id as DatabaseAPersonid
, DatabaseAEmployee.DeptCode DatabaseADeptCode
, DatabaseAPerson.firstname as DatabaseAfirst
, DatabaseAPerson.lastname as DatabaseAlast
, DatabaseBPerson.id as DatabaseBPersonid
, DatabaseBEmployee.DeptCode as DatabaseBDeptCode
, DatabaseBPerson.firstname as DatabaseBfirst
, DatabaseBPerson.lastname as DatabaseBlast
, DatabaseAPerson.ssn as DatabaseAssn
, DatabaseBPerson.ssn as DatabaseBssn
, DatabaseAPerson.dateofbirth as DatabaseAdob
, DatabaseBPerson.dateofbirth as DatabaseBdob
FROM [DatabaseA].[dbo].Employee DatabaseAEmployee
LEFT OUTER JOIN [DatabaseA].[dbo].Person DatabaseAPerson
ON DatabaseAPerson.id = DatabaseAEmployee.id
LEFT OUTER JOIN [DatabaseB].[dbo].Person DatabaseBPerson
ON
DatabaseAPerson.firstname = DatabaseBPerson.firstname
AND
DatabaseAPerson.lastname = DatabaseBPerson.lastname
LEFT OUTER JOIN [DatabaseB].[dbo].Employee DatabaseBEmployee
on DatabaseBEmployee.id = DatabaseBPerson.id
group by
DatabaseAPerson.firstname
, DatabaseAPerson.lastname
, DatabaseAPerson.id
, DatabaseAEmployee.DeptCode
, DatabaseBPerson.id
, DatabaseBEmployee.DeptCode
, DatabaseBPerson.firstname
, DatabaseBPerson.lastname
, DatabaseBPerson.ssn
, DatabaseAPerson.ssn
, DatabaseBPerson.dateofbirth
, DatabaseAPerson.dateofbirth
Here is what I am trying now, but I get duplicates on the left side:
with UniqueMatchedPersons (Id, FirstName, LastName)
as (
select
p2.ID, p2.FirstName, p2.LastName
from
[DatabaseA].[dbo].[Employee] p1
INNER JOIN [DatabaseA].[dbo].[Person] p2 on p1.id = p2.id
inner join [DatabaseB].[dbo].[Person] p3
on p2.FirstName = p3.FirstName and p2.LastName = p3.LastName
INNER JOIN [DatabaseB].[dbo].[Employee] p4
on p3.id = p4.id
group by p2.ID, p2.FirstName, p2.LastName
having count(p2.ID) = 1
)
select p1.*, p2.*
from DatabaseA.dbo.Person p1
inner join UniqueMatchedPersons on p1.ID = UniqueMatchedPersons.ID
left outer join DatabaseB.dbo.Person p2
on p1.FirstName = p2.FirstName and p1.LastName = p2.LastName
Answer 0 (score: 2)
Try this:
SELECT id,FirstName,Lastname
FROM dba.Persons
UNION
SELECT b.id,b.FirstName,b.LastName
FROM dbb.Persons as b
INNER JOIN dba.Persons as a
ON b.FirstName = a.FirstName AND b.LastName = a.LastName
If you want everything from A, and only those rows from B that have no match (which makes more sense to me), I would use this:
SELECT id,FirstName,Lastname
FROM dba.Persons
UNION
SELECT b.id,b.FirstName,b.LastName
FROM dbb.Persons as b
LEFT OUTER JOIN dba.Persons as a
ON b.FirstName = a.FirstName AND b.LastName = a.LastName
WHERE a.id is null
Answer 1 (score: 2)
Try something like:
Select dta.LastName, dta.FirstName, dta.[otherColumns], dtb.LastName, dtb.FirstName,
dtb.[otherColumns]
From [databaseA].[table] as dta
LEFT OUTER JOIN [databaseB].[table] as dtb
on dta.Lastname = dtb.LastName and dta.FirstName = dtb.FirstName
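That should get you: 1) everyone in table A, and 2) everyone in table B who has a last name / first name match in table A.
Answer 2 (score: 2)
This works in SQL Server (at least it should):
SELECT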
A.*
, B.*
FROM
DatabaseA.dbo.Person A
LEFT JOIN DatabaseB.dbo.Person B
ON A.FirstName = B.FirstName AND A.LastName = B.LastName
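Edit: You mentioned that you get duplicates from DatabaseB and that you only need to match on first and last name. But you are also asking for other data (beyond first / last name), and that is the problem. If you use DISTINCT, select only the data you actually need.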
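To see which names are responsible for the duplicates, a diagnostic query along these lines may help. It is a minimal sketch that assumes the `DatabaseB.dbo.Person` table and the `FirstName` / `LastName` columns from the question; the `NameCount` alias is only illustrative.

```sql
-- Sketch only: list FirstName/LastName pairs that occur more than once
-- in DatabaseB. Each such pair yields several joined rows for one
-- DatabaseA person when joining on name alone.
SELECT B.FirstName, B.LastName, COUNT(*) AS NameCount
FROM DatabaseB.dbo.Person B
GROUP BY B.FirstName, B.LastName
HAVING COUNT(*) > 1
ORDER BY NameCount DESC;
```

Any name returned here cannot be matched unambiguously on first and last name alone.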
Answer 3 (score: 2)
Using Transact-SQL, the following untested query should let you see only the unique matches:
select
p1.ID, p1.FirstName, p1.LastName
from
[DatabaseA].[dbo].[Persons] p1
left outer join [DatabaseB].[dbo].[Persons] p2
on p1.FirstName = p2.FirstName and p1.LastName = p2.LastName
group by p1.ID, p1.FirstName, p1.LastName
having count(p1.ID) = 1
If you are using SQL Server, you can wrap this in a common table expression, which you can then join against.
with UniqueMatchedPersons (Id, FirstName, LastName)
as (
--query in previous code snippet
)
select persons.*
from Persons
inner join UniqueMatchedPersons on Persons.ID = UniqueMatchedPersons.ID
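Update:
If you want to select fields from both tables, you only need to re-specify the original join condition that evaluated the name match; this works because duplicate matches on the left side of the join have already been filtered out by the having aggregate condition.
Modify the select part of the snippet above to read as follows, and you can select fields from either side of the join: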
select p1.*, p2.*
from [DatabaseA].[dbo].[Persons] p1
inner join UniqueMatchedPersons on p1.ID = UniqueMatchedPersons.ID
left outer join [DatabaseB].[dbo].[Persons] p2
on p1.FirstName = p2.FirstName and p1.LastName = p2.LastName
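Update 2:
To filter out duplicates on the left side (which also cause duplicates on the right), you have to remove the grouping on [DatabaseA].[dbo].[Persons].[ID].
When I refer to duplicates, I mean names that are identical across rows in both characters and padding. If you have diacritic variants of first and last names, the result of the name comparison is subject to the database collation (unless you explicitly declare a collation on the join expression). Likewise, if the names vary in spacing, padding, or punctuation, you may want to consider an approach to name matching other than the plain equality operator.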
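As an illustration of the collation and padding point, the join condition could be written roughly as follows. This is only a sketch: the collation `Latin1_General_CI_AI` (case-insensitive, accent-insensitive) is an assumption and should be replaced with whatever suits your data, and the column aliases are illustrative.

```sql
-- Sketch only: name match that ignores case, accents, and
-- leading/trailing spaces. The collation name is an assumption.
SELECT p1.ID AS DatabaseAPersonId, p2.ID AS DatabaseBPersonId
FROM [DatabaseA].[dbo].[Person] p1
LEFT OUTER JOIN [DatabaseB].[dbo].[Person] p2
    ON  LTRIM(RTRIM(p1.FirstName)) COLLATE Latin1_General_CI_AI
      = LTRIM(RTRIM(p2.FirstName)) COLLATE Latin1_General_CI_AI
    AND LTRIM(RTRIM(p1.LastName))  COLLATE Latin1_General_CI_AI
      = LTRIM(RTRIM(p2.LastName))  COLLATE Latin1_General_CI_AI;
```

Wrapping the columns in functions will likely prevent index seeks on the name columns, so this form is better suited to a one-off mapping exercise than to a frequently run query.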
Try the following:
with UniqueMatchedPersons (FirstName, LastName)
as (
select
p1.FirstName, p1.LastName
from
[DatabaseA].[dbo].[Person] p1
left outer join [DatabaseB].[dbo].[Person] p2
on p1.FirstName = p2.FirstName and p1.LastName = p2.LastName
group by p1.FirstName, p1.LastName
having count(p1.FirstName) = 1
)
select p1.*, p2.*, e1.*, e2.*
from [DatabaseA].[dbo].[Person] p1
inner join UniqueMatchedPersons ump
on p1.FirstName = ump.FirstName and p1.LastName = ump.LastName
left outer join [DatabaseB].[dbo].[Person] p2
on p1.FirstName = p2.FirstName and p1.LastName = p2.LastName
inner join [DatabaseA].[dbo].[Employee] e1 on p1.ID = e1.ID
inner join [DatabaseB].[dbo].[Employee] e2 on e2.ID = p2.ID
order by p1.id asc
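If the goal is read literally as "every DatabaseA person, plus the DatabaseB person only when that name occurs exactly once in DatabaseB", one possible variation is to compute the unique names on the DatabaseB side and left join through them. This is an untested sketch under that reading; `UniqueB` and the aliases `a`, `b`, and `u` are names introduced here for illustration.

```sql
-- Sketch only: names that occur exactly once in DatabaseB.
WITH UniqueB (FirstName, LastName)
AS (
    SELECT FirstName, LastName
    FROM [DatabaseB].[dbo].[Person]
    GROUP BY FirstName, LastName
    HAVING COUNT(*) = 1
)
-- Keep every DatabaseA person; DatabaseB columns are NULL unless the
-- name is unique in DatabaseB.
SELECT a.*, b.*
FROM [DatabaseA].[dbo].[Person] a
LEFT OUTER JOIN UniqueB u
    ON a.FirstName = u.FirstName AND a.LastName = u.LastName
LEFT OUTER JOIN [DatabaseB].[dbo].[Person] b
    ON u.FirstName = b.FirstName AND u.LastName = b.LastName
ORDER BY a.id;
```

Because both joins are outer joins, people in DatabaseA without a unique counterpart stay in the result with NULLs on the DatabaseB side. If the same name appears more than once in DatabaseA, each of those rows is paired with the same DatabaseB person, so an additional uniqueness check on the DatabaseA side may be needed before trusting the mapping.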