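G'day,
We have two tables with exactly the same structure. There are two columns, "PrimaryAddress" and "AliasAddress". These hold email addresses and their aliases. We want to find any records that need to be added to either side to keep the two tables in sync. The problem is that the primary address in one table may be listed as an alias in the other. The good news is that an address never appears twice in the "AliasAddress" column.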
TABLE A
PrimaryAddress~~~~~AliasAdress
chris@work~~~~~~~~~chris@home
chris@work~~~~~~~~~c@work
chris@work~~~~~~~~~theboss@work
chris@work~~~~~~~~~thatguy@aol
bob@test~~~~~~~~~~~test1@test
bob@test~~~~~~~~~~~charles@work
bob@test~~~~~~~~~~~chuck@aol
sally@mars~~~~~~~~~sally@nasa
sally@mars~~~~~~~~~sally@gmail
TABLE B
PrimaryAddress~~~~~AliasAdress
chris@home~~~~~~~~~chris@work
chris@home~~~~~~~~~c@work
chris@home~~~~~~~~~theboss@work
chris@home~~~~~~~~~thatguy@aol
bob@test~~~~~~~~~~~test1@test
bob@test~~~~~~~~~~~charles@work
sally@nasa~~~~~~~~~sally@mars
sally@nasa~~~~~~~~~sally@gmail
sally@nasa~~~~~~~~~ripley@nostromo
The expected result is that the following missing records are returned from the two tables:
bob@test~~~~~~~~~~~chuck@aol
sally@nasa~~~~~~~~~ripley@nostromo
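Note that the chris@* block is a perfect match, because the full set of aliases (plus the primary address) is the same regardless of which address is treated as the primary. It does not matter which address is the primary, because the group as a whole contains the same entries in both tables.
I don't mind if this runs in two passes, A->B and then B->A, but I just can't get my head around the problem.
Any help is appreciated :)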
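Answer 0 (score: 0)
So you want to compare tables A and B and find the rows that have no match in the other table. How about an outer join, then looking for NULL values: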
SELECT ta.*, tb.*
FROM table_a ta
FULL OUTER JOIN table_b tb ON tb.PrimaryAddress = ta.PrimaryAddress
AND tb.AliasAddress = ta.AliasAddress
WHERE ta.PrimaryAddress IS NULL
OR tb.PrimaryAddress IS NULL
If I have understood the question correctly, this should return what you are asking for.
Answer 1 (score: 0)
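-- Drop the temp table only if it already exists, so the first run does not fail
IF OBJECT_ID('tempdb..#TABLEA') IS NOT NULL DROP TABLE #TABLEA;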
CREATE TABLE #TABLEA
([PrimaryAddress] varchar(10), [AliasAdress] varchar(12))
;
INSERT INTO #TABLEA
([PrimaryAddress], [AliasAdress])
VALUES
('chris@work', 'chris@home'),
('chris@work', 'c@work'),
('chris@work', 'theboss@work'),
('chris@work', 'thatguy@aol'),
('bob@test', 'test1@test'),
('bob@test', 'charles@work'),
('bob@test', 'chuck@aol'),
('sally@mars', 'sally@nasa'),
('sally@mars', 'sally@gmail')
;
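-- Drop the temp table only if it already exists, so the first run does not fail
IF OBJECT_ID('tempdb..#TABLEB') IS NOT NULL DROP TABLE #TABLEB;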
CREATE TABLE #TABLEB
([PrimaryAddress] varchar(10), [AliasAdress] varchar(15))
;
INSERT INTO #TABLEB
([PrimaryAddress], [AliasAdress])
VALUES
('chris@home', 'chris@work'),
('chris@home', 'c@work'),
('chris@home', 'theboss@work'),
('chris@home', 'thatguy@aol'),
('bob@test', 'test1@test'),
('bob@test', 'charles@work'),
('sally@nasa', 'sally@mars'),
('sally@nasa', 'sally@gmail'),
('sally@nasa', 'ripley@nostromo')
;
Try the following:
select a.PrimaryAddress, a.AliasAdress
from #TABLEA a
left join #TABLEB b on a.AliasAdress = b.AliasAdress or b.PrimaryAddress = a.AliasAdress
where b.PrimaryAddress is null
union all
select a.PrimaryAddress, a.AliasAdress
from #TABLEB a
left join #TABLEA b on a.AliasAdress = b.AliasAdress or b.PrimaryAddress = a.AliasAdress
where b.PrimaryAddress is null
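The same check can also be written with NOT EXISTS, which some people find easier to read than a LEFT JOIN followed by a NULL test. This is only a sketch against the #TABLEA and #TABLEB temp tables created above. It treats a row as "present" when its alias shows up in the other table as either an alias or a primary.

```sql
-- Rows of A whose alias appears nowhere in B (neither as alias nor as primary)
SELECT a.PrimaryAddress, a.AliasAdress
FROM #TABLEA a
WHERE NOT EXISTS (
    SELECT 1
    FROM #TABLEB b
    WHERE b.AliasAdress = a.AliasAdress
       OR b.PrimaryAddress = a.AliasAdress
)
UNION ALL
-- Rows of B whose alias appears nowhere in A
SELECT b.PrimaryAddress, b.AliasAdress
FROM #TABLEB b
WHERE NOT EXISTS (
    SELECT 1
    FROM #TABLEA a
    WHERE a.AliasAdress = b.AliasAdress
       OR a.PrimaryAddress = b.AliasAdress
);
```

With the sample data from the question, this should return the two expected rows, bob@test / chuck@aol and sally@nasa / ripley@nostromo.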
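Answer 2 (score: 0)
Here is how I would approach it, although I throw my hands up at the last part.
Step one: identify the sets of items to compare. That is:
A set in a table (A or B) is identified by its primary value. The really hard part is that the two tables do not necessarily share primary values (sally@mars vs. sally@nasa). So we can compare the sets, but we have to be able to get back to the primary used in each table separately (for example, the item flagged in table B might be sally@nasa / ripley@nostromo, but what we have to add to table A is sally@mars / ripley@nostromo).
The major problem arises if, within one table, a primary value also appears as an alias of another primary value (for example, in table A, chris@work as an alias of bob@test). For sanity's sake I will assume this does not happen... but if it does, the problem gets a lot trickier.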
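Before relying on that assumption, it may be worth testing it. The following is a minimal sketch, assuming the table is named A with the PrimaryAddress and AliasAdress columns used in this answer. The output column names are made up for illustration.

```sql
-- Any row returned here means a primary in A is also listed as an alias
-- under a different primary in A, which breaks the assumption above.
SELECT DISTINCT
       a1.PrimaryAddress AS PrimaryUsedAsAlias,  -- hypothetical column name
       a2.PrimaryAddress AS ListedUnderPrimary   -- hypothetical column name
FROM A a1
INNER JOIN A a2
    ON a2.AliasAdress = a1.PrimaryAddress
   AND a2.PrimaryAddress <> a1.PrimaryAddress;
```

An empty result means the assumption holds for A. Running the same query with B in place of A checks the other table.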
This query adds to B the items that are in A but missing from B, for the case where A and B use the same PrimaryAddress:
;WITH setA (SetId, FullSet)
as (-- Complete sets in A
select PrimaryAddress, AliasAdress
from A
union select PrimaryAddress, PrimaryAddress
from A
)
,setB (SetId, FullSet)
as (-- Complete sets in B
select PrimaryAddress, AliasAdress
from B
union select PrimaryAddress, PrimaryAddress
from B
)
,NotInB (Missing)
as (-- What's in A that's not in B
select FullSet
from setA
except select FullSet -- This is the secret sauce. Definitely worth your time to read up on how EXCEPT works.
from setB
)
-- Take the missing values plus their primaries from A and load them into B
INSERT B (PrimaryAddress, AliasAdress)
select A.PrimaryAddress, nB.Missing
from NotInB nB
inner join A
on A.AliasAdress = nb.Missing
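Run it again the other way round (with the "NotInB" step swapped to find what is in B but not in A) to do the same for A.
However, doing this with the sample data for the "in B, not in A" direction would add (sally@nasa, ripley@nostromo) to A, and since that is a different primary it would create a new set, so it does not solve the problem. It gets ugly fast. Talking it through from here: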
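One possible way around the mismatched primaries is to map each set in B to the set in A that shares at least one address with it, and then re-key the missing addresses to A's primary. This is only a sketch, not part of the original answer. The PrimaryMap CTE and the Member column are names introduced here for illustration. It relies on the assumption made earlier that a primary never appears as an alias of a different primary within the same table, and it silently skips any set in B that shares no address with A.

```sql
;WITH setA (SetId, Member) AS (
    -- Complete sets in A: every alias plus the primary itself
    SELECT PrimaryAddress, AliasAdress FROM A
    UNION
    SELECT PrimaryAddress, PrimaryAddress FROM A
),
setB (SetId, Member) AS (
    -- Complete sets in B
    SELECT PrimaryAddress, AliasAdress FROM B
    UNION
    SELECT PrimaryAddress, PrimaryAddress FROM B
),
PrimaryMap (BPrimary, APrimary) AS (
    -- Pair each B primary with the A primary whose set shares an address with it
    SELECT DISTINCT sb.SetId, sa.SetId
    FROM setB sb
    INNER JOIN setA sa ON sa.Member = sb.Member
)
-- Addresses in a B set that do not exist anywhere in A, re-keyed to A's primary
SELECT pm.APrimary AS PrimaryAddress, sb.Member AS AliasAdress
FROM setB sb
INNER JOIN PrimaryMap pm ON pm.BPrimary = sb.SetId
WHERE NOT EXISTS (SELECT 1 FROM setA sa WHERE sa.Member = sb.Member);
```

With the sample data this should return (sally@mars, ripley@nostromo), which is the row that has to be added to A. Putting `INSERT A (PrimaryAddress, AliasAdress)` in front of the final SELECT would load the rows instead of just listing them, and swapping A and B throughout gives the other direction.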
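Answer 3 (score: 0)
OK, this is how we ended up doing it... When it became too painful, we ran a process that added each entry's primary address as an alias of itself (xx@xx -> xx@xx), so that every address is listed as an alias for each user. This is similar to what @Phillip Kelly did. Then we ran the following code (it is messy, but it works, and it also does it in one pass):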
SELECT 'Missing from B:' as Reason, TableA.[primary] as APrimary, TableA.[alias] as AAlias, TableB.[primary] as BPrimary, TableB.[alias] as BAlias
into #A
FROM dbo.TableA LEFT OUTER JOIN TableB ON TableB.alias = TableA.alias
SELECT 'Missing from A:' as Reason, TableA.[primary] as APrimary, TableA.[alias] as AAlias, TableB.[primary] as BPrimary, TableB.[alias] as BAlias
into #B
FROM dbo.TableB LEFT OUTER JOIN TableA ON TableA.alias = TableB.alias
select * from #A
select * from #B
UPDATE #A
SET #A.APrimary = #B.BPrimary
FROM #B INNER JOIN #A ON #A.APrimary = #B.BPrimary
WHERE #A.BPrimary IS NULL
UPDATE #B
SET #B.BPrimary = #A.APrimary
FROM #B INNER JOIN #A ON #B.BPrimary = #A.BPrimary
WHERE #B.APrimary IS NULL
select * from #A
select * from #B
select * into #result from (
select Reason, BPrimary as [primary], BAlias as [alias] from #B where APrimary IS NULL
union
select Reason, APrimary as [primary], AAlias as [alias] from #A where BPrimary IS NULL
) as tmp
select * from #result
drop table #A
drop table #B
drop table #result
GO
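The preparatory step described at the start of this answer (adding each primary address as an alias of itself) is not shown in the code. A minimal sketch of it, assuming the same dbo.TableA table and [primary] / [alias] columns used above, might look like this:

```sql
-- Add each primary as an alias of itself, but only where that row is not already present
INSERT INTO dbo.TableA ([primary], [alias])
SELECT DISTINCT a.[primary], a.[primary]
FROM dbo.TableA a
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.TableA x
    WHERE x.[primary] = a.[primary]
      AND x.[alias] = a.[primary]
);
```

The same statement with TableB in place of TableA would prepare the other table. The NOT EXISTS guard makes the step safe to run more than once.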