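I need to compare the data in two tables and find out which attributes do not match. Both tables have the same definition, but the problem is that I don't have a unique key to compare on. I tried using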
CONCAT(table1.A, table1.B)
= CONCAT(table2.A, table2.B)
but I still get duplicate rows. I also tried NVL on a few columns, but that didn't work either:
SELECT
UT.cat,
PD.cat
FROM
EM UT, EM_63 PD
WHERE
NVL(UT.cat, 1) = NVL(PD.cat, 1) AND
NVL(UT.AT_NUMBER, 1) = NVL(PD.AT_NUMBER, 1) AND
NVL(UT.OFFSET, 1) = NVL(PD.OFFSET, 1) AND
NVL(UT.PROD, 1) = NVL(PD.PROD, 1)
;
One table has 34k records and the other has 35k, but when I run the query above it returns 3 million rows.
Columns in the table:
COUNTRY
CATEGORY
TYPE
DESCRIPTION
Sample data:
Table 1:
COUNTRY  CATEGORY  TYPE  DESCRIPTION
US       C         T1    In
IN       A         T2    OUT
B        C         T2    IN
Y        C         T1    INOUT
Table 2:
COUNTRY  CATEGORY  TYPE  DESCRIPTION
US       C         T2    In
IN       B         T2    Out
Q        C         T2    IN
Expected output:
Column       Matched  Unmatched
COUNTRY      2        1
CATEGORY     2        1
TYPE         2        1
DESCRIPTION  3        0
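Answer 0 (score: 2)
This covers the most general case: you may have duplicate rows, you want to see which rows exist in one table but not in the other, and you also want to catch rows that exist in both tables but a different number of times (say, 3 times in the first table and 5 times in the other).
This is a very common problem, and it has a settled "best solution" that, for some reason, most people still seem to be unaware of, even though it was developed on AskTom many years ago and has been presented many times since.
You don't need a join, you don't need a unique key of any kind, and you don't need to read either table more than once. The idea is to add two columns that show which table each row came from, do a UNION ALL, then GROUP BY all columns except the "source" columns and show the count for each table. Something like this: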
select count(t_1) as count_table_1, count(t_2) as count_table_2, col1, col2, ...
from (
select 'x' as t_1, null as t_2, col1, col2, ...
from table_1
union all
select null as t_1, 'x' as t_2, col1, col2, ...
from table_2
)
group by col1, col2, ...
having count(t_1) != count(t_2)
;
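To make the pattern concrete, here is a minimal runnable sketch. It is an illustration under assumptions, not the answerer's code: it uses Python's built-in sqlite3 module in place of Oracle, and the two-column tables and their rows are hypothetical sample data.

```python
import sqlite3

def diff_tables(rows_1, rows_2):
    """Return groups of identical rows whose counts differ between the tables."""
    con = sqlite3.connect(":memory:")
    # Hypothetical two-column tables standing in for table_1 / table_2.
    con.execute("create table table_1 (col1 text, col2 integer)")
    con.execute("create table table_2 (col1 text, col2 integer)")
    con.executemany("insert into table_1 values (?, ?)", rows_1)
    con.executemany("insert into table_2 values (?, ?)", rows_2)
    query = """
        select count(t_1) as count_table_1, count(t_2) as count_table_2, col1, col2
        from (
            select 'x' as t_1, null as t_2, col1, col2 from table_1
            union all
            select null as t_1, 'x' as t_2, col1, col2 from table_2
        )
        group by col1, col2
        having count(t_1) != count(t_2)
        order by col1, col2
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# ('a', 1) appears twice in the first table and once in the second;
# ('b', 2) appears once in each, so it is filtered out by HAVING.
rows_1 = [("a", 1), ("a", 1), ("b", 2), ("c", 3)]
rows_2 = [("a", 1), ("b", 2), ("d", 4)]
for row in diff_tables(rows_1, rows_2):
    print(row)
```

Each returned tuple is (count in table 1, count in table 2, col1, col2), so a zero in either count column means the row is missing from that table entirely.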
Answer 1 (score: 1)
Start with this query to check whether these 4 columns make up a key.
select occ_total, occ_ut, occ_pd
      ,count(*) as records
from  (-- one row per candidate-key value, with occurrence counts per table
       select count(*) as occ_total
             ,count(case tab when 'UT' then 1 end) as occ_ut
             ,count(case tab when 'PD' then 1 end) as occ_pd
       from  (          select 'UT' as tab, cat, AT_NUMBER, OFFSET, PROD from EM
              union all select 'PD'       , cat, AT_NUMBER, OFFSET, PROD from EM_63 PD
             ) t
       group by cat, AT_NUMBER, OFFSET, PROD
      ) t
group by occ_total, occ_ut, occ_pd
order by records desc
;
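The key check above can be tried out with a small sketch like the following. It rests on several assumptions: sqlite3 stands in for Oracle, the OFFSET column is left out for brevity, the rows are hypothetical, and the is_key helper is an illustrative name that is not part of the answer. If the chosen columns really form a key in both tables, every group should have occ_ut and occ_pd of at most 1.

```python
import sqlite3

def key_profile(em_rows, em_63_rows):
    """Histogram of (occ_total, occ_ut, occ_pd) over candidate-key values."""
    con = sqlite3.connect(":memory:")
    # Reduced, hypothetical versions of EM and EM_63 (OFFSET column omitted).
    con.execute("create table em (cat text, at_number integer, prod text)")
    con.execute("create table em_63 (cat text, at_number integer, prod text)")
    con.executemany("insert into em values (?, ?, ?)", em_rows)
    con.executemany("insert into em_63 values (?, ?, ?)", em_63_rows)
    query = """
        select occ_total, occ_ut, occ_pd, count(*) as records
        from (
            select count(*) as occ_total,
                   count(case tab when 'UT' then 1 end) as occ_ut,
                   count(case tab when 'PD' then 1 end) as occ_pd
            from (
                select 'UT' as tab, cat, at_number, prod from em
                union all
                select 'PD', cat, at_number, prod from em_63
            ) t
            group by cat, at_number, prod
        ) t
        group by occ_total, occ_ut, occ_pd
        order by records desc, occ_total desc, occ_ut desc
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

def is_key(profile):
    """True if no candidate-key value occurs more than once in either table."""
    return all(occ_ut <= 1 and occ_pd <= 1 for _, occ_ut, occ_pd, _ in profile)

# ('A', 1, 'p') is duplicated in em, so the columns are not a key there.
em_rows = [("A", 1, "p"), ("A", 1, "p"), ("B", 2, "q"), ("C", 3, "r"), ("E", 5, "t")]
em_63_rows = [("A", 1, "p"), ("B", 2, "q"), ("D", 4, "s"), ("E", 5, "t")]
profile = key_profile(em_rows, em_63_rows)
print(profile, is_key(profile))
```

A profile row such as (3, 2, 1, 1) reads as: one key value occurs three times in total, twice in the first table and once in the second, which is exactly the kind of duplicate that makes a join multiply rows.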
Once you have chosen a "key", you can use the following query to look at the attributes' values:
select count(*) as occ_total
      ,count(case tab when 'UT' then 1 end) as occ_ut
      ,count(case tab when 'PD' then 1 end) as occ_pd
      -- more than one distinct value means the attribute differs within this key
      ,count(distinct att1) as cnt_dst_att1
      ,count(distinct att2) as cnt_dst_att2
      ,count(distinct att3) as cnt_dst_att3
      ,...
      -- the actual values seen in each table, side by side
      ,listagg(case tab when 'UT' then att1 end) within group (order by att1) as att1_vals_ut
      ,listagg(case tab when 'PD' then att1 end) within group (order by att1) as att1_vals_pd
      ,listagg(case tab when 'UT' then att2 end) within group (order by att2) as att2_vals_ut
      ,listagg(case tab when 'PD' then att2 end) within group (order by att2) as att2_vals_pd
      ,listagg(case tab when 'UT' then att3 end) within group (order by att3) as att3_vals_ut
      ,listagg(case tab when 'PD' then att3 end) within group (order by att3) as att3_vals_pd
      ,...
from  (          select 'UT' as tab, cat, AT_NUMBER, OFFSET, PROD, att1, att2, att3, ... from EM
       union all select 'PD'       , cat, AT_NUMBER, OFFSET, PROD, att1, att2, att3, ... from EM_63 PD
      ) t
group by cat, AT_NUMBER, OFFSET, PROD
;
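As a rough illustration of the attribute comparison, the sketch below runs the same shape of query on a tiny data set. The assumptions are significant: sqlite3 is used instead of Oracle, so SQLite's group_concat stands in for LISTAGG; the key is reduced to the hypothetical pair (cat, prod); and there is a single hypothetical attribute att1.

```python
import sqlite3

def compare_attributes(em_rows, em_63_rows):
    """Per key value: occurrence counts and the att1 values seen in each table."""
    con = sqlite3.connect(":memory:")
    # Hypothetical reduced tables: key columns (cat, prod) plus one attribute.
    con.execute("create table em (cat text, prod text, att1 text)")
    con.execute("create table em_63 (cat text, prod text, att1 text)")
    con.executemany("insert into em values (?, ?, ?)", em_rows)
    con.executemany("insert into em_63 values (?, ?, ?)", em_63_rows)
    query = """
        select cat, prod,
               count(*) as occ_total,
               count(case tab when 'UT' then 1 end) as occ_ut,
               count(case tab when 'PD' then 1 end) as occ_pd,
               count(distinct att1) as cnt_dst_att1,
               group_concat(case tab when 'UT' then att1 end) as att1_vals_ut,
               group_concat(case tab when 'PD' then att1 end) as att1_vals_pd
        from (
            select 'UT' as tab, cat, prod, att1 from em
            union all
            select 'PD', cat, prod, att1 from em_63
        ) t
        group by cat, prod
        order by cat, prod
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Key ('A', 'p') agrees on att1; key ('B', 'q') does not.
em_rows = [("A", "p", "red"), ("B", "q", "blue")]
em_63_rows = [("A", "p", "red"), ("B", "q", "green")]
report = compare_attributes(em_rows, em_63_rows)
for row in report:
    print(row)
```

Rows where cnt_dst_att1 is greater than 1 are the keys whose att1 value differs between the two tables, and the last two columns show which value came from which side.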
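Answer 2 (score: 0)
The problem with CONCAT is that you can get false matches if your data looks like this:
table1.A = '123'
table1.B = '456'
concatenates to: '123456'
table2.A = '12'
table2.B = '3456'
also concatenates to: '123456'
You have to compare the fields individually: table1.A = table2.A AND table1.B = table2.B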
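The false match is easy to reproduce. The following sketch assumes SQLite (via Python's sqlite3), where the || operator plays the role of CONCAT; the compare_rows helper is a hypothetical name used only for this illustration.

```python
import sqlite3

def compare_rows(a1, b1, a2, b2):
    """Return (concat_match, fieldwise_match) for two (A, B) pairs."""
    con = sqlite3.connect(":memory:")
    # Comparison on the concatenated string, as in the question's attempt.
    (concat_match,) = con.execute(
        "select (? || ?) = (? || ?)", (a1, b1, a2, b2)
    ).fetchone()
    # Comparison field by field, as recommended above.
    (fieldwise_match,) = con.execute(
        "select (? = ?) and (? = ?)", (a1, a2, b1, b2)
    ).fetchone()
    con.close()
    return bool(concat_match), bool(fieldwise_match)

# Both pairs concatenate to '123456', yet neither field is equal.
print(compare_rows("123", "456", "12", "3456"))
```

With the values from the answer, the concatenated comparison reports a match while the field-by-field comparison does not.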