Inner join to check that tables contain the same values does not work as expected

Time: 2013-02-20 01:55:38

Tags: sql duplicates inner-join

SELECT COUNT(1) FROM own.no_preselection_1_a;
SELECT COUNT(1) FROM own.no_preselection_1;

SELECT COUNT(1) FROM
  (SELECT DISTINCT * FROM own.no_preselection_1_a
  );

SELECT COUNT(1) FROM
  (SELECT DISTINCT * FROM own.no_preselection_1
  );

SELECT COUNT(1)
FROM OWN.no_preselection_1 t1
INNER JOIN OWN.no_preselection_1_a t2
ON t1.number       = t2.number
AND t1.location_number = t2.location_number;

Returns:

COUNT(1)               
---------------------- 
398                    

COUNT(1)               
---------------------- 
398                    

COUNT(1)               
---------------------- 
308                    

COUNT(1)               
---------------------- 
308                    

COUNT(1)               
---------------------- 
578                    

Consider the visual explanation of joins here: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

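The problem is with the last query. I expected that if the sets are identical (that is, they overlap perfectly), the inner join would return a set the same size as the original sets.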

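Is the problem that each duplicate produces an entry for every one of its counterparts? (For example, if each table has 3 dupes of the same value, does the join create 3x3 = 9 entries for it?) What is the solution here? (Just select distinct before doing the inner join?) Is this a good test for checking whether two tables contain the same data?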

1 answer:

Answer 0 (score: 1)

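You have duplicates in your tables, as comparing the first count with the third, and the second with the fourth, makes clear: each table has 398 rows but only 308 distinct rows.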

The join is working as expected, so there is no "problem". What are you trying to achieve? A join does not meet your goal.
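To make the multiplication effect concrete, here is a minimal sketch using SQLite through Python's `sqlite3` module rather than your actual database. The tables `t1` and `t2` and the column `num` are hypothetical stand-ins for your tables and the `number` column. A key that occurs 3 times in each table contributes 3 x 3 = 9 joined rows, and joining the distinct rows instead removes that effect:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (num INTEGER, location_number INTEGER);
    CREATE TABLE t2 (num INTEGER, location_number INTEGER);
    -- key (1, 10) appears 3 times in each table; key (2, 20) appears once
    INSERT INTO t1 VALUES (1, 10), (1, 10), (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 10), (1, 10), (1, 10), (2, 20);
""")

# Plain inner join: every duplicate pairs with every matching duplicate.
JOIN_ALL = """
    SELECT COUNT(*)
    FROM t1
    INNER JOIN t2
      ON t1.num = t2.num AND t1.location_number = t2.location_number
"""

# Inner join over the distinct rows of each table.
JOIN_DISTINCT = """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT num, location_number FROM t1) a
    INNER JOIN (SELECT DISTINCT num, location_number FROM t2) b
      ON a.num = b.num AND a.location_number = b.location_number
"""

joined_rows = conn.execute(JOIN_ALL).fetchone()[0]
distinct_joined_rows = conn.execute(JOIN_DISTINCT).fetchone()[0]
print(joined_rows, distinct_joined_rows)  # 10 2  (3*3 + 1*1, then 1 + 1)
```

Note that the distinct version only tells you the two tables share the same set of keys; it says nothing about whether the duplicates occur the same number of times on each side.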

I suggest you annotate your question with some actual data and the results you want.

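If you want to show that two tables have the same values, you can try a union. This assumes that the two tables have the same columns and that the columns in a row uniquely identify that row: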

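-- count(*) = 2 means the row is in both tables; use "= 1" to list rows found in only one
select < all the columns in the tables >, min(t.which) as which
from ((select '1' as which, t.*
       from OWN.no_preselection_1 t
      ) union all
      (select '1-a' as which, t.*
       from OWN.no_preselection_1_a t
      )
     ) t
group by < all the columns in the tables >
having count(*) <> 1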
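As a runnable sketch of the union idea, again using SQLite via Python's `sqlite3` with hypothetical tables `t1` and `t2` (two columns each, rows unique within each table, which is the assumption stated above): a grouped row with a count of 2 exists in both tables, and a count of 1 means it exists in only one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (num INTEGER, location_number INTEGER);
    CREATE TABLE t2 (num INTEGER, location_number INTEGER);
    -- rows are unique within each table (the assumption of this approach)
    INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO t2 VALUES (1, 10), (2, 20), (4, 40);
""")

# Stack both tables, tag each row with its source, then group on all columns.
QUERY = """
    SELECT num, location_number, MIN(which) AS which
    FROM (SELECT 't1' AS which, num, location_number FROM t1
          UNION ALL
          SELECT 't2' AS which, num, location_number FROM t2) u
    GROUP BY num, location_number
    HAVING COUNT(*) {condition}
    ORDER BY num
"""

in_both = conn.execute(QUERY.format(condition="<> 1")).fetchall()
only_one = conn.execute(QUERY.format(condition="= 1")).fetchall()
print(in_both)   # [(1, 10, 't1'), (2, 20, 't1')]
print(only_one)  # [(3, 30, 't1'), (4, 40, 't2')]
```

If `only_one` comes back empty, the two tables hold the same rows under that uniqueness assumption; the `which` column tells you where each unmatched row came from.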

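If you are limited to just those two columns and want to see whether there are corresponding entries (taking duplicates into account), the following works: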

-- seqnum numbers the duplicates of each key, so the n-th copy in one table pairs with the n-th copy in the other
select number, location_number, seqnum, count(*) as cnt
from ((select '1' as which, number, location_number,
              row_number() over (partition by number, location_number order by number) as seqnum
       from OWN.no_preselection_1 t
      ) union all
      (select '1-a' as which, number, location_number,
              row_number() over (partition by number, location_number order by number) as seqnum
       from OWN.no_preselection_1_a t
      )
     ) t
group by number, location_number, seqnum
having count(*) <> 1
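Here is a small sketch of the `row_number()` variant, run in SQLite through Python's `sqlite3` (window functions need SQLite 3.25 or later). The tables `t1` and `t2` and the column `num` are hypothetical. Because each duplicate gets its own `seqnum`, a surplus copy on either side ends up in a group of size 1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (num INTEGER, location_number INTEGER);
    CREATE TABLE t2 (num INTEGER, location_number INTEGER);
    -- t1 has three copies of (1, 10); t2 has only two, but two copies of (2, 20)
    INSERT INTO t1 VALUES (1, 10), (1, 10), (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 10), (1, 10), (2, 20), (2, 20);
""")

# Number the duplicates of each key per table, stack the tables, and group.
QUERY = """
    SELECT num, location_number, seqnum, MIN(which) AS which
    FROM (SELECT 't1' AS which, num, location_number,
                 ROW_NUMBER() OVER (PARTITION BY num, location_number
                                    ORDER BY num) AS seqnum
          FROM t1
          UNION ALL
          SELECT 't2' AS which, num, location_number,
                 ROW_NUMBER() OVER (PARTITION BY num, location_number
                                    ORDER BY num) AS seqnum
          FROM t2) u
    GROUP BY num, location_number, seqnum
    HAVING COUNT(*) {condition}
    ORDER BY num, location_number, seqnum
"""

matched = conn.execute(QUERY.format(condition="<> 1")).fetchall()
unmatched = conn.execute(QUERY.format(condition="= 1")).fetchall()
print(len(matched))  # 3
print(unmatched)     # [(1, 10, 3, 't1'), (2, 20, 2, 't2')]
```

With `<> 1`, as in the query above, you get the entries that have a counterpart in the other table; switching to `= 1` lists the copies that have none.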