How to compare two tables

Time: 2012-11-19 22:20:55

Tags: sql sql-server sql-server-2008

I have two tables (Original and Load) that I want to compare using a stored procedure.

The database is SQL Server 2008.

Here is my sample SP:

USE [TestDB]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[ValidateLoad]
AS
    SET NOCOUNT ON;

    -- Rows that exist in Original but are missing from Load
    IF EXISTS (SELECT *
               FROM   dbo.original
               EXCEPT
               SELECT *
               FROM   dbo.[LOAD])
      BEGIN
          PRINT 'Warning! The following information has not been loaded';
          PRINT '---------------------------------';
          PRINT 'Load result: Fail';

          SELECT *
          FROM   dbo.original
          EXCEPT
          SELECT *
          FROM   dbo.[LOAD];

          RETURN;  -- leave the procedure so that success is not reported
      END

    -- Rows that exist in Load but are missing from Original
    IF EXISTS (SELECT *
               FROM   dbo.[LOAD]
               EXCEPT
               SELECT *
               FROM   dbo.original)
      BEGIN
          PRINT 'Warning! The following information does not exist in Original table';
          PRINT '---------------------------------';
          PRINT 'Load result: Fail';

          SELECT *
          FROM   dbo.[LOAD]
          EXCEPT
          SELECT *
          FROM   dbo.original;

          RETURN;  -- leave the procedure so that success is not reported
      END

    -- Reached only when neither difference query returned rows
    PRINT 'Load result: Succeeded';

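I don't think this is efficient.

My goal is to verify that the two tables/data sets are identical and, if they are not, to output the differences along with a meaningful error message.

Any ideas?

Thanks.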

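3 answers:

Answer 0: (score: 0)

Have you considered using a MERGE statement to compare the two?

When rows are not matched, insert them into #temp. You can then display the contents of the temp table.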
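One possible reading of this suggestion, as a sketch only. The temp tables #original and #load, the columns ID and data_col, and the choice of ID as the matching key are all assumptions made for illustration; they are not taken from the question. The MERGE runs against a scratch copy (#work) so that the loaded data itself is never modified, and its OUTPUT clause records every unmatched row in #temp.

```sql
-- Hypothetical stand-ins for dbo.original and dbo.LOAD
CREATE TABLE #original (ID INT NOT NULL PRIMARY KEY, data_col VARCHAR(10) NULL);
CREATE TABLE #load     (ID INT NOT NULL PRIMARY KEY, data_col VARCHAR(10) NULL);
INSERT INTO #original VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO #load     VALUES (1, 'a'), (2, 'b'), (4, 'd');

-- Scratch copy of the loaded data; MERGE changes this copy, not #load
SELECT * INTO #work FROM #load;

-- Receives one row per unmatched key
CREATE TABLE #temp (merge_action NVARCHAR(10) NOT NULL,
                    ID           INT          NOT NULL,
                    data_col     VARCHAR(10)  NULL);

MERGE #work AS tgt
USING #original AS src
   ON tgt.ID = src.ID                      -- assumed key column
WHEN NOT MATCHED BY TARGET THEN            -- key in original, absent from load
    INSERT (ID, data_col) VALUES (src.ID, src.data_col)
WHEN NOT MATCHED BY SOURCE THEN            -- key in load, absent from original
    DELETE
OUTPUT $action,
       COALESCE(inserted.ID, deleted.ID),
       COALESCE(inserted.data_col, deleted.data_col)
INTO #temp (merge_action, ID, data_col);

-- INSERT = missing from the load, DELETE = not present in the original
SELECT merge_action, ID, data_col FROM #temp;

DROP TABLE #original; DROP TABLE #load; DROP TABLE #work; DROP TABLE #temp;
```

Because the match is on the key alone, this version reports missing keys only; rows with the same key but different data would need an additional WHEN MATCHED branch.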

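Answer 1: (score: 0)

Assuming you have at least one column, c1, with a NOT NULL constraint, this query should give you a list of all rows that are missing from either table: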

SELECT t1.*, t2.*
FROM dbo.original AS t1
FULL OUTER JOIN dbo.[LOAD] AS t2
  ON (list of join keys)  -- placeholder: compare the key columns of t1 and t2 here
WHERE t1.c1 IS NULL OR t2.c1 IS NULL

I'm not sure how efficient this is compared with the EXCEPT-based query; you would need to try it.
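To make the idea concrete, here is a minimal sketch that uses table variables in place of the real tables. The names @original and @load, the columns ID and data_col, and the use of ID as both the join key and the NOT NULL column are assumptions for illustration only.

```sql
-- Hypothetical stand-ins for dbo.original and dbo.LOAD
DECLARE @original TABLE (ID INT NOT NULL PRIMARY KEY, data_col VARCHAR(10) NULL);
DECLARE @load     TABLE (ID INT NOT NULL PRIMARY KEY, data_col VARCHAR(10) NULL);

INSERT INTO @original VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @load     VALUES (1, 'a'), (2, 'b'), (4, 'd');

-- ID is NOT NULL in both tables, so a NULL ID after the join
-- can only mean that the row had no partner on that side.
SELECT COALESCE(o.ID, l.ID) AS ID,
       CASE WHEN l.ID IS NULL THEN 'Missing from Load'
            ELSE 'Missing from Original'
       END AS problem
FROM @original AS o
FULL OUTER JOIN @load AS l
  ON o.ID = l.ID
WHERE o.ID IS NULL OR l.ID IS NULL;
-- Reports ID 3 as missing from Load and ID 4 as missing from Original
```

Note that a join on the key alone only detects rows whose key is absent on one side. A row with the same key but different non-key values passes unnoticed unless those columns are compared as well.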

Answer 2: (score: 0)

The answer in the general case:

SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;
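How this query is grouped matters. In T-SQL, INTERSECT is evaluated before UNION and EXCEPT, and operators of equal precedence are evaluated left to right, so the statement computes (A UNION B) EXCEPT (A INTERSECT B): every row that appears in exactly one of the two tables. The sketch below spells out the same grouping with parentheses; the table variables and sample rows are made up for illustration.

```sql
-- Hypothetical sample data
DECLARE @tableA TABLE (ID INTEGER NOT NULL, data_col VARCHAR(10) NULL);
DECLARE @tableB TABLE (ID INTEGER NOT NULL, data_col VARCHAR(10) NULL);

INSERT INTO @tableA VALUES (1, 'x'), (2, 'y');
INSERT INTO @tableB VALUES (1, 'x'), (3, 'z');

-- Rows in either table, minus rows in both tables
( SELECT * FROM @tableA
  UNION
  SELECT * FROM @tableB )
EXCEPT
( SELECT * FROM @tableA
  INTERSECT
  SELECT * FROM @tableB );
-- Returns (2, 'y') and (3, 'z'); (1, 'x') is common to both and is removed
```

Writing the parentheses makes the intent explicit instead of relying on operator precedence.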

The detailed answer:

Q. What do the letters SQL stand for? A. Scarcely Qualifies as a Language... ROFL. But consider the following:

CREATE TABLE MyTable
   ( ID INTEGER NOT NULL UNIQUE, data_col VARCHAR(10) NOT NULL );

DECLARE @MyTable TABLE 
   ( ID INTEGER NOT NULL UNIQUE, data_col VARCHAR(10) NOT NULL );

Now the questions:

Q. In language (computer science) terms, is @MyTable a variable? A. Well...

Problem 1: a table value cannot be assigned to @MyTable, e.g.

-- Assignment attempt 1:
SET @MyTable = MyTable;  -- COMPILE ERROR

-- Assignment attempt 2:
SET @MyTable = ( VALUES ( 1, NULL ), ( 2, '' ), ( 3, 'Test' ) );  -- COMPILE ERROR

Problem 2: the variable cannot be compared, e.g.

-- Comparison attempt 1:
IF ( @MyTable = @MyTable ) BEGIN; 
    PRINT 'Tables are the same.' 
END;  -- COMPILE ERROR

-- Comparison 2:
IF ( @MyTable = ( VALUES ( 1, NULL ), ( 2, '' ), ( 3, 'Test' ) ) ) BEGIN; 
    PRINT 'Tables are the same.' 
END;  -- COMPILE ERROR

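...so we are asked to believe that @MyTable is a 'variable' that supports neither assignment nor comparison.

Q. If @MyTable is a variable, then what is MyTable in language (computer science) terms? A. A constant? A value? A type? A class? A struct? None of the above?

...yes, SQL really is a very odd language!

Q. What is a relational operator? A. In the relational model, it is an operator that takes two relation values as arguments and returns a relation value as its result.

Q. Does SQL support relational operators? A. Not entirely. SQL does have operators that will be familiar to users of truly relational languages (UNION, INTERSECT, EXCEPT, etc.). However, SQL also supports non-relational features, most notably nulls, duplicate rows and duplicate column names. Great care is therefore needed to ensure that the arguments and results of these operators are equivalent to relations.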
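Two of these non-relational features directly affect a table comparison, as the sketch below illustrates. It assumes the default SET ANSI_NULLS ON setting, and the table variables are made up for the demonstration.

```sql
-- 1. Nulls: the set operators treat two NULLs as the same value...
SELECT CAST(NULL AS INTEGER) AS x
INTERSECT
SELECT CAST(NULL AS INTEGER);                         -- one row (NULL)

-- ...whereas the = predicate evaluates to UNKNOWN for NULL = NULL.
SELECT 'equal' AS result
WHERE CAST(NULL AS INTEGER) = CAST(NULL AS INTEGER);  -- no rows

-- 2. Duplicate rows: UNION, INTERSECT and EXCEPT remove duplicates,
--    so tables that differ only in how often a row repeats compare as equal.
DECLARE @tableA TABLE (ID INTEGER NOT NULL);
DECLARE @tableB TABLE (ID INTEGER NOT NULL);

INSERT INTO @tableA VALUES (1), (1);   -- the same row twice
INSERT INTO @tableB VALUES (1);        -- the same row once

SELECT * FROM @tableA
UNION
SELECT * FROM @tableB
EXCEPT
SELECT * FROM @tableA
INTERSECT
SELECT * FROM @tableB;                 -- no rows, despite the different row counts

SELECT (SELECT COUNT(*) FROM @tableA) AS rows_in_A,
       (SELECT COUNT(*) FROM @tableB) AS rows_in_B;   -- 2 and 1
```

If the tables being compared can contain duplicate rows, comparing row counts as well is one way to close this gap.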

Q. How can two tables be compared for equality using SQL's 'relational' operators? A. Here is one way:

SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;

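Tests to demonstrate the above (note that the following are not all relation values, but the operators do behave logically with SQL nulls):

Example 1: the tables are the same (expect zero rows == PASS):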

WITH tableA AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 2, '' ), 
                              ( 3, 'Test' )
                     ) AS T ( ID, data_col ) ),
     tableB AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 2, '' ), 
                              ( 3, 'Test' )
                     ) AS T ( ID, data_col ) )
SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;

Example 2: tableB is a proper subset of tableA (expect rows == FAIL):

WITH tableA AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 2, '' ), 
                              ( 3, 'Test' )
                     ) AS T ( ID, data_col ) ),
     tableB AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 2, '' ) 
                     ) AS T ( ID, data_col ) )
SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;

Example 3: tableA is a proper subset of tableB (expect rows == FAIL):

WITH tableA AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 3, 'Test' )
                     ) AS T ( ID, data_col ) ),
     tableB AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 2, '' ), 
                              ( 3, 'Test' )
                     ) AS T ( ID, data_col ) )
SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;

Example 4: tableA and tableB have some but not all row values in common (expect rows == FAIL):

WITH tableA AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 4, 'Lone' )
                     ) AS T ( ID, data_col ) ),
     tableB AS
     ( SELECT * FROM ( VALUES ( 1, NULL ), 
                              ( 4, 'Sole' )
                     ) AS T ( ID, data_col ) )
SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;

Example 5: tableA and tableB have no row values in common (expect rows == FAIL):

WITH tableA AS
     ( SELECT * FROM ( VALUES ( 5, NULL ), 
                              ( 6, 'Different' )
                     ) AS T ( ID, data_col ) ),
     tableB AS
     ( SELECT * FROM ( VALUES ( 7, NULL ), 
                              ( 8, 'Not the same' )
                     ) AS T ( ID, data_col ) )
SELECT * FROM tableA
UNION 
SELECT * FROM tableB
EXCEPT 
SELECT * FROM tableA 
INTERSECT
SELECT * FROM tableB;
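To connect these tests to a pass/fail message, the same query can be wrapped in IF EXISTS. This is a sketch under assumptions: the temp tables #tableA and #tableB reuse the data from Example 4, and the 'problem' label column and the message texts are illustrative choices.

```sql
-- Hypothetical temp tables holding the data from Example 4
CREATE TABLE #tableA (ID INTEGER NOT NULL UNIQUE, data_col VARCHAR(10) NULL);
CREATE TABLE #tableB (ID INTEGER NOT NULL UNIQUE, data_col VARCHAR(10) NULL);

INSERT INTO #tableA VALUES (1, NULL), (4, 'Lone');
INSERT INTO #tableB VALUES (1, NULL), (4, 'Sole');

IF EXISTS ( SELECT * FROM #tableA
            UNION
            SELECT * FROM #tableB
            EXCEPT
            SELECT * FROM #tableA
            INTERSECT
            SELECT * FROM #tableB )
BEGIN
    PRINT 'Load result: Fail';

    -- Label each differing row with the side it came from
    SELECT 'Only in tableA' AS problem, d.*
    FROM ( SELECT * FROM #tableA
           EXCEPT
           SELECT * FROM #tableB ) AS d
    UNION ALL
    SELECT 'Only in tableB' AS problem, d.*
    FROM ( SELECT * FROM #tableB
           EXCEPT
           SELECT * FROM #tableA ) AS d;
END
ELSE
    PRINT 'Load result: Succeeded';

DROP TABLE #tableA;
DROP TABLE #tableB;
```

With the Example 4 data the check fails and lists (4, 'Lone') as present only in tableA and (4, 'Sole') as present only in tableB.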

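Q. Why do SQL Server DBAs tend not to use this syntax, preferring FULL OUTER JOIN instead? A. Possibly for a variety of reasons, e.g. familiarity with legacy syntax (EXCEPT was only introduced in SQL Server 2005). Most likely, though, SQL DBAs tend to want to write the query they believe will be most efficient (pejoratively, premature optimization). Admittedly, the SQL Server optimizer does not handle the INTERSECT and EXCEPT operators very well.

Q. Why prefer the 'relational' operators? A. Because they are less verbose and easier to read. Both are good qualities for test code.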