Question

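I'm looking for a query, ideally wrapped in a macro, that compares two tables with identical column names and outputs the percent match for each field. I'd like the macro to take the table names as input.

For example, for two static tables: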

SELECT 
 SUM(CASE WHEN table1.field1 = table2.field1 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field2 = table2.field2 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field3 = table2.field3 THEN 1 ELSE 0 END)/SUM(1)
....
....
,SUM(CASE WHEN table1.fieldN = table2.fieldN THEN 1 ELSE 0 END)/SUM(1)
FROM table1
INNER JOIN table2
    ON table1.keyField = table2.keyField

Is it possible to write a macro that generalizes this?

For example, a pseudo-query might look like this:

CREATE MACRO compareTables (table1 varChar(50),table2 varChar(50),keyField AS varchar(50)) AS (
    WITH sharedColumns (columnName) AS (
        SELECT columnname
        FROM dbc.columns
        WHERE
            tableName = :table1
        INTERSECT
        SELECT columnname
        FROM dbc.columns
        WHERE
            tableName = :table2)
  SELECT 
   SUM (
       CASE 
           WHEN :table1.<sharedColumn[1]> = :table2.<sharedColumn[1]>
           THEN 1
           ELSE 0
       END)/SUM(1)
  ....
   SUM (
       CASE 
           WHEN :table1.<sharedColumn[N]> = :table2.<sharedColumn[N]>
           THEN 1
           ELSE 0
       END)/SUM(1)
  FROM :table1
  INNER JOIN :table2
      ON :table1.:keyField = :table2.:keyField;);

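Is there a way to do this in Teradata without a UDF? (I don't have CREATE FUNCTION rights.) If that is the only way, I can put in a request, but I'd rather not if it can be avoided.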

Answer 1

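I don't normally write out all the code here, but this one sounded like fun once I started thinking about it.

As I mentioned in the comment on your question, you can't do this in a macro, because macro parameters can't stand in for database objects such as table or column names. They only work as data values. So: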

 Select * From Table Where F1= :myparam;

is fine in a macro, but:

 Select * From Table Where :myparam = 'somevalue';

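is not allowed.

You can, however, do this with a stored procedure or with any scripting language you like.

The catch is that you have two problems to solve.

You need a list of the table's columns to use when building the comparison query.
You have to build the comparison query dynamically from that column list, two parameters holding the table names, and one parameter holding the key field.

Neither of these requirements is trivial, but the following should do the job for you. It may need some tweaking, but I think it's close: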

CREATE PROCEDURE compareTables 
(
    IN table1 varChar(50),
    IN table2 varChar(50),
    IN keyField varchar(50),
    OUT dynamicallyCreatedSQL VARCHAR(10000)
) 
DYNAMIC RESULT SETS 1

BEGIN

    DECLARE outputSQLStatement VARCHAR(10000); --variable to hold your dynamically created sql statement that will produce the record set that we are outputting from this SP
    DECLARE columnSQLStatement VARCHAR(500); --variable to hold your dynamically created sql statement that will hold the columns in Table1
    DECLARE columnName VARCHAR(30); --Variable to stick the column name that we get from the column_cursor  
    DECLARE output_cursor CURSOR WITH RETURN ONLY FOR output_statement; --The dynamically created cursor that will hold your record set produced by outputSQLStatement
    DECLARE column_cursor CURSOR FOR column_statement; --The dynamically created cursor that will hold your record set produced by columnSQLStatement

    --The start of your dynamic output sql statement:
    SET outputSQLStatement = '
                SELECT ';   

    --SQL Statement for your dynamically created cursor to get the columns for your table
    --TODO: Change "YourDatabaseHere" to your database...
    SET columnSQLStatement = 'SELECT ColumnName FROM "DBC".Columns WHERE DatabaseName=''YourDatabaseHere'' AND TableName=''' || table1 || ''';';

    --Prepare the dynamically generated column SQL statement for cursor.
    Prepare column_statement FROM columnSQLStatement;

    --Open the cursor and Loop through each record
    OPEN column_cursor;
    LABEL1: 
    LOOP

        --Grab the column name from the record into the variable columnName
        FETCH column_cursor INTO columnName;

        --No more rows: leave the loop. Checking right after the FETCH keeps the
        -- last column from being appended a second time.
        -- If the very first FETCH returns nothing, your table probably isn't a table. You should check your parameters.
        IF (SQLSTATE ='02000') THEN
            LEAVE label1;
        END IF;

        --Now we can build the meat of that sql statement
        --The CAST to FLOAT keeps integer division from truncating the ratio to 0 or 1
        SET outputSQLStatement = outputSQLStatement || '
                CAST(SUM (
                   CASE 
                       WHEN ' || table1 || '."' || columnName || '" = ' || table2 || '."' || columnName || '"
                       THEN 1
                       ELSE 0
                   END) AS FLOAT)/SUM(1) as "' || columnName || '",';

    --End the loop and close the cursor
    END LOOP LABEL1;
    CLOSE column_cursor;        

    --There's going to be an extra comma in there that we have to remove before the FROM part of the SQL statement, lets get rid of that:
    SET outputSQLStatement = Substring(OutputSQLStatement FROM 1 FOR Length(OutputSQLStatement) - 1);

    --Now complete the sql statement        
    Set outputSQLStatement = outputSQLStatement || '
                FROM ' || table1 || '
                INNER JOIN ' || table2 || '
                  ON ' || table1 || '.' || keyfield || ' = ' || table2 || '.' || keyfield || ';';

    --Set the output variable to the dynamically generated sql statement for debug fun.
    Set dynamicallyCreatedSQL = outputSQLStatement;

    --And finally... execute the statement by prepping it and opening the cursor. 
    --  we don't close the cursor so that the "Dynamic Result Sets 1" catches it and returns it to whatever calls this procedure.
    PREPARE output_statement FROM outputSQLStatement;
    OPEN output_cursor;

END;

You can call it like this:

CALL compareTables('table1', 'table2', 'yourkeyfield', output);

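This returns two record sets. The first holds the dynamically created SQL statement, which is handy for debugging. The second is the record set you're after.

If you don't have CREATE PROCEDURE access, then this is a wash. Either way, this is the approach you will have to take, whether in a Teradata SP, in bash with BTEQ, or in any other scripting language (VBScript, for example) over ADO/ODBC.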
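
If you go the scripting route, the string-building half of the job looks much the same outside the database. Here is a minimal sketch in Python, assuming you have already fetched the column names (for example from DBC.Columns) into a list. The name build_compare_sql is made up for illustration, and the CAST to FLOAT is my own addition to avoid integer division, not something from the original query.

```python
from typing import Sequence


def quote_ident(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_compare_sql(table1: str, table2: str, key_field: str,
                      columns: Sequence[str]) -> str:
    """Build one SELECT giving, per column, the fraction of joined rows that match.

    Table names and the key field are interpolated as-is (hypothetical helper,
    no validation), so only pass trusted values.
    """
    if not columns:
        raise ValueError("no columns to compare")

    select_items = []
    for col in columns:
        c = quote_ident(col)
        # CAST to FLOAT is an assumption added here so the ratio is not
        # truncated by integer division.
        select_items.append(
            f"CAST(SUM(CASE WHEN {table1}.{c} = {table2}.{c} THEN 1 ELSE 0 END) AS FLOAT)"
            f" / SUM(1) AS {c}"
        )

    # Joining the items with commas avoids the trailing-comma cleanup step.
    return (
        "SELECT\n    "
        + ",\n    ".join(select_items)
        + f"\nFROM {table1}\nINNER JOIN {table2}\n"
        + f"    ON {table1}.{key_field} = {table2}.{key_field};"
    )


if __name__ == "__main__":
    print(build_compare_sql("table1", "table2", "keyField", ["field1", "field2"]))
```

You would then send the returned string through whatever driver you use (BTEQ, ODBC and so on), just as the procedure does with PREPARE and OPEN.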

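I tried to comment it so that each part is explained, but there is some complicated stuff going on between using cursors for two different purposes (looping through a result set, and opening a result set as output from the procedure) and generating dynamic SQL from the input parameters and the columns in DBC.Columns.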
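
One thing to be aware of: the procedure reads column names from table1 only, while the pseudo-query in the question used INTERSECT to get the columns both tables share. If you end up building the query from a script, that intersection is easy to do on the client. A rough sketch in Python, where shared_columns is a hypothetical helper, the two lists are assumed to come from DBC.Columns (possibly space-padded), and the case-insensitive comparison is my assumption about how the names should be matched:

```python
from typing import Iterable, List


def shared_columns(cols1: Iterable[str], cols2: Iterable[str]) -> List[str]:
    """Return the columns present in both lists, in cols1 order.

    Names are stripped of surrounding spaces and compared case-insensitively
    (an assumption, not something the original procedure does).
    """
    in_second = {c.strip().upper() for c in cols2}
    emitted = set()
    result = []
    for c in cols1:
        key = c.strip().upper()
        if key in in_second and key not in emitted:
            result.append(c.strip())
            emitted.add(key)
    return result


if __name__ == "__main__":
    print(shared_columns(["id ", "Name", "amount"], ["NAME", "id", "created"]))
```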
