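I'm looking for a query to put in a macro that compares two tables with the same column names and outputs the percentage match for each field. I'd like the macro to take the table names as input.
Example for two static tables: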
SELECT
SUM(CASE WHEN table1.field1 = table2.field1 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field2 = table2.field2 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field3 = table2.field3 THEN 1 ELSE 0 END)/SUM(1)
....
....
,SUM(CASE WHEN table1.fieldN = table2.fieldN THEN 1 ELSE 0 END)/SUM(1)
FROM table1
INNER JOIN table2
ON table1.keyField = table2.keyField
Is it possible to write a macro that generalizes this?
For example, a pseudo-query might look like this:
CREATE MACRO compareTables (table1 varChar(50), table2 varChar(50), keyField varchar(50)) AS (
WITH sharedColumns (columnName) AS (
SELECT columnname
FROM dbc.columns
WHERE
tableName = :table1
INTERSECT
SELECT columnname
FROM dbc.columns
WHERE
tableName = :table2)
SELECT
SUM (
CASE
WHEN :table1.<sharedColumn[1]> = :table2.<sharedColumn[1]>
THEN 1
ELSE 0
END)/SUM(1)
....
SUM (
CASE
WHEN :table1.<sharedColumn[N]> = :table2.<sharedColumn[N]>
THEN 1
ELSE 0
END)/SUM(1)
FROM :table1
INNER JOIN :table2
ON :table1.:keyField = :table2.:keyField;);
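Is there a way to accomplish this in Teradata without a UDF? (I don't have CREATE FUNCTION permission.) If that's the only way, I can put in a request, but I'd rather not if it can be avoided.
Answer 0 (score: 1)
I don't usually write out all the code here, but this one sounded like fun when I thought about it.
Like I mentioned in the comment on your question, you can't do this in a macro, because macro parameters can't be used as database objects. They only work as values in the data. So: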
Select * From Table Where F1= :myparam;
is cool in a macro, but:
Select * From Table Where :myparam = 'somevalue';
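is not allowed.
You can, however, do this with a stored procedure or any scripting language you like.
The catch is that you have two problems.
Neither of these requirements is trivial, but the following should do the job for you. It may need some tweaking, but I think it's close: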
CREATE PROCEDURE compareTables
(
IN table1 varChar(50),
IN table2 varChar(50),
IN keyField varchar(50),
OUT dynamicallyCreatedSQL VARCHAR(10000)
)
DYNAMIC RESULT SETS 1
BEGIN
DECLARE outputSQLStatement VARCHAR(10000); --variable to hold your dynamically created sql statement that will produce the record set that we are outputting from this SP
DECLARE columnSQLStatement VARCHAR(500); --variable to hold your dynamically created sql statement that will hold the columns in Table1
DECLARE columnName VARCHAR(30); --Variable to stick the column name that we get from the column_cursor
DECLARE output_cursor CURSOR WITH RETURN ONLY FOR output_statement; --The dynamically created cursor that will hold your record set produced by outputSQLStatement
DECLARE column_cursor CURSOR FOR column_statement; --The dynamically created cursor that will hold your record set produced by columnSQLStatement
--The start of your dynamic output sql statement:
SET outputSQLStatement = '
SELECT ';
--SQL Statement for your dynamically created cursor to get the columns for your table
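--TODO: Change "YourDatabaseHere" to your database...
-- The filter uses the DatabaseName column of the view; TRIM guards against padded column names.
SET columnSQLStatement = 'SELECT TRIM(ColumnName) FROM "DBC".Columns WHERE DatabaseName=''YourDatabaseHere'' AND TableName=''' || table1 || ''';';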
--Prepare the dynamically generated column SQL statement for cursor.
Prepare column_statement FROM columnSQLStatement;
--Open the cursor and Loop through each record
OPEN column_cursor;
LABEL1:
LOOP
--Grab the column name from the record into the variable columnName
FETCH column_cursor INTO columnName;
--WOAH THERE! No data was returned. Much sorries.
-- SQLSTATE '02000' means the fetch found no row, so leave the loop.
-- Checking right after the FETCH means a failed fetch never reaches the SET below.
-- And... if the very first fetch finds nothing, your table probably isn't a table. You should check your parameters.
IF (SQLSTATE ='02000') THEN
LEAVE label1;
END IF;
--Now we can build the meat of that sql statement
-- The denominator is cast to FLOAT so the ratio isn't truncated by integer division
SET outputSQLStatement = outputSQLStatement || '
SUM (
CASE
WHEN ' || table1 || '."' || columnName || '" = ' || table2 || '."' || columnName || '"
THEN 1
ELSE 0
END)/CAST(SUM(1) AS FLOAT) as "' || columnName || '",';
--End the loop and close the cursor
END LOOP LABEL1;
CLOSE column_cursor;
--There's going to be an extra comma in there that we have to remove before the FROM part of the SQL statement, so let's get rid of that:
SET outputSQLStatement = Substring(OutputSQLStatement FROM 1 FOR Length(OutputSQLStatement) - 1);
--Now complete the sql statement
Set outputSQLStatement = outputSQLStatement || '
FROM ' || table1 || '
INNER JOIN ' || table2 || '
ON ' || table1 || '.' || keyfield || ' = ' || table2 || '.' || keyfield || ';';
--Set the output variable to the dynamically generated sql statement for debug fun.
Set dynamicallyCreatedSQL = outputSQLStatement;
--And finally... execute the statement by prepping it and opening the cursor.
-- we don't close the cursor so that the "Dynamic Result Sets 1" catches it and returns it to whatever calls this procedure.
PREPARE output_statement FROM outputSQLStatement;
OPEN output_cursor;
END;
You can call it like this:
CALL compareTables('table1', 'table2', 'yourkeyfield', output);
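This will return two record sets. The first holds the dynamically created SQL statement, which is useful for debugging. The second is the record set you're after.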
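If you don't have CREATE PROCEDURE access, then this one is a wash. Either way, this is the approach you'll have to take, whether in a Teradata SP, in bash with BTEQ, or through ADO/ODBC from any other scripting language (such as VBScript).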
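For the scripting-language route, here is a minimal sketch in Python of the same string-building step the procedure performs. The function name build_compare_sql and its parameters are made up for illustration, and the list of column names is assumed to have been fetched from the dictionary beforehand. It casts the denominator to FLOAT on the assumption that integer division would otherwise truncate the ratio, and it does no validation of the identifiers it is given.

```python
def build_compare_sql(table1, table2, key_field, columns):
    """Return a SELECT reporting, per column, the fraction of joined rows that match.

    All names here are hypothetical; `columns` is assumed to be the list of
    column names the two tables share.
    """
    if not columns:
        raise ValueError("no columns to compare")
    # One expression per column; joining with commas avoids a trailing comma.
    select_items = ",\n".join(
        f'SUM(CASE WHEN {table1}."{col}" = {table2}."{col}" THEN 1 ELSE 0 END)'
        f'/CAST(SUM(1) AS FLOAT) AS "{col}"'
        for col in columns
    )
    return (
        f"SELECT\n{select_items}\n"
        f"FROM {table1}\n"
        f"INNER JOIN {table2}\n"
        f"ON {table1}.{key_field} = {table2}.{key_field};"
    )


# Print the generated statement so it can be inspected before it is run.
print(build_compare_sql("table1", "table2", "keyField", ["field1", "field2"]))
```

The resulting string can then be handed to whatever client you use (BTEQ, ODBC, and so on). Building the select list with a join sidesteps the trailing-comma cleanup the procedure needs.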
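I tried to comment it so each section is explained, but there is some complexity going on between using cursors for two different purposes (looping through a result set, and opening a result set for output from the procedure) and generating dynamic SQL from the inputs and the columns found in the data dictionary.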
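One difference worth noting: the pseudo-query in the question intersects the column lists of both tables, while the procedure loops over the columns of the first table only. If you go the scripting route, the intersection is easy to do on the client side. The sketch below is in Python; the function name shared_columns is hypothetical, and it assumes the two column lists were already fetched from the dictionary, that names may arrive padded with spaces, and that they should compare case-insensitively.

```python
def shared_columns(columns1, columns2):
    """Return the columns present in both lists, in the order of the first list.

    Hypothetical helper: names are stripped of padding and compared
    case-insensitively; the spelling from the first list is kept.
    """
    in_second = {name.strip().upper() for name in columns2}
    result = []
    emitted = set()
    for name in columns1:
        clean = name.strip()
        key = clean.upper()
        # Keep each shared column once, even if the input repeats it.
        if key in in_second and key not in emitted:
            emitted.add(key)
            result.append(clean)
    return result


print(shared_columns(["id ", "Name", "extra"], ["NAME", "id", "other"]))
```

Only the columns that survive this step would then go into the generated comparison query.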