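UNION ALL on all tables that start with a specific string

Time: 2019-11-06 05:54:26

Tags: snowflake-data-warehouse

I want to combine tables whose names start with the same string into a single table. For example, suppose I have a database containing the tables "EXT_ABVD", "EXT_ADAD", "EXT_AVSA" and "OTHER", and I want to combine all tables that start with "EXT_". The result I want is: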

select col1 ,col2 from EXT_ABVD
union all
select col1 ,col2 from EXT_ADAD
union all
select col1 ,col2 from EXT_AVSA;

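I want to run this periodically (for example, daily), and each run may find new tables that start with "EXT_". I don't want to update the UNION ALL query by hand.

I'm new to Snowflake and don't know how to approach this. Can I use scripting in Snowflake?

2 Answers:

Answer 0: (score: 3)

Given these tables: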

CREATE TABLE TEST_DB.PUBLIC.EXT_ABVD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAQ (col1 INTEGER, col2 INTEGER);

a view like this can be created dynamically:

CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS 
SELECT * FROM TEST_DB.PUBLIC.EXT_ABVD
 UNION ALL 
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAD
 UNION ALL 
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAQ

using this procedure:

create or replace procedure TEST_DB.PUBLIC.CREATE_UNION_VIEW(TBL_PREFIX VARCHAR)
  returns VARCHAR -- returns the final CREATE VIEW statement
  language javascript
  as
  $$
    // query information_schema for matching base tables in the PUBLIC schema;
    // the prefix is passed as a bind variable instead of being concatenated into the SQL
    // (note that "_" in a LIKE pattern matches any single character)
    var get_tables_stmt = snowflake.createStatement({
        sqlText: "SELECT TABLE_NAME FROM TEST_DB.INFORMATION_SCHEMA.TABLES " +
                 "WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_SCHEMA = 'PUBLIC' " +
                 "AND TABLE_NAME LIKE ? ORDER BY TABLE_NAME",
        binds: [TBL_PREFIX + "%"]
    });

    // get result set containing all table names
    var tables = get_tables_stmt.execute();

    // collect one SELECT per table
    var selects = [];
    while (tables.next()) {
        // we get values from the first (and only) column in the result set
        var table_name = tables.getColumnValue(1);

        // the view creation will fail if the tables' column counts don't match
        selects.push("SELECT * FROM TEST_DB.PUBLIC." + table_name);
    }

    // nothing to union: leave any existing view untouched
    if (selects.length === 0) {
        return "No tables found with prefix " + TBL_PREFIX;
    }

    // joining puts UNION ALL between the SELECTs, so no row counting is needed
    var create_sql = "CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS \n" +
                     selects.join("\n UNION ALL \n");

    // create the view
    snowflake.createStatement({sqlText: create_sql}).execute();

    // return the create statement as text
    return create_sql;
  $$
  ;

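We would call it like this: CALL CREATE_UNION_VIEW('EXT_A');

This is only a basic example, so you may need to add logic for column counts, schemas, and so on. Still, it should be enough to show how to work with result sets, parameters, and statements.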
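The column-count caveat above could be handled by checking each table's columns before adding it to the view. The following is a minimal sketch in plain JavaScript, not part of the original answer. It assumes the column metadata has already been fetched (for example from INFORMATION_SCHEMA.COLUMNS) into an array of objects; the function name tablesWithColumns and the { table, column } row shape are hypothetical. It also assumes column names can be compared case-insensitively, which holds only for unquoted identifiers.

```javascript
// Hypothetical helper: keep only the tables that contain every required column.
// `rows` is assumed to look like [{ table: "EXT_ABVD", column: "COL1" }, ...].
function tablesWithColumns(rows, requiredColumns) {
  // Build a lookup: table name -> set of upper-cased column names.
  var columnsByTable = {};
  rows.forEach(function (row) {
    if (!columnsByTable.hasOwnProperty(row.table)) {
      columnsByTable[row.table] = {};
    }
    columnsByTable[row.table][row.column.toUpperCase()] = true;
  });

  var required = requiredColumns.map(function (c) {
    return c.toUpperCase();
  });

  // A table qualifies only if every required column is present.
  return Object.keys(columnsByTable).filter(function (table) {
    return required.every(function (c) {
      return columnsByTable[table][c] === true;
    });
  }).sort();
}
```

Inside the procedure, the returned list could replace the raw table list, so that a table missing col1 or col2 is skipped instead of breaking the view.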

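Edit: for how to set up a task that runs a procedure every day, see the Snowflake documentation on tasks. In this case, the most basic version would look like this: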

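create or replace task create_union_task
  warehouse = COMPUTE_WH
  schedule = '1440 minute' -- once every day
as
  CALL CREATE_UNION_VIEW('EXT_A');

-- a newly created task is suspended; resume it so the schedule takes effect
alter task create_union_task resume;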

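Answer 1: (score: 2)

Currently, the only way to achieve this is with a Snowflake stored procedure.

You didn't say how you want to use the query result, but a VIEW is a convenient option. The stored procedure therefore has to generate a VIEW definition that contains the query from the question.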
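As a rough illustration of generating that VIEW definition, here is a minimal sketch in plain JavaScript that turns a list of table names into the statement text. The names quoteIdent and buildUnionViewSql are hypothetical, and the view name and column list are passed in by the caller. Because the sketch double-quotes every identifier, names must be given in their stored case (COL1 rather than col1 for columns created without quotes).

```javascript
// Hypothetical helper: double-quote an identifier, escaping embedded quotes.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Hypothetical helper: build the CREATE VIEW text for a UNION ALL over tables.
function buildUnionViewSql(viewName, tableNames, columns) {
  if (tableNames.length === 0) {
    // An empty UNION ALL is not valid SQL, so fail early.
    throw new Error("no tables matched; refusing to build an empty view");
  }
  var columnList = columns.map(quoteIdent).join(", ");
  var selects = tableNames.map(function (t) {
    return "SELECT " + columnList + " FROM " + quoteIdent(t);
  });
  // join() places UNION ALL between the SELECTs, never after the last one.
  return "CREATE OR REPLACE VIEW " + quoteIdent(viewName) + " AS\n" +
         selects.join("\nUNION ALL\n");
}
```

In a stored procedure, the returned string would then be passed to snowflake.createStatement({sqlText: ...}) and executed.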